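I would like to know how to split a column based on a condition. My goal is to study user activity, but for that I need to apply a condition. I have this DataFrame: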

import pandas as pd

df = pd.DataFrame({'User': ["juan","juan","juan","juan","petter","petter","petter","petter","petter","petter","petter","petter","ana","ana","ana","ana","raul","raul","raul","raul"],
                   'time': ["2/1/2019","3/1/2019","4/1/2019","6/1/2019","2/1/2019","5/1/2019","6/1/2019","10/1/2019","11/1/2019","12/1/2019","13/1/2019","14/1/2019","8/1/2019","10/1/2019","15/1/2019","20/1/2019","15/1/2019","17/1/2019","18/1/2019","19/1/2019"],
                   'activity': ["fly","hotel","car","jump","fly","hotel","jump","car","fly","car","hotel","car","car","hotel","car","hotel","fly","hotel","car","car"],
                   '%timeper_user': ["4 days","4 days","4 days","4 days","8 days","8 days","8 days","8 days","3 days","3 days","3 days","3 days","12 days","12 days","12 days","12 days","4 days","4 days","4 days","4 days"]})

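As you can see, each user has a column (time) and another column (%timeper_user), plus a column (activity) holding the activities that user performed over time. The idea is to "conditionally split" the activities into separate columns: act1, act2, act3, act4. But when a user performs an activity outside the window (time + %timeper_user), the activity should go into a different set of columns, for example act21, act22, act23, act24. I would like the result to look like this: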

df2 = pd.DataFrame({'User': ["juan","petter","ana","raul"],
              "act1":["fly","fly","car","fly"],
              "act2":["hotel","hotel","hotel","hotel"],
              "act3":["car","jump","car","car"],
              "act4":["jump","car","hotel","car"],
              "actn":["","","",""],
              "act21":["","fly","",""],
              "act22":["","car","",""],
              "act23":["","hotel","",""],
              "act24":["","car","",""]})

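df2 is the output I want. Look at user petter, who goes past the window: 2/1/2019 + 8 days = 10/1/2019. So from 11/1/2019 onward, his activities go into act21, act22, act23, act24. I have many users, so I don't know how to write a function that does this user by user and collects everything. I would be grateful for any help. Thanks.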

1 Answer

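The idea is this. If a user's events fall within the window (each user's time + %timeper_user), all of those activities belong to activity range 1 (act1, act2, act3, act4). If the date is later, the user moves into activity range 2 (act21, act22, act23, act24). Put simply: if petter travels from the USA to Madrid, he will probably book a hotel, rent a car, and try flying. But when he returns to the USA, petter may buy a second flight there, which falls into activity range 2 (act21, act22, act23, act24). If you run the code, df is the DataFrame I have and df2 is the data I want to produce.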
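The rule described above could be sketched in pandas roughly as follows. This is only one possible reading, not a confirmed solution: it assumes the dates are day/month/year, that the window is the `%timeper_user` value on each user's earliest row, that an activity exactly on the cutoff date still counts as inside the window, and that there are only two ranges. The helper name `split_activities` and the intermediate columns `window`, `period` and `pos` are made up for illustration, and passing a list to `pivot(columns=...)` needs pandas 1.1 or later.

```python
import pandas as pd


def split_activities(df):
    """Spread each user's activities over act1..actN and act21..act2N columns."""
    out = df.copy()
    # Dates in the question are day/month/year.
    out["time"] = pd.to_datetime(out["time"], format="%d/%m/%Y")
    out["window"] = pd.to_timedelta(out["%timeper_user"])
    # A stable sort keeps the original order of same-day activities.
    out = out.sort_values("time", kind="stable")

    # Assumption: cutoff = user's earliest date + the window on that row,
    # and an activity exactly on the cutoff still counts as inside.
    grouped = out.groupby("User", sort=False)
    cutoff = grouped["time"].transform("first") + grouped["window"].transform("first")
    out["period"] = (out["time"] > cutoff).astype(int) + 1  # 1 = inside, 2 = after

    # Position of each activity within its (user, period) block: 1, 2, 3, ...
    out["pos"] = out.groupby(["User", "period"], sort=False).cumcount() + 1

    wide = out.pivot(index="User", columns=["period", "pos"], values="activity")
    wide = wide.sort_index(axis=1)
    wide.columns = [
        f"act{pos}" if period == 1 else f"act{period}{pos}"
        for period, pos in wide.columns
    ]

    # Restore the users' order of first appearance and blank out the gaps.
    order = pd.Index(df["User"].unique(), name="User")
    return wide.reindex(order).fillna("").reset_index()


# Two users taken from the question's data.
sample = pd.DataFrame({
    "User": ["juan"] * 4 + ["petter"] * 8,
    "time": ["2/1/2019", "3/1/2019", "4/1/2019", "6/1/2019",
             "2/1/2019", "5/1/2019", "6/1/2019", "10/1/2019",
             "11/1/2019", "12/1/2019", "13/1/2019", "14/1/2019"],
    "activity": ["fly", "hotel", "car", "jump",
                 "fly", "hotel", "jump", "car", "fly", "car", "hotel", "car"],
    "%timeper_user": ["4 days"] * 4 + ["8 days"] * 4 + ["3 days"] * 4,
})
result = split_activities(sample)
print(result)
```

The placeholder column actn from df2 is not produced here; the number of act columns simply follows the longest block found in the data. If a user could go through more than two ranges, the `period` step would need to be replaced with something that restarts the window at each overflow.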

answered 2018-12-09T19:44:21.210