
Hey guys, I am new to building cohorts and would appreciate any help (via Python code or a SQL query). Let us assume a bike-ride (Uber-like) dataset as follows:

**Note:** The columns relevant for cohort creation are 'timestamp' and 'custID'.
I tried following Greg Reda's post (http://www.gregreda.com/2015/08/23/cohort-analysis-with-python/).
However, I am stuck on doing the analysis on a weekly basis.


timestamp           custID  lat1        lng1        lat2        lng2
=========================================================================
2018-04-07 7:07:17  14626   12.3136215  76.658195   12.287301   76.60228
2018-04-07 7:32:27  85490   12.943947   77.560745   12.954014   77.54377
2018-04-07 7:36:44  5408    12.899603   77.5873     12.93478    77.56995
2018-04-07 7:38:00  58940   12.918229   77.607544   12.968971   77.636375
2018-04-07 7:39:29  5408    12.89949    77.58727    12.93478    77.56995
2018-04-07 7:43:08  5408    12.899421   77.587326   12.93478    77.56995
2018-04-07 7:43:55  50266   12.898679   77.60434    12.877949   77.5959
2018-04-07 7:52:31  58940   12.918229   77.607544   12.968971   77.636375
2018-04-07 7:52:42  58940   12.918229   77.607544   12.968971   77.636375
2018-04-07 7:53:23  28126   12.91184    77.60225    12.940866   77.54071
2018-04-07 7:55:05  99251   12.87466    77.61951    12.896871   77.60847
2018-04-07 7:55:24  99251   12.87466    77.61951    12.896871   77.60847
2018-04-07 8:00:04  34808   12.989711   77.65381    12.939158   77.73467
2018-04-07 8:00:16  34808   12.989711   77.65381    12.939158   77.73467
2018-04-07 8:03:16  89714   12.868537   77.65304    12.972006   77.59487
2018-04-07 8:03:24  89714   12.868537   77.65304    12.972006   77.59487
2018-04-07 8:07:16  82060   12.987069   77.57703    12.970017   77.577934
2018-04-07 8:08:57  18815   12.933479   77.57087    12.961353   77.57457
2018-04-07 8:11:35  38288   12.886039   77.64894    12.902692   77.62253
2018-04-07 8:17:24  80401   12.990636   77.67494    12.962719   77.5876
2018-04-07 8:20:51  89225   12.99445    77.729546   12.980993   77.69732

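From the mock dataset above, the goal is to compute weekly cohorts of the customers who took rides in a given week. A sample cohort output is shown below.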


Input query format: STARTING WEEK, NUMBER OF WEEKS 
Input Sample: Week 1, 3 weeks
Output Sample:

         week1   week2   week3

 week1    100     80      90
 week2            200     70
 week3                    100

Input query format: STARTING DATE, NUMBER OF WEEKS
Input: 1/12/2019, 3 weeks
Output Sample:

             1/12/2019   8/12/2019   16/12/2019
1/12/2019    100           80           90
8/12/2019                  200          70
16/12/2019                              100


Interpretation:
● 100 customers were active in the week of the 1st.
○ Of those 100, 80 came back in the week of the 8th.
○ Of those 100, 90 came back in the week of the 16th.
● 200 new customers were active in the week of the 8th.
○ Of those 200, 70 came back in the week of the 16th.

Is there a Pythonic way (via pandas) or a query-based way (via SQL) to compute the cohort output in the format above?
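
For reference, here is a minimal pandas sketch of one way this could be approached. It is not a definitive implementation and rests on assumptions not stated above: weeks are 7-day buckets counted from the starting date (not calendar weeks), a customer's cohort is the first week inside the requested window in which they rode, and each cell counts distinct `custID` values. The function name `weekly_cohorts` and its parameters are hypothetical.

```python
import pandas as pd


def weekly_cohorts(rides, start, n_weeks):
    """Cohort-by-week table of distinct active customers (hypothetical helper)."""
    start = pd.Timestamp(start).normalize()
    df = rides[["timestamp", "custID"]].copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"])

    # Assumption: week 0 is the first 7 days from `start`, week 1 the next 7, ...
    df["week"] = (df["timestamp"] - start).dt.days // 7
    df = df[(df["week"] >= 0) & (df["week"] < n_weeks)]

    # Assumption: cohort = first week within the window in which the customer rode.
    df["cohort"] = df.groupby("custID")["week"].transform("min")

    # Distinct customers per (cohort, week), pivoted into a square table.
    counts = (
        df.groupby(["cohort", "week"])["custID"]
        .nunique()
        .unstack("week")
        .reindex(index=range(n_weeks), columns=range(n_weeks))
    )

    # Label rows and columns with the first day of each 7-day bucket.
    labels = [
        (start + pd.Timedelta(weeks=i)).strftime("%Y-%m-%d") for i in range(n_weeks)
    ]
    counts.index = pd.Index(labels, name="cohort")
    counts.columns = pd.Index(labels, name="week")
    return counts
```

Calling `weekly_cohorts(rides, "2018-04-07", 3)` on a DataFrame with the `timestamp` and `custID` columns shown earlier would give a 3x3 table whose diagonal holds the number of new customers per week and whose cells to the right hold how many of them rode again in later weeks. Cells with no matching customers, including everything below the diagonal, are left as NaN.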
