嘿伙计们,我是创建群组的新手,如果我收到任何帮助(通过 python 代码或 sql 查询),我将不胜感激......让我们假设骑自行车(类似于 uber)数据集如下:
**Note:**columns relevant for the cohort creation are 'timestamp' and 'custid'.
I tried referring to gregred( http://www.gregreda.com/2015/08/23/cohort-analysis-with-python/.)
However am losing on the weekly analysis base.
timestamp custID lat1 lng1 lat2 lng2
==================================================== =================
2018-04-07 7:07:17 14626 12.3136215 76.658195 12.287301 76.60228
2018-04-07 7:32:27 85490 12.943947 77.560745 12.954014 77.54377
2018-04-07 7:36:44 5408 12.899603 77.5873 12.93478 77.56995
2018-04-07 7:38:00 58940 12.918229 77.607544 12.968971 77.636375
2018-04-07 7:39:29 5408 12.89949 77.58727 12.93478 77.56995
2018-04-07 7:43:08 5408 12.899421 77.587326 12.93478 77.56995
2018-04-07 7:43:55 50266 12.898679 77.60434 12.877949 77.5959
2018-04-07 7:52:31 58940 12.918229 77.607544 12.968971 77.636375
2018-04-07 7:52:42 58940 12.918229 77.607544 12.968971 77.636375
2018-04-07 7:53:23 28126 12.91184 77.60225 12.940866 77.54071
2018-04-07 7:55:05 99251 12.87466 77.61951 12.896871 77.60847
2018-04-07 7:55:24 99251 12.87466 77.61951 12.896871 77.60847
2018-04-07 8:00:04 34808 12.989711 77.65381 12.939158 77.73467
2018-04-07 8:00:16 34808 12.989711 77.65381 12.939158 77.73467
2018-04-07 8:03:16 89714 12.868537 77.65304 12.972006 77.59487
2018-04-07 8:03:24 89714 12.868537 77.65304 12.972006 77.59487
2018-04-07 8:07:16 82060 12.987069 77.57703 12.970017 77.577934
2018-04-07 8:08:57 18815 12.933479 77.57087 12.961353 77.57457
2018-04-07 8:11:35 38288 12.886039 77.64894 12.902692 77.62253
2018-04-07 8:17:24 80401 12.990636 77.67494 12.962719 77.5876
2018-04-07 8:20:51 89225 12.99445 77.729546 12.980993 77.69732
从上面的模拟数据集中,目标是计算该周乘车的客户的每周队列,一个示例队列输出如下所示。
Input query format: STARTING WEEK, NUMBER OF WEEKS
Input Sample: Week 1, 3 weeks
Output Sample:
week1 week2 week3
week1 100 80 90
week2 200 70
week3 100
Input query format: STARTING Date, NUMBER OF WEEKS
Input: 1/12/2019, 3 weeks
Output Sample:
1/12/2019 8/12/2019 16/12/2019
1/12/2019 100 80 90
8/12/2019 200 70
16/12/2019 100
Interpretation:
● 100 customers were active on the week of 1st.
○ Out of which 80 came back on 8th week
○ Out of which 90 came back on 16th week
● 200 new customers were active on the week of 8th
○ Out of which 70 came back on week of 16th.
是否有任何pythonic方式(通过pandas)或查询方式(通过sql)来计算上述格式的队列输出。