sql - SQL retention cohort analysis

Question

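I am trying to write a monthly retention query that calculates the percentage of users who come back in each subsequent month after their initial start month.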

TABLE: customer_order
fields
id
date
store_id

TABLE: customer
id
person_id
job_id
first_time (bool)

This gets me the initial monthly cohorts, based on each person's first order date:

SELECT first_job_month, COUNT(DISTINCT person_id) user_counts
FROM (
    SELECT DATE_TRUNC(MIN(CAST(date AS DATE)), month) first_job_month, person_id
    FROM customer_order cd
    INNER JOIN consumer co ON co.job_id = cd.id
    GROUP BY 2
    ORDER BY 1
) first_d
GROUP BY 1
ORDER BY 1

first_job_month   user_counts
2018-04-01        36
2018-05-01        37
2018-06-01        39
2018-07-01        45
2018-08-01        38

I have tried many things, but I can't figure out how to keep tracking the original cohorts/users from the first month onward.

score 5 · Accepted Answer

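1. Get the first order month for each customer.
2. Join the orders to the previous subquery to find the difference in months between a given order and the first order.
3. Use conditional aggregation to count the customers who are still ordering at month X.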

There are alternatives, such as using window functions to do (1) and (2) in the same subquery, but the simplest option is this:

WITH
cohorts as (
    SELECT person_id, DATE_TRUNC(MIN(CAST(date AS DATE)), month) as first_job_month
    FROM customer_order cd
    JOIN consumer co 
    ON co.job_id = cd.id
    GROUP BY 1
)
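,orders as (
    -- one row per order, with its distance in calendar months from the person's first order month
    -- (assumes BigQuery-style DATE_DIFF, matching the DATE_TRUNC syntax used above)
    SELECT
     c.person_id
    ,c.first_job_month
    ,DATE_DIFF(CAST(cd.date AS DATE), c.first_job_month, month) as months_since_first_order
    FROM cohorts c
    -- customer_order has no person_id, so go back through consumer
    JOIN consumer co
    ON co.person_id = c.person_id
    JOIN customer_order cd
    ON co.job_id = cd.id
)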
SELECT
 first_job_month as cohort
,count(distinct person_id) as size
,count(distinct case when months_since_first_order>=1 then person_id end) as m1
,count(distinct case when months_since_first_order>=2 then person_id end) as m2
,count(distinct case when months_since_first_order>=3 then person_id end) as m3
-- hardcode up to the number of months you want and the history you have
FROM orders 
GROUP BY 1
ORDER BY 1
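
The window-function alternative mentioned before the query might look roughly like the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module purely so that it can be executed anywhere; the sample rows are made up, and SQLite's date() and strftime() stand in for DATE_TRUNC and the month difference, so the date arithmetic would have to be adapted to your own dialect. It also assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data using the table and column names from the question.
conn.executescript("""
CREATE TABLE customer_order (id INTEGER, date TEXT, store_id INTEGER);
CREATE TABLE consumer (id INTEGER, person_id INTEGER, job_id INTEGER, first_time INTEGER);
INSERT INTO customer_order VALUES
    (1, '2018-04-10', 1), (2, '2018-05-03', 1),
    (3, '2018-04-20', 1), (4, '2018-07-15', 1);
INSERT INTO consumer VALUES
    (1, 100, 1, 1), (2, 100, 2, 0),
    (3, 200, 3, 1), (4, 200, 4, 0);
""")

# Steps (1) and (2) in a single pass: MIN(...) OVER (PARTITION BY person_id)
# attaches each person's first order month to every one of their orders.
rows = conn.execute("""
    SELECT co.person_id,
           MIN(date(cd.date, 'start of month'))
               OVER (PARTITION BY co.person_id) AS first_job_month,
           (strftime('%Y', cd.date) * 12 + strftime('%m', cd.date))
             - MIN(strftime('%Y', cd.date) * 12 + strftime('%m', cd.date))
                   OVER (PARTITION BY co.person_id) AS months_since_first_order
    FROM customer_order cd
    JOIN consumer co ON co.job_id = cd.id
    ORDER BY co.person_id, cd.date
""").fetchall()

for row in rows:
    print(row)
```

The result has the same shape as the orders CTE above, so the final conditional-aggregation SELECT can be applied to it unchanged.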

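See, you can use CASE expressions inside aggregate functions such as COUNT to pick out different subsets of rows to aggregate within the same group. This is one of the most important BI techniques in SQL.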
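
As a small illustration of conditional aggregation, here is a sketch that runs the same COUNT(DISTINCT CASE ...) pattern on a handful of made-up rows. The orders table here is a hypothetical stand-in shaped like the orders CTE, and SQLite (via Python's sqlite3) is used only so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows shaped like the `orders` CTE: one row per order.
conn.execute(
    "CREATE TABLE orders ("
    "person_id INTEGER, first_job_month TEXT, months_since_first_order INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2018-04-01", 0), (1, "2018-04-01", 1),  # person 1 returns in month 1
        (2, "2018-04-01", 0),                        # person 2 never returns
        (3, "2018-04-01", 0), (3, "2018-04-01", 2),  # person 3 returns in month 2
    ],
)

# Each CASE yields person_id only for rows matching its condition and NULL
# otherwise; COUNT(DISTINCT ...) ignores the NULLs, so every column counts a
# different subset of the same group.
rows = conn.execute("""
    SELECT first_job_month AS cohort,
           COUNT(DISTINCT person_id) AS size,
           COUNT(DISTINCT CASE WHEN months_since_first_order >= 1 THEN person_id END) AS m1,
           COUNT(DISTINCT CASE WHEN months_since_first_order >= 2 THEN person_id END) AS m2
    FROM orders
    GROUP BY 1
    ORDER BY 1
""").fetchall()

print(rows)  # [('2018-04-01', 3, 2, 1)]
```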

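Note that the conditional aggregates use >= rather than =, so that, for example, a customer who buys in m3 after m1 but does not buy in m2 is still counted in m2. If you want your customers to buy every month and/or want to see the actual retention in each month, and you are fine with later months possibly showing higher values than earlier ones, you can use = instead.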
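
To make the difference concrete, the following sketch compares the two operators on made-up data: person 1 orders in months 0 and 3, person 2 in months 0 and 1. The table and the retention helper are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (person_id INTEGER, months_since_first_order INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 0), (1, 3), (2, 0), (2, 1)],
)

def retention(op):
    """Return the (m1, m2, m3) counts using the given comparison operator."""
    # op is only ever the literal ">=" or "=" below, never user input.
    cols = ", ".join(
        f"COUNT(DISTINCT CASE WHEN months_since_first_order {op} {m} THEN person_id END)"
        for m in (1, 2, 3)
    )
    return conn.execute(f"SELECT {cols} FROM orders").fetchone()

at_least = retention(">=")  # "still active at or after month X"
exact = retention("=")      # "ordered in exactly month X"

print(at_least)  # (2, 1, 1)
print(exact)     # (1, 0, 1)
```

With >= the counts can only stay flat or shrink from one month to the next, while with = person 1 disappears in m2 and reappears in m3.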

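Also, if you don't want the "triangle" view that this query produces, or you don't want to hardcode the "mX" part, you can simply group by first_job_month and months_since_first_order and count distinct person_id. Some visualization tools can consume this simple format and build the triangle view from it.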
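
A minimal sketch of that "long" format, again on made-up rows in a hypothetical orders table shaped like the orders CTE (SQLite through Python's sqlite3, used only to keep the example runnable):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "person_id INTEGER, first_job_month TEXT, months_since_first_order INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2018-04-01", 0), (1, "2018-04-01", 1),
        (2, "2018-04-01", 0),
        # person 3 orders twice in month 1; DISTINCT counts them once
        (3, "2018-05-01", 0), (3, "2018-05-01", 1), (3, "2018-05-01", 1),
    ],
)

# One row per (cohort, month offset) instead of one hardcoded column per offset.
rows = conn.execute("""
    SELECT first_job_month AS cohort,
           months_since_first_order,
           COUNT(DISTINCT person_id) AS active_users
    FROM orders
    GROUP BY 1, 2
    ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```

Note that this form counts customers who ordered in exactly that month, which corresponds to the = variant rather than the >= variant.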
