I have a table in Snowflake covering a date range, for example from 2019-01 to 2020-01. An ID can appear (match) multiple times on any date.

For example, my_table has two columns, dddate and id:

dddate     id
2019-02-03 607
2019-01-07 356
2019-08-06 491
2019-01-01 607
2019-12-17 529
2019-04-15 356

……

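Is there a way to find the total number of IDs that appear at least once in the current month and also at least once in the previous three months, grouped by month, so that the result shows the count for every month from 2019-04 (the first month for which the table has three prior months of data) through 2020-01?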

I am thinking of code along these lines:

WITH PREV_THREE AS (
SELECT 
  DATE_TRUNC('MONTH', dddate) AS MONTH, 
  ID AS CURR_ID
FROM my_table mt 
INNER JOIN
(
(
SELECT 
  MONTH(DATEADD(DATE_TRUNC('MONTH', dddate), -1, GETDATE())) AS PREV_MONTH, 
  ID AS PREV_3_MON_ID
FROM my_table
)
UNION ALL
(
SELECT 
  MONTH(DATEADD(DATE_TRUNC('MONTH', dddate), -2, GETDATE())) AS PREV_MONTH, 
  ID AS PREV_3_MON_ID
FROM my_table
)
UNION ALL
(
SELECT 
  MONTH(DATEADD(DATE_TRUNC('MONTH', dddate), -3, GETDATE())) AS PREV_MONTH, 
  ID AS PREV_3_MON_ID
FROM my_table 
)
) AS PREV_3_MON
ON mt.CURR_ID = PREV_3_MON.PREV_3_MON_ID
)
SELECT MONTH, COUNT(DISTINCT ID) AS COUNTER
FROM PREV_THREE
GROUP BY 1
ORDER BY 1

However, it returns an error and does not seem to work. Can anyone help me solve this? Thanks in advance!

1 Answer

You can use lag(). For the current month only, this returns the matching ids:

select distinct id
from (select t.*,
             lag(dddate) over (partition by id order by dddate) as prev_dddate
      from my_table t
     ) t
where dddate >= date_trunc('MONTH', current_date) and
      prev_dddate < date_trunc('MONTH', current_date) and
      prev_dddate >= date_trunc('MONTH', current_date) - interval '3 month';

You can do the same for every month at once:

select date_trunc('MONTH', dddate) as month_start,
       count(distinct id) as id_count
from (select t.*,
             lag(dddate) over (partition by id order by dddate) as prev_dddate
      from my_table t
     ) t
where prev_dddate < date_trunc('MONTH', dddate) and
      prev_dddate >= date_trunc('MONTH', dddate) - interval '3 month'
group by date_trunc('MONTH', dddate);

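Even if an id appears multiple times in a month, one of those rows will be the first in that month, and for that row lag() returns the id's most recent appearance before the month began.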
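
To check that reasoning outside the database, here is a minimal Python sketch that mimics the lag() logic on the six sample rows from the question. It is only an illustration, not Snowflake code: the names month_index and returning_ids_by_month are hypothetical helpers, and the sketch assumes "previous three months" means the three calendar months before the current one, matching the where clause in the query.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (dddate, id)
rows = [
    (date(2019, 2, 3), 607),
    (date(2019, 1, 7), 356),
    (date(2019, 8, 6), 491),
    (date(2019, 1, 1), 607),
    (date(2019, 12, 17), 529),
    (date(2019, 4, 15), 356),
]


def month_index(d):
    """Number months consecutively so month gaps are a plain subtraction."""
    return d.year * 12 + (d.month - 1)


def returning_ids_by_month(rows):
    """Count distinct ids per month whose previous appearance falls in
    one of the three preceding calendar months (hypothetical helper)."""
    dates_by_id = defaultdict(list)
    for d, id_ in rows:
        dates_by_id[id_].append(d)

    ids_by_month = defaultdict(set)
    for id_, dates in dates_by_id.items():
        dates.sort()
        # Pair each date with the one before it, like
        # lag(dddate) over (partition by id order by dddate).
        for prev, cur in zip(dates, dates[1:]):
            gap = month_index(cur) - month_index(prev)
            # gap == 0: same month, so this is not the first row of the month.
            # gap 1..3: previous appearance is within the three prior months.
            if 1 <= gap <= 3:
                ids_by_month[(cur.year, cur.month)].add(id_)
    return {m: len(ids) for m, ids in sorted(ids_by_month.items())}


counts = returning_ids_by_month(rows)
print(counts)  # {(2019, 2): 1, (2019, 4): 1}
```

On the sample rows, id 607 qualifies in 2019-02 (it was seen in 2019-01) and id 356 qualifies in 2019-04 (it was seen in 2019-01, three months earlier). Months with no qualifying id are simply absent, as they would be from the grouped query, and restricting the output to 2019-04 onward would need an extra filter on the month.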

Answered 2021-09-05T21:26:45.213