oracle - How to get the number of unique accounts week to date, with a rolling end date for the unique period

Question

Greatly simplified, here is a table with some sample data:

action_date account_id
1/1/2010    123
1/1/2010    123
1/1/2010    456
1/2/2010    123
1/3/2010    789

For the data above, I need a query that returns the following:

action_date num_events  num_unique_accounts  num_unique_accounts_wtd
1/1/2010    3           2                    2
1/2/2010    1           1                    2
1/3/2010    1           1                    3

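As you can see, num_unique_accounts_wtd counts the unique accounts over a period whose start is fixed at the beginning of the week and whose end date rolls forward with each action_date.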
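To pin down what the expected output means, here is a minimal Python sketch of the calculation, not an Oracle solution. It assumes weeks start on Monday (matching the `NEXT_DAY(action_date, 'Monday') - 7` expression used in the query further down); the names `ACTIONS`, `week_start`, and `week_to_date_report` are made up for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the question: (action_date, account_id).
ACTIONS = [
    (date(2010, 1, 1), 123),
    (date(2010, 1, 1), 123),
    (date(2010, 1, 1), 456),
    (date(2010, 1, 2), 123),
    (date(2010, 1, 3), 789),
]


def week_start(d):
    """Monday of the week containing d (assumption: Monday-based weeks)."""
    return d - timedelta(days=d.weekday())


def week_to_date_report(actions):
    """Rows of (action_date, num_events, num_unique_accounts, num_unique_accounts_wtd)."""
    by_day = defaultdict(list)
    for action_date, account_id in actions:
        by_day[action_date].append(account_id)

    report = []
    current_week = None
    seen_this_week = set()
    for action_date in sorted(by_day):
        # Reset the set of seen accounts whenever a new week begins.
        if week_start(action_date) != current_week:
            current_week = week_start(action_date)
            seen_this_week = set()
        accounts = by_day[action_date]
        seen_this_week.update(accounts)
        report.append(
            (action_date, len(accounts), len(set(accounts)), len(seen_this_week))
        )
    return report


REPORT = week_to_date_report(ACTIONS)
```

For the sample data, the three counts per day come out as (3, 2, 2), (1, 1, 2), and (1, 1, 3), which matches the table above.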

At first, one might think that a query of the form

WITH
    events AS
    (
        SELECT
            action_date
            , COUNT(account_id) num_events
            , COUNT(DISTINCT account_id) num_unique_accounts
        FROM     actions
        GROUP BY action_date
    )
SELECT
    action_date
    , num_events
    , num_unique_accounts
    , SUM(num_unique_accounts) OVER (PARTITION BY NEXT_DAY(action_date, 'Monday') - 7 ORDER BY action_date ASC) num_unique_accounts_wtd
FROM events

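would work, but on closer inspection it simply adds up each day's num_unique_accounts, so an account that appears on more than one day is counted again. To be clear, for 1/2/2010 it would give num_unique_accounts_wtd = 3, because 2 + 1, instead of 2.

Any ideas?

Edit: added one more row of data and output for clarity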
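The overcount is easy to reproduce outside the database. The following Python sketch compares a running sum of daily distinct counts with a running count of distinct accounts; it assumes all three sample days fall in the same week, and `DAILY_ACCOUNTS` and both function names are illustrative only.

```python
from datetime import date

# Distinct accounts per day, taken from the sample data in the question.
DAILY_ACCOUNTS = {
    date(2010, 1, 1): {123, 456},
    date(2010, 1, 2): {123},
    date(2010, 1, 3): {789},
}


def running_sum_of_daily_uniques(daily):
    """What SUM(num_unique_accounts) OVER (... ORDER BY action_date) computes."""
    total = 0
    result = []
    for day in sorted(daily):
        total += len(daily[day])
        result.append(total)
    return result


def running_distinct(daily):
    """What is actually wanted: distinct accounts seen so far in the week."""
    seen = set()
    result = []
    for day in sorted(daily):
        seen |= daily[day]
        result.append(len(seen))
    return result


SUMMED = running_sum_of_daily_uniques(DAILY_ACCOUNTS)   # [2, 3, 4]
DISTINCT = running_distinct(DAILY_ACCOUNTS)             # [2, 2, 3]
```

Account 123 appears on both 1/1 and 1/2, which is why the summed version drifts ahead of the distinct count from the second day on.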

score 0 · Accepted Answer

I would split the events query in two:

WITH
    events1 AS
    (
        SELECT 
               NEXT_DAY(action_date, 1) - 7 week
             , action_date             
             , COUNT(account_id) num_events
             , COUNT(DISTINCT account_id) num_unique_accounts
        FROM     actions
        GROUP BY action_date
    ),
    events2 AS
    (
        SELECT NEXT_DAY(action_date, 1) - 7 week               
             , COUNT(DISTINCT account_id) num_unique_accounts_wtd
        FROM     actions
        GROUP BY NEXT_DAY(action_date, 1)
    )
SELECT events1.*, events2.num_unique_accounts_wtd
  FROM events1, events2 
 WHERE events1.week = events2.week

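where events1 selects the number of distinct accounts per day, and events2 selects the number of distinct accounts per week.

Edit: I understand the requirement now. However, the only idea I have will be heavy if the actions table contains a very large number of rows: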

WITH
events AS
(
    SELECT 
           NEXT_DAY(action_date, 1) - 7 week
         , action_date             
         , COUNT(account_id) num_events
         , COUNT(DISTINCT account_id) num_unique_accounts
    FROM     actions
    GROUP BY action_date 
)      
SELECT events.*, 
      (SELECT COUNT(DISTINCT(account_id)) 
         FROM actions 
        WHERE action_date >= events.week
          AND action_date <= events.action_date) as num_unique_accounts_wtd
 FROM events
ORDER BY events.action_date

As you can see, the idea is to (re)count all the distinct account_id values for each row of the events subquery.
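The recount-per-row idea can be sketched in Python as follows. This assumes the intended filter is "from the start of the row's week through the row's own date" and that weeks start on Monday; the names `ACTIONS`, `week_start`, and `recount_per_day` are hypothetical.

```python
from datetime import date, timedelta

# Sample rows from the question: (action_date, account_id).
ACTIONS = [
    (date(2010, 1, 1), 123),
    (date(2010, 1, 1), 123),
    (date(2010, 1, 1), 456),
    (date(2010, 1, 2), 123),
    (date(2010, 1, 3), 789),
]


def week_start(d):
    """Monday of the week containing d (assumption: Monday-based weeks)."""
    return d - timedelta(days=d.weekday())


def recount_per_day(actions):
    """For every distinct day, rescan all rows and count the week's accounts so far."""
    result = {}
    for day in sorted({d for d, _ in actions}):
        start = week_start(day)
        # Full scan of the data for each day, like a correlated subquery.
        result[day] = len({acct for d, acct in actions if start <= d <= day})
    return result


WTD = recount_per_day(ACTIONS)
```

Each output row triggers a full pass over the data, which is the cost the answer warns about for large tables.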

score 0 · Accepted Answer

It seems the answer might be to modify the analytic function to include something of the form

COUNT(DISTINCT ...) OVER (PARTITION BY ... ORDER BY ... RANGE BETWEEN ... AND ...)

Because RANGE BETWEEN allows expressions, the PARTITION BY window could be narrowed further to get what we are looking for. Unfortunately, Oracle raises an

ORA-30487 DISTINCT functions and RATIO_TO_REPORT cannot have an ORDER BY

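error, so we cannot use it.

After googling the error, I found that others were attempting the same thing (here and here), and the links contained two answers, one of which worked for my real data.

For reference, the answer to this question, using the model from the original post, would be of the following form: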

SELECT    action_date, COUNT(account_id) num_attempts, MAX(num_accounts) num_unique_accounts_wtd
FROM
(
    SELECT
        action_date
        , account_id
        , SUM(is_unique) OVER (PARTITION BY NEXT_DAY(action_date, 'Monday') - 7 ORDER BY action_date ASC, account_id ASC) num_accounts
    FROM
    (
        SELECT
            action_date
            , account_id
            , CASE
                WHEN LAG(account_id) OVER (PARTITION BY NEXT_DAY(action_date, 'Monday') - 7, account_id ORDER BY action_date ASC) = account_id 
                THEN 0
                ELSE 1
            END is_unique
            FROM
                actions
    )
)
GROUP BY  action_date

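So the data is processed as follows:

1. Iterate over the rows and determine, for each account within each week, whether the row is the account's first (unique) appearance that week.
2. Then, for each week, order the set by action date and then account_id, and build a running total of the unique flags.
3. Group by action date and take the maximum running total for each date.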
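The steps above can be mirrored in a short Python sketch to check the logic against the sample data. It is an approximation of the SQL rather than a translation: it accumulates the running total row by row and assumes Monday-based weeks. `ACTIONS`, `week_start`, and `wtd_uniques_by_flagging` are illustrative names.

```python
from datetime import date, timedelta

# Sample rows from the question: (action_date, account_id).
ACTIONS = [
    (date(2010, 1, 1), 123),
    (date(2010, 1, 1), 123),
    (date(2010, 1, 1), 456),
    (date(2010, 1, 2), 123),
    (date(2010, 1, 3), 789),
]


def week_start(d):
    """Monday of the week containing d (assumption: Monday-based weeks)."""
    return d - timedelta(days=d.weekday())


def wtd_uniques_by_flagging(actions):
    # Step 1: flag the first appearance of each account within its week
    # (the role played by the LAG(...) CASE expression).
    seen = set()
    flagged = []
    for action_date, account_id in sorted(actions):  # date first, then account
        key = (week_start(action_date), account_id)
        flagged.append((action_date, 0 if key in seen else 1))
        seen.add(key)

    # Step 2: running total of the flags per week (the SUM(...) OVER step).
    # Step 3: keep the maximum running total per action_date (GROUP BY + MAX).
    running = {}
    result = {}
    for action_date, is_unique in flagged:
        week = week_start(action_date)
        running[week] = running.get(week, 0) + is_unique
        result[action_date] = max(result.get(action_date, 0), running[week])
    return result


WTD = wtd_uniques_by_flagging(ACTIONS)
```

For the sample data this yields 2, 2, and 3 for the three days, the values expected for num_unique_accounts_wtd.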
