sql - Avoiding duplicate calculations when filtering by an aggregate of an aggregate?

Question

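I'm trying to pull monthly sales for stores that sold more than 10,000 units per month in at least 6 months of the past year. My source sales table is daily. So I calculate sales for every month for every store, find which stores exceeded 10,000 units 6 times, and then use that list of stores as a filter on a query that calculates all months for the filtered stores.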

So I'm basically performing the same aggregate calculation, sum(units_sold), twice in the same query:

select
    store_location,
    sales_date - extract(day from sales_date) + 1 as sales_month,
    sum(units_sold) as monthly_sales,   /* I already calculated this!  How to re-use? */
    case when sum(units_sold) > 10000 then 1 else 0 end as exceeded_10000 

from
    daily_sales

where
    sales_date between '2012-01-01' and '2012-12-31' and
    store_location in (
        select
            store_location
        from (
                select
                    store_location,
                    sales_date - extract(day from sales_date) + 1 as sales_month,
                    case when sum(units_sold) > 10000 then 1 else 0 end as exceeded_10000   /* evaluated per month, per store */
                from
                    daily_sales
                where
                    sales_date between '2012-01-01' and '2012-12-31'
                group by
                    store_location,
                    sales_date - extract(day from sales_date) + 1
        ) a
        group by
            store_location
        having
            sum(exceeded_10000) > 6   /* which stores had 6 months over 10000 ? */ 
    )

group by
    store_location,
    sales_date - extract(day from sales_date) + 1

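This seems inefficient: I've already calculated sum(units_sold) by month in the inner (filtering) query, but I can't think of a way to reuse those monthly totals. You'll notice that query b necessarily doesn't group by month, because it adds up the number of months with sales over 10,000. That is the aggregate of an aggregate I mentioned in the title.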

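Teradata doesn't support a PIVOT function, and I'd rather not use a pile of CASE WHEN expressions to simulate a pivot and then check a single row for each month.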

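Is there a way to make this query more efficient, so that it reuses the monthly sales totals I've already calculated in the inner query? Could it be simplified on an RDBMS platform other than Teradata? Thank you.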

score 3 · Accepted Answer

Your query doesn't seem to do what you want. The outer query groups by month, which isn't necessary.

The following query returns the locations that had at least 6 months with sales over 10,000:

select store_location, sum(month_units_sold) as totalunits,
        sum(exceeded_10000) as months_over_10000
from (select store_location,
             sales_date - extract(day from sales_date) + 1 as sales_month,
             sum(units_sold) as month_units_sold,
             (case when sum(units_sold) > 10000 then 1 else 0 end) as exceeded_10000   /* evaluated per month, per store */
      from daily_sales
      where sales_date between '2012-01-01' and '2012-12-31'
      group by store_location, sales_date - extract(day from sales_date) + 1
    ) t
group by store_location
having sum(exceeded_10000) >= 6

If you want the information for each store by month, use a window function to count the months that exceeded the threshold:

select *
from (select t.*,
             SUM(exceeded_10000) over (partition by store_location) as MonthsExceeded10000
      from (select store_location,
                   sales_date - extract(day from sales_date) + 1 as sales_month,
                   sum(units_sold) as month_units_sold,
                   (case when sum(units_sold) > 10000 then 1 else 0 end) as exceeded_10000   /* evaluated per month, per store */
            from daily_sales
            where sales_date between '2012-01-01' and '2012-12-31'
            group by store_location, sales_date - extract(day from sales_date) + 1
          ) t
    ) t
where MonthsExceeded10000 >= 6
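
To try the windowed approach without a Teradata system, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. Several things here are assumptions rather than part of the answer: the sample data for stores "A" and "B" is made up, sales_date is stored as ISO text, strftime('%Y-%m-01', ...) stands in for the Teradata month-truncation expression, the threshold is written as >= 6 ("at least 6 months"), and the SQLite build must be 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table daily_sales (store_location text, sales_date text, units_sold integer)"
)

# Hypothetical data: two daily rows per store per month.
# Store A totals 11,000 units in each of 6 months; store B in only 2 months.
rows = []
for month in range(1, 13):
    a_units = 5500 if month <= 6 else 100
    b_units = 5500 if month <= 2 else 100
    for day in (1, 15):
        date = f"2012-{month:02d}-{day:02d}"
        rows.append(("A", date, a_units))
        rows.append(("B", date, b_units))
conn.executemany("insert into daily_sales values (?, ?, ?)", rows)

# The monthly totals are computed once in the innermost query; the window
# function then counts the qualifying months per store over those rows.
query = """
select store_location, sales_month, month_units_sold
from (select t.*,
             sum(exceeded_10000) over (partition by store_location) as months_exceeded
      from (select store_location,
                   strftime('%Y-%m-01', sales_date) as sales_month,
                   sum(units_sold) as month_units_sold,
                   case when sum(units_sold) > 10000 then 1 else 0 end as exceeded_10000
            from daily_sales
            where sales_date between '2012-01-01' and '2012-12-31'
            group by store_location, strftime('%Y-%m-01', sales_date)
           ) t
     ) t
where months_exceeded >= 6
order by store_location, sales_month
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

With this sample data only store A passes the filter, and all twelve of its monthly rows come back, including the months under 10,000. That is the point of the window version: the filter is decided per store, while the rows stay at month level.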
