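The simple approach would be to solve this with a CROSS JOIN, as demonstrated by @jpw. However, there are some hidden problems:
The performance of an unconditional CROSS JOIN deteriorates quickly as the number of rows grows. The total number of rows is multiplied by the number of weeks you are testing for before this huge derived table can be processed in the aggregation. Indexes can't help.
Starting weeks on January 1st leads to inconsistencies. ISO weeks might be an alternative. See below.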
All of the following queries make heavy use of an index on exam_date. Be sure to have one.
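If no such index exists yet, a minimal sketch for creating one might look like the following. The index name tbl_exam_date_idx is made up for illustration; tbl and exam_date are the names used in the queries here.

```sql
-- Hypothetical index name; tbl and exam_date are taken from the queries in this answer.
-- A plain B-tree index can serve range predicates such as
--   exam_date >= x AND exam_date < y
CREATE INDEX tbl_exam_date_idx ON tbl (exam_date);
```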
Only join to relevant rows
This should be much faster:
SELECT d.day, d.thisyr
     , count(t.exam_date) AS lastyr
FROM  (
   SELECT d.day::date, (d.day - '1 year'::interval)::date AS day0  -- for 2nd join
        , count(t.exam_date) AS thisyr
   FROM   generate_series('2013-01-01'::date
                        , '2013-01-31'::date  -- last week overlaps with Feb.
                        , '7 days'::interval) d(day)  -- returns timestamp
   LEFT   JOIN tbl t ON t.exam_date >= d.day::date
                    AND t.exam_date <  d.day::date + 7
   GROUP  BY d.day
   ) d
LEFT   JOIN tbl t ON t.exam_date >= d.day0  -- repeat with last year
                 AND t.exam_date <  d.day0 + 7
GROUP  BY d.day, d.thisyr
ORDER  BY d.day;
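This uses weeks starting on January 1st, like in your original. As commented, this produces a couple of inconsistencies: weeks start on a different weekday each year, and since we cut off at the end of the year, the last week of the year consists of just 1 or 2 days (2 in a leap year).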
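To illustrate the short last week, here is a small sketch that does not touch tbl at all. It steps through 2013 in 7-day increments from January 1st and reports where the final "week" starts and how many days remain before the next year begins. The column aliases are made up for illustration.

```sql
-- 2013 has 365 days = 52 * 7 + 1, so the 53rd "week" holds a single day.
SELECT max(d)::date                      AS last_week_start    -- 2013-12-31
     , date '2014-01-01' - max(d)::date  AS days_in_last_week  -- 1
FROM   generate_series(timestamp '2013-01-01'
                     , timestamp '2013-12-31'
                     , interval '7 days') d;
```

Running the same for the leap year 2012 (with date '2013-01-01' as the upper bound) should give 2012-12-30 and 2 days.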
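The same with ISO weeks
Depending on requirements, consider ISO weeks instead, which start on Mondays and always span 7 days. But they cross the border between years. Per the documentation on EXTRACT():
week
The number of the week of the year that the day is in. By definition (ISO 8601), weeks start on Mondays and the first week of a year contains January 4 of that year. In other words, the first Thursday of a year is in week 1 of that year.
In the ISO definition, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, and for late-December dates to be part of the first week of the next year. For example, 2005-01-01 is part of the 53rd week of year 2004, and 2006-01-01 is part of the 52nd week of year 2005, while 2012-12-31 is part of the first week of 2013. It's recommended to use the isoyear field together with week to get consistent results.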
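The three dates from the documentation excerpt can be checked directly. This is just a sketch to make the difference between the calendar year and the ISO year visible; the column aliases are made up for illustration.

```sql
SELECT d
     , EXTRACT(isoyear FROM d)::int AS isoyear
     , EXTRACT(week    FROM d)::int AS isoweek
     , EXTRACT(year    FROM d)::int AS calendar_year
FROM  (VALUES (date '2005-01-01')   -- isoyear 2004, week 53
            , (date '2006-01-01')   -- isoyear 2005, week 52
            , (date '2012-12-31')   -- isoyear 2013, week 1
      ) v(d);
```

Pairing week with year instead of isoyear would put 2012-12-31 into "week 1 of 2012", which is why the queries here filter on isoyear.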
The above query rewritten with ISO weeks:
SELECT w AS isoweek
     , day::text  AS thisyr_monday, thisyr_ct
     , day0::text AS lastyr_monday, count(t.exam_date) AS lastyr_ct
FROM  (
   SELECT w, day
        , date_trunc('week', '2012-01-04'::date)::date + 7 * w AS day0
        , count(t.exam_date) AS thisyr_ct
   FROM  (
      SELECT w
           , date_trunc('week', '2013-01-04'::date)::date + 7 * w AS day
      FROM   generate_series(0, 4) w
      ) d
   LEFT   JOIN tbl t ON t.exam_date >= d.day
                    AND t.exam_date <  d.day + 7
   GROUP  BY d.w, d.day
   ) d
LEFT   JOIN tbl t ON t.exam_date >= d.day0  -- repeat with last year
                 AND t.exam_date <  d.day0 + 7
GROUP  BY d.w, d.day, d.day0, d.thisyr_ct
ORDER  BY d.w, d.day;
January 4th is always in the first ISO week of the year. So this expression gets the date of the Monday of the first ISO week of the given year:
date_trunc('week', '2012-01-04'::date)::date
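As a quick sanity check, the same expression can be evaluated for the three years used in the queries here. This is only a sketch; the column aliases are made up, and the year is spliced into the date string purely for illustration.

```sql
SELECT y AS isoyear
     , date_trunc('week', (y || '-01-04')::date)::date AS first_iso_monday
FROM   generate_series(2012, 2014) y;
-- 2012 -> 2012-01-02
-- 2013 -> 2012-12-31  (the ISO year 2013 starts in calendar year 2012)
-- 2014 -> 2013-12-30
```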
Since ISO weeks coincide with the week numbers returned by EXTRACT(), we can simplify the query. First, a short and simple form:
SELECT w AS isoweek
     , COALESCE(thisyr_ct, 0) AS thisyr_ct
     , COALESCE(lastyr_ct, 0) AS lastyr_ct
FROM   generate_series(1, 5) w
LEFT   JOIN (
   SELECT EXTRACT(week FROM exam_date)::int AS w, count(*) AS thisyr_ct
   FROM   tbl
   WHERE  EXTRACT(isoyear FROM exam_date)::int = 2013
   GROUP  BY 1
   ) t13 USING (w)
LEFT   JOIN (
   SELECT EXTRACT(week FROM exam_date)::int AS w, count(*) AS lastyr_ct
   FROM   tbl
   WHERE  EXTRACT(isoyear FROM exam_date)::int = 2012
   GROUP  BY 1
   ) t12 USING (w);
Optimized query
The same with more details and optimized for performance:
WITH params AS (  -- enter parameters here, once
   SELECT date_trunc('week', '2012-01-04'::date)::date AS last_start
        , date_trunc('week', '2013-01-04'::date)::date AS this_start
        , date_trunc('week', '2014-01-04'::date)::date AS next_start
        , 1 AS week_1
        , 5 AS week_n  -- show weeks 1 - 5
   )
SELECT w.w AS isoweek
     , p.this_start + 7 * (w - 1) AS thisyr_monday
     , COALESCE(t13.ct, 0) AS thisyr_ct
     , p.last_start + 7 * (w - 1) AS lastyr_monday
     , COALESCE(t12.ct, 0) AS lastyr_ct
FROM   params p
     , generate_series(p.week_1, p.week_n) w(w)
LEFT   JOIN (
   SELECT EXTRACT(week FROM t.exam_date)::int AS w, count(*) AS ct
   FROM   tbl t, params p
   WHERE  t.exam_date >= p.this_start  -- only relevant dates
   AND    t.exam_date <  p.this_start + 7 * (p.week_n - p.week_1 + 1)::int
-- AND    t.exam_date <  p.next_start  -- don't cross over into next year
   GROUP  BY 1
   ) t13 USING (w)
LEFT   JOIN (  -- same for last year
   SELECT EXTRACT(week FROM t.exam_date)::int AS w, count(*) AS ct
   FROM   tbl t, params p
   WHERE  t.exam_date >= p.last_start
   AND    t.exam_date <  p.last_start + 7 * (p.week_n - p.week_1 + 1)::int
-- AND    t.exam_date <  p.this_start
   GROUP  BY 1
   ) t12 USING (w);
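This should be very fast with index support and can easily be adapted to intervals of your choice. The implicit JOIN LATERAL for generate_series() in the last query requires Postgres 9.3.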
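To make the LATERAL dependency concrete, here is a reduced sketch with only the params CTE and the week generator. The first statement spells out the lateral join explicitly; the second is one possible way to avoid it on older versions by using scalar subqueries, offered as an assumption about what fits your setup rather than as part of the original query.

```sql
-- Explicit form of the implicit lateral reference (Postgres 9.3 or later):
WITH params AS (SELECT 1 AS week_1, 5 AS week_n)
SELECT w
FROM   params p
CROSS  JOIN LATERAL generate_series(p.week_1, p.week_n) w;

-- Possible alternative without LATERAL: feed the bounds in as scalar subqueries.
WITH params AS (SELECT 1 AS week_1, 5 AS week_n)
SELECT w
FROM   generate_series((SELECT week_1 FROM params)
                     , (SELECT week_n FROM params)) w;
```

Both statements return the week numbers 1 through 5, one per row.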
SQL Fiddle.