I have a query that I think follows a fairly common pattern. Consider this table:
id | val | ts
---+-----+-------
a | 10 | 12:01
a | 12 | 12:05
a | 9 | 12:15
b | 30 | 12:03
I want to get the latest value for each id, by timestamp. A few ways to do it:
-- where in aggregate subquery
-- we avoid this because it's slow for our purposes
select
    id, val
from t
where (id, ts) in
    (select
        id,
        max(ts)
    from t
    group by id);
-- analytic ranking
select
    id, val
from
    (select
        row_number() over (partition by id order by ts desc) as rank,
        id,
        val
    from t) ranked
where rank = 1;
-- distincting analytic
-- distinct effectively dedupes the rows that end up with same values
select
    distinct id, val
from
    (select
        id,
        first_value(val) over (partition by id order by ts desc) as val
    from t) ranked;
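For anyone who wants to try these, here is a minimal harness that loads the sample table and checks that all three queries return the same rows. It is only a sketch: it assumes Python's standard-library sqlite3 module with SQLite 3.25 or newer (for window functions), which is not our production database, and it stores ts as text. I also renamed the rank alias to rn here to stay clear of keyword clashes in other engines.

```python
import sqlite3

# Sample rows from the table above; ts is kept as text, which sorts
# correctly for same-day HH:MM values.
ROWS = [("a", 10, "12:01"), ("a", 12, "12:05"), ("a", 9, "12:15"), ("b", 30, "12:03")]

QUERIES = {
    "in_subquery": """
        select id, val
        from t
        where (id, ts) in (select id, max(ts) from t group by id)
    """,
    "row_number": """
        select id, val
        from (select row_number() over (partition by id order by ts desc) as rn,
                     id, val
              from t) ranked
        where rn = 1
    """,
    "first_value": """
        select distinct id, val
        from (select id,
                     first_value(val) over (partition by id order by ts desc) as val
              from t) ranked
    """,
}


def latest_per_id(query_name):
    """Run one of the queries against a fresh in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (id text, val integer, ts text)")
        conn.executemany("insert into t values (?, ?, ?)", ROWS)
        return sorted(conn.execute(QUERIES[query_name]).fetchall())
    finally:
        conn.close()


results = {name: latest_per_id(name) for name in QUERIES}
for name, rows in results.items():
    print(name, rows)
```

This only shows the queries agree on the sample data. It says nothing about the query plans or performance on a real table.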
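The analytic ranking query feels like the one for which it is easiest to come up with an efficient query plan. But in terms of aesthetics and maintenance it is very ugly (especially when the table has more than one value column). In a few places in production we use the distincting analytic query instead, where testing showed its performance to be comparable.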
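To make the maintenance complaint concrete, here is a sketch of both analytic variants with two value columns. The table name t_wide and the column val2 are hypothetical, made up only for this illustration, and the harness again assumes Python's sqlite3 module with SQLite 3.25 or newer rather than our real database.

```python
import sqlite3

# val2 is a hypothetical second value column, added only for illustration.
ROWS_WIDE = [
    ("a", 10, "x", "12:01"),
    ("a", 12, "y", "12:05"),
    ("a", 9, "z", "12:15"),
    ("b", 30, "w", "12:03"),
]

# Ranking variant: every value column is listed twice,
# once inside the subquery and once in the outer select.
RANKED_WIDE = """
    select id, val, val2
    from (select row_number() over (partition by id order by ts desc) as rn,
                 id, val, val2
          from t_wide) ranked
    where rn = 1
"""

# Distincting variant: every value column needs its own first_value(...)
# with the same window specification repeated.
DISTINCT_WIDE = """
    select distinct id, val, val2
    from (select id,
                 first_value(val) over (partition by id order by ts desc) as val,
                 first_value(val2) over (partition by id order by ts desc) as val2
          from t_wide) ranked
"""


def run_wide(sql):
    """Run a query against a fresh in-memory copy of the wider sample table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t_wide (id text, val integer, val2 text, ts text)")
        conn.executemany("insert into t_wide values (?, ?, ?, ?)", ROWS_WIDE)
        return sorted(conn.execute(sql).fetchall())
    finally:
        conn.close()


ranked_rows = run_wide(RANKED_WIDE)
distinct_rows = run_wide(DISTINCT_WIDE)
print(ranked_rows)
print(distinct_rows)
```

Each extra value column adds another line in two places in either variant, which is the part that gets hard to maintain.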
Is there a way to do something like rank = 1 without ending up with such an ugly query?