
Is it possible to apply multiple window functions to the same partition? (Please correct me if I am not using the right vocabulary.)

For example, you can do:

SELECT name, first_value() over (partition by name order by date) from table1

But is there a way to do something like:

SELECT name, (first_value() as f, last_value() as l (partition by name order by date)) from table1

where the two functions are applied to the same window?

Reference: http://postgresql.ro/docs/8.4/static/tutorial-window.html


2 Answers


Can't you just specify the window for each selected expression?

Something like:

SELECT  name, 
        first_value() OVER (partition by name order by date) as f, 
        last_value() OVER (partition by name order by date) as l 
from table1

Also, from your reference, you can do this:

SELECT sum(salary) OVER w, avg(salary) OVER w
FROM empsalary
WINDOW w AS (PARTITION BY depname ORDER BY salary DESC)
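As a rough, runnable sketch of the named-window form applied to the question's own case, the snippet below runs the SQL through Python's built-in sqlite3 module. Several things here are assumptions rather than part of the original: SQLite (3.25 or later) stands in for PostgreSQL, the `amount` column and the sample rows are made up, and first_value/last_value are given an argument because they need one to run. The explicit frame is there because, with ORDER BY and the default frame, last_value would only see rows up to the current one.

```python
import sqlite3

# Assumption: SQLite >= 3.25 (window function support) stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, date TEXT, amount INTEGER);  -- amount is hypothetical
    INSERT INTO table1 VALUES
        ('ann', '2009-01-01', 10),
        ('ann', '2009-02-01', 20),
        ('ann', '2009-03-01', 30),
        ('bob', '2009-01-15', 5),
        ('bob', '2009-02-15', 7);
""")

# Both functions share the single named window w.
# The explicit frame makes last_value look at the whole partition
# instead of stopping at the current row.
query = """
    SELECT name,
           date,
           first_value(amount) OVER w AS f,
           last_value(amount)  OVER w AS l
    FROM table1
    WINDOW w AS (
        PARTITION BY name
        ORDER BY date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    )
    ORDER BY name, date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same SELECT and WINDOW clause should carry over to PostgreSQL with the table and column names adjusted, but treat that as something to check against your version.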
answered 2009-12-13T10:29:01.733

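Warning: I am not deleting this answer because it seems technically correct and may therefore be helpful, but beware that PARTITION BY bar ORDER BY foo is probably not what you want anyway. With an ORDER BY in the window, an aggregate function is not computed over the partition as a whole: the default frame runs from the start of the partition to the current row (plus any rows that tie with it in the ordering), so you get a running value. That is, SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo) is not equivalent to SELECT avg(foo) OVER (PARTITION BY bar) (see the proof at the end of this answer).

Although it does not improve performance in itself, if you use the same partition several times you probably want the second syntax proposed by astander, and not only because it is shorter to write. Here is why.

Consider the following query: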

SELECT 
  array_agg(foo)
    OVER (PARTITION BY bar ORDER BY foo), 
  avg(baz)
    OVER (PARTITION BY bar ORDER BY foo) 
FROM 
  foobar;

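Since in principle the ordering has no effect on the computation of an average, you might be tempted to use the following query instead (with no ordering on the second partition):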

SELECT 
  array_agg(foo) 
    OVER (PARTITION BY bar ORDER BY foo), 
  avg(baz)
    OVER (PARTITION BY bar) 
FROM 
  foobar;

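This is a big mistake, because it takes longer: the two different window definitions require two WindowAgg passes instead of one. Proof: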

> EXPLAIN ANALYZE SELECT array_agg(foo) OVER (PARTITION BY bar ORDER BY foo), avg(baz) OVER (PARTITION BY bar ORDER BY foo) FROM foobar;
                                                           QUERY PLAN                                                        
---------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=215781.92..254591.76 rows=1724882 width=12) (actual time=969.659..2353.865 rows=1724882 loops=1)
   ->  Sort  (cost=215781.92..220094.12 rows=1724882 width=12) (actual time=969.640..1083.039 rows=1724882 loops=1)
         Sort Key: bar, foo
         Sort Method: quicksort  Memory: 130006kB
         ->  Seq Scan on foobar  (cost=0.00..37100.82 rows=1724882 width=12) (actual time=0.027..393.815 rows=1724882 loops=1)
 Total runtime: 2458.969 ms
(6 rows)

> EXPLAIN ANALYZE SELECT array_agg(foo) OVER (PARTITION BY bar ORDER BY foo), avg(baz) OVER (PARTITION BY bar) FROM foobar;
                                                              QUERY PLAN                                                           
---------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=215781.92..276152.79 rows=1724882 width=12) (actual time=938.733..2958.811 rows=1724882 loops=1)
   ->  WindowAgg  (cost=215781.92..250279.56 rows=1724882 width=12) (actual time=938.699..2033.172 rows=1724882 loops=1)
         ->  Sort  (cost=215781.92..220094.12 rows=1724882 width=12) (actual time=938.683..1062.568 rows=1724882 loops=1)
               Sort Key: bar, foo
               Sort Method: quicksort  Memory: 130006kB
               ->  Seq Scan on foobar  (cost=0.00..37100.82 rows=1724882 width=12) (actual time=0.028..377.299 rows=1724882 loops=1)
 Total runtime: 3060.041 ms
(7 rows)

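Now, if you are aware of this issue, you will of course use the same partition everywhere. But when the same partition appears ten times or more and you keep editing the query over several days, it is quite easy to forget to add the ORDER BY clause to a partition that does not need it by itself.

This is where the WINDOW syntax comes in: it prevents such careless mistakes (provided, of course, that you know it is better to minimize the number of distinct window definitions). The following is strictly equivalent to the first query (as far as I can tell from EXPLAIN ANALYZE):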

SELECT
  array_agg(foo)
    OVER qux,
  avg(baz)
    OVER qux
FROM
  foobar
WINDOW
  qux AS (PARTITION BY bar ORDER BY foo)
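A small way to check that equivalence on toy data, sketched with Python's built-in sqlite3 module. The assumptions: SQLite (3.25 or later) stands in for PostgreSQL, group_concat stands in for array_agg (which SQLite does not have), the four sample rows are made up, and the window is ordered by foo to match the first query. This compares results only; it says nothing about the query plan.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for PostgreSQL; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foobar (foo INTEGER, bar INTEGER, baz INTEGER);
    INSERT INTO foobar VALUES (1, 1, 10), (2, 2, 20), (3, 1, 30), (4, 2, 40);
""")

# Window definition repeated inline for each function.
inline_query = """
    SELECT group_concat(foo) OVER (PARTITION BY bar ORDER BY foo),
           avg(baz)          OVER (PARTITION BY bar ORDER BY foo)
    FROM foobar
    ORDER BY bar, foo
"""

# Same window defined once and referenced by name.
named_query = """
    SELECT group_concat(foo) OVER qux,
           avg(baz)          OVER qux
    FROM foobar
    WINDOW qux AS (PARTITION BY bar ORDER BY foo)
    ORDER BY bar, foo
"""

inline_rows = conn.execute(inline_query).fetchall()
named_rows = conn.execute(named_query).fetchall()
print(inline_rows == named_rows)
print(named_rows)
```

With the window written once, a later edit to the ordering or partitioning changes every function that uses it at the same time.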

Update for the warning:

I understand that the statement "SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo) is not equivalent to SELECT avg(foo) OVER (PARTITION BY bar)" may seem questionable, so here is an example:

# SELECT * FROM foobar;
 foo | bar 
-----+-----
   1 |   1
   2 |   2
   3 |   1
   4 |   2
(4 rows)

# SELECT array_agg(foo) OVER qux, avg(foo) OVER qux FROM foobar WINDOW qux AS (PARTITION BY bar);
 array_agg | avg 
-----------+-----
 {1,3}     |   2
 {1,3}     |   2
 {2,4}     |   3
 {2,4}     |   3
(4 rows)

# SELECT array_agg(foo) OVER qux, avg(foo) OVER qux FROM foobar WINDOW qux AS (PARTITION BY bar ORDER BY foo);
 array_agg | avg 
-----------+-----
 {1}       |   1
 {1,3}     |   2
 {2}       |   2
 {2,4}     |   3
(4 rows)
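If you want to keep one ordered window and still have the aggregate cover the whole partition, one option is an explicit frame clause. This is a sketch that goes beyond the example above, run through Python's built-in sqlite3 module. The assumptions: SQLite (3.25 or later) stands in for PostgreSQL, and group_concat stands in for array_agg. The data is the same four-row foobar table shown above.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for PostgreSQL;
# group_concat stands in for array_agg.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foobar (foo INTEGER, bar INTEGER);
    INSERT INTO foobar VALUES (1, 1), (2, 2), (3, 1), (4, 2);
""")

# The window keeps ORDER BY foo, but the explicit frame spans the whole
# partition, so avg is no longer a running average.
query = """
    SELECT group_concat(foo) OVER qux, avg(foo) OVER qux
    FROM foobar
    WINDOW qux AS (
        PARTITION BY bar
        ORDER BY foo
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    )
    ORDER BY bar, foo
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every row of a partition now sees the same list and the same average, as in the PARTITION BY bar case, while the list itself stays ordered by foo.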
answered 2014-10-01T12:13:23.640