15

I have a database view that essentially consists of two SELECT queries combined with UNION ALL, like this:

CREATE VIEW v AS
SELECT time, etc. FROM t1 -- #1...
UNION ALL
SELECT time, etc. FROM t2 -- #2...

The problem is that selects of the form

SELECT ... FROM v WHERE time >= ... AND time < ...

run really slowly against it.

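Both SELECT #1 and #2 are reasonably fast, properly indexed and so on. When I create views v1 and v2 like this:

CREATE VIEW v1 AS
SELECT time, etc. FROM t1 -- #1...

CREATE VIEW v2 AS
SELECT time, etc. FROM t2 -- #2...

the same SELECT as above, with the same WHERE condition, works fine on each of them individually.

Any ideas about where the problem might be and how to solve it?

(Just to mention, it is one of the recent Postgres versions.)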

Edit: adding anonymized query plans (thanks to @filiprem for the link to an awesome tool):

v1:

Aggregate  (cost=9825.510..9825.520 rows=1 width=53) (actual time=59.995..59.995 rows=1 loops=1)
  ->  Index Scan using delta on echo alpha  (cost=0.000..9815.880 rows=3850 width=53) (actual time=0.039..53.418 rows=33122 loops=1)
          Index Cond: (("juliet" >= 'seven'::uniform bravo_victor oscar whiskey) AND ("juliet" <= 'november'::uniform bravo_victor oscar whiskey))
          Filter: ((NOT victor) AND ((bravo_sierra five NULL) OR ((bravo_sierra)::golf <> 'india'::golf)))

v2:

Aggregate  (cost=15.470..15.480 rows=1 width=33) (actual time=0.231..0.231 rows=1 loops=1)
  ->  Index Scan using yankee on six charlie  (cost=0.000..15.220 rows=99 width=33) (actual time=0.035..0.186 rows=140 loops=1)
          Index Cond: (("juliet" >= 'seven'::uniform bravo oscar whiskey) AND ("juliet" <= 'november'::uniform bravo oscar whiskey))
          Filter: (NOT victor)

v:

Aggregate  (cost=47181.850..47181.860 rows=1 width=0) (actual time=37317.291..37317.291 rows=1 loops=1)
  ->  Append  (cost=42.170..47132.480 rows=3949 width=97) (actual time=1.277..37304.453 rows=33262 loops=1)
        ->  Nested Loop Left Join  (cost=42.170..47052.250 rows=3850 width=99) (actual time=1.275..37288.465 rows=33122 loops=1)
              ->  Hash Left Join  (cost=42.170..9910.990 rows=3850 width=115) (actual time=1.123..117.797 rows=33122 loops=1)
                      Hash Cond: ((alpha_seven.two)::golf = (quebec_three.two)::golf)
                    ->  Index Scan using delta on echo alpha_seven  (cost=0.000..9815.880 rows=3850 width=132) (actual time=0.038..77.866 rows=33122 loops=1)
                            Index Cond: (("juliet" >= 'seven'::uniform bravo_victor oscar whiskey_two) AND ("juliet" <= 'november'::uniform bravo_victor oscar whiskey_two))
                            Filter: ((NOT victor) AND ((bravo_sierra five NULL) OR ((bravo_sierra)::golf <> 'india'::golf)))
                    ->  Hash  (cost=30.410..30.410 rows=941 width=49) (actual time=1.068..1.068 rows=941 loops=1)
                            Buckets: 1024  Batches: 1  Memory Usage: 75kB
                          ->  Seq Scan on alpha_india quebec_three  (cost=0.000..30.410 rows=941 width=49) (actual time=0.010..0.486 rows=941 loops=1)
              ->  Index Scan using mike on hotel quebec_sierra  (cost=0.000..9.630 rows=1 width=24) (actual time=1.112..1.119 rows=1 loops=33122)
                      Index Cond: ((alpha_seven.zulu)::golf = (quebec_sierra.zulu)::golf)
        ->  Subquery Scan on "*SELECT* 2"  (cost=34.080..41.730 rows=99 width=38) (actual time=1.081..1.951 rows=140 loops=1)
              ->  Merge Right Join  (cost=34.080..40.740 rows=99 width=38) (actual time=1.080..1.872 rows=140 loops=1)
                      Merge Cond: ((quebec_three.two)::golf = (charlie.two)::golf)
                    ->  Index Scan using whiskey_golf on alpha_india quebec_three  (cost=0.000..174.220 rows=941 width=49) (actual time=0.017..0.122 rows=105 loops=1)
                    ->  Sort  (cost=18.500..18.750 rows=99 width=55) (actual time=0.915..0.952 rows=140 loops=1)
                            Sort Key: charlie.two
                            Sort Method:  quicksort  Memory: 44kB
                          ->  Index Scan using yankee on six charlie  (cost=0.000..15.220 rows=99 width=55) (actual time=0.022..0.175 rows=140 loops=1)
                                  Index Cond: (("juliet" >= 'seven'::uniform bravo_victor oscar whiskey_two) AND ("juliet" <= 'november'::uniform bravo_victor oscar whiskey_two))
                                  Filter: (NOT victor)


8 Answers

10

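This looks like a case of pilot error. The "v" query plan selects from at least 5 different tables.

Now, are you sure you are connected to the right database? Maybe there are some funky search_path settings? Maybe t1 and t2 are actually views (possibly in a different schema)? Maybe you are somehow selecting from the wrong view?

Edited after clarification:

You are using a quite new feature called "join removal": http://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.0#Join_Removal

http://rhaas.blogspot.com/2010/06/why-join-removal-is-cool.html

It appears that the feature does not kick in when UNION ALL is involved. You will probably have to rewrite the view using only the two tables that are actually required.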
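To make the join removal idea concrete, here is a minimal sketch. The tables `orders` and `customers` and the view `orders_with_customer` are hypothetical and not taken from the question; the point is only that a LEFT JOIN on a unique key can be dropped by the planner when no column of the joined table is used.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE customers (
    id   integer PRIMARY KEY,   -- the unique key is what makes removal possible
    name text
);
CREATE TABLE orders (
    id          integer PRIMARY KEY,
    customer_id integer,
    created_at  timestamp NOT NULL
);

CREATE VIEW orders_with_customer AS
SELECT o.id, o.created_at, c.name
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id;

-- No column of customers is referenced, and a LEFT JOIN on a unique key
-- can neither add nor drop rows, so the planner may skip the join.
EXPLAIN SELECT count(*) FROM orders_with_customer;

-- c.name is referenced here, so the join has to stay in the plan.
EXPLAIN SELECT id, name FROM orders_with_customer;
```

Comparing the two EXPLAIN outputs shows whether the join was removed. Running the same comparison against the UNION ALL view would show whether the joins survive there, as they do in the "v" plan above.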

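Another edit: you appear to be using an aggregate (such as "select count(*) from v" as opposed to "select * from v"), which can produce vastly different plans when join removal is in play. I guess we won't get very far unless you post the actual queries, the view and table definitions, and the plans used...

answered 2012-01-30T18:02:40.077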
6

I believe your query is being executed in a way similar to:

(
   ( SELECT time, etc. FROM t1 // #1... )
   UNION ALL
   ( SELECT time, etc. FROM t2 // #2... )
)
WHERE time >= ... AND time < ...

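which the optimizer has trouble optimizing. That is, it performs the UNION ALL first, before applying the WHERE clause, whereas you want it to apply the WHERE clause before the UNION ALL.

Couldn't you put your WHERE clause in the CREATE VIEW?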

CREATE VIEW v AS
( SELECT time, etc. FROM t1  WHERE time >= ... AND time < ... )
UNION ALL
( SELECT time, etc. FROM t2  WHERE time >= ... AND time < ... )

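Alternatively, if the view cannot contain the WHERE clause, perhaps you can keep the two views and do the UNION ALL with the WHERE clause at the point where you need it:

CREATE VIEW v1 AS
SELECT time, etc. FROM t1 -- #1...

CREATE VIEW v2 AS
SELECT time, etc. FROM t2 -- #2...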

( SELECT * FROM v1 WHERE time >= ... AND time < ... )
UNION ALL
( SELECT * FROM v2 WHERE time >= ... AND time < ... )
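
Whether the WHERE clause is applied before or after the UNION ALL can be checked with EXPLAIN. The sketch below assumes a simplified schema: only the names t1, t2, v and the time column come from the question, while the `payload` column, the index names and the date literals are made up for illustration.

```sql
-- Assumed, simplified schema; payload and the index names are hypothetical.
CREATE TABLE t1 ("time" timestamp NOT NULL, payload text);
CREATE TABLE t2 ("time" timestamp NOT NULL, payload text);
CREATE INDEX t1_time_idx ON t1 ("time");
CREATE INDEX t2_time_idx ON t2 ("time");

CREATE VIEW v AS
SELECT "time", payload FROM t1
UNION ALL
SELECT "time", payload FROM t2;

-- If the predicate is pushed down, each branch under the Append node carries
-- its own condition on "time" (ideally as an Index Cond). If it is not,
-- a single Filter sits above the Append node instead.
EXPLAIN
SELECT count(*) FROM v
WHERE "time" >= '2012-01-01' AND "time" < '2012-02-01';
```

In the plan posted in the question, the range condition does appear as an Index Cond under both branches of the Append node, so it is worth running this check before restructuring the view.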
answered 2012-02-03T20:06:24.100
2

I don't know Postgres, but some RDBMSs handle comparison operators worse than BETWEEN when indexes are involved. I would try using BETWEEN.

SELECT ... FROM v WHERE time BETWEEN ... AND ...
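
One thing to keep in mind when trying this: BETWEEN includes both endpoints, so it is not a drop-in replacement for the half-open range `time >= ... AND time < ...` used in the question. A small sketch with made-up dates shows the difference at the upper bound:

```sql
-- x BETWEEN a AND b is shorthand for x >= a AND x <= b (both ends inclusive).
SELECT d,
       d >= DATE '2012-01-01' AND d < DATE '2012-02-01'  AS half_open,
       d BETWEEN DATE '2012-01-01' AND DATE '2012-02-01' AS between_inclusive
FROM (VALUES (DATE '2012-01-31'), (DATE '2012-02-01')) AS s(d);
-- 2012-01-31: both columns are true
-- 2012-02-01: half_open is false, between_inclusive is true
```

To keep the same result set, the upper bound passed to BETWEEN has to be adjusted accordingly.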
answered 2012-02-05T07:55:19.790
1

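Combine the two tables. Add a column to indicate the original table. If necessary, replace the original table names with views that select only the relevant part. Problem solved!

Looking into the superclass/subclass database design pattern could be useful to you.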
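
A minimal sketch of that layout, assuming a hypothetical combined table `events` with a `source` discriminator column and a made-up `payload` column (none of these names appear in the question):

```sql
-- One table holds the rows of both former tables; source says where a row came from.
CREATE TABLE events (
    source  text      NOT NULL CHECK (source IN ('t1', 't2')),
    "time"  timestamp NOT NULL,
    payload text
);
CREATE INDEX events_time_idx ON events ("time");

-- Views standing in for the original tables, each selecting only its own part.
CREATE VIEW t1_rows AS
SELECT "time", payload FROM events WHERE source = 't1';

CREATE VIEW t2_rows AS
SELECT "time", payload FROM events WHERE source = 't2';

-- The range query now hits a single indexed table, with no UNION ALL involved.
SELECT count(*)
FROM events
WHERE "time" >= '2012-01-01' AND "time" < '2012-02-01';
```

This only works if the two tables share, or can be brought to share, the columns the view exposes.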

answered 2012-02-06T02:01:39.890
1

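One possibility would be to issue new SQL dynamically on each call instead of creating a view, integrating the WHERE clause into each SELECT of the union query: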

SELECT time, etc. FROM t1
    WHERE time >= ... AND time < ...
UNION ALL
SELECT time, etc. FROM t2
    WHERE time >= ... AND time < ...
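
If the statement is issued repeatedly with different bounds, a prepared statement is one way to keep the predicate inside each branch without building SQL strings. This is only a sketch: the `payload` column, the statement name `v_range` and the date literals are assumptions, not part of the question.

```sql
-- $1 and $2 are the lower and upper bound of the half-open time range.
PREPARE v_range(timestamp, timestamp) AS
SELECT "time", payload FROM t1
    WHERE "time" >= $1 AND "time" < $2
UNION ALL
SELECT "time", payload FROM t2
    WHERE "time" >= $1 AND "time" < $2;

EXECUTE v_range('2012-01-01', '2012-02-01');

-- Prepared statements live until the session ends or they are deallocated.
DEALLOCATE v_range;
```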

EDIT:

Can you use a parameterized function?

CREATE OR REPLACE FUNCTION CallMyView(p_from date, p_to date)
RETURNS TABLE(d date, etc.)
AS $$
    BEGIN
        -- Parameters are named p_from/p_to so they cannot be confused with the tables t1/t2.
        RETURN QUERY
            SELECT time, etc. FROM t1
                WHERE time >= p_from AND time < p_to
            UNION ALL
            SELECT time, etc. FROM t2
                WHERE time >= p_from AND time < p_to;
    END;
$$ LANGUAGE plpgsql;

Call:

SELECT * FROM CallMyView(..., ...);
answered 2012-01-30T18:03:29.973
0

Try creating your view using UNION DISTINCT instead of UNION ALL. See if it gives wrong results. See if it gives faster performance.

If it gives wrong results, try to map your SQL operations on tables back to relational operations on relations. The elements of relations are always distinct. There may be something fundamentally wrong with your model.

I am deeply suspicious of the LEFT JOINS in the query plan you showed. It shouldn't be necessary to perform LEFT JOINS in order to get the results you appear to be selecting.
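
For reference, the difference between the two set operators can be seen on a tiny made-up data set. UNION DISTINCT (the same as plain UNION) removes duplicate rows from the combined result, which is extra work that UNION ALL skips:

```sql
-- UNION ALL keeps every row from both inputs: 1, 1, 2, 2 (four rows).
SELECT x FROM (VALUES (1), (1), (2)) AS a(x)
UNION ALL
SELECT x FROM (VALUES (2)) AS b(x);

-- UNION DISTINCT removes duplicates: 1, 2 (two rows, order not guaranteed).
SELECT x FROM (VALUES (1), (1), (2)) AS a(x)
UNION DISTINCT
SELECT x FROM (VALUES (2)) AS b(x);
```

So the two forms are interchangeable only when the combined rows are already distinct.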

answered 2012-02-05T08:19:49.993
0

Ran into the same scenario on Oracle 11g:

Scenario 1:

CREATE VIEW v AS
  SELECT time, etc. FROM t1 -- #1...

The following query runs fast, and the plan looks fine:

SELECT ... FROM v WHERE time >= ... AND time < ...

Scenario 2:

CREATE VIEW v AS
  SELECT time, etc. FROM t2 -- #2...

The following query runs fast, and the plan looks fine:

SELECT ... FROM v WHERE time >= ... AND time < ...

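Scenario 3, with UNION ALL:

CREATE VIEW v AS
  SELECT time, etc. FROM t1 -- #1...
  UNION ALL
  SELECT time, etc. FROM t2 -- #2...

The following runs slowly. The plan breaks apart t1 and t2 (which are themselves views) and assembles them into one big series of unions. The time filters are applied correctly to the individual components, but it is still very slow: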

SELECT ... FROM v WHERE time >= ... AND time < ...

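I would have been happy with a time in the ballpark of t1 plus t2, but it was more than double that. Adding the parallel hint did the trick for me in this case. It rearranged everything into a better plan: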

SELECT /*+ parallel */ ... FROM v WHERE time >= ... AND time < ...
answered 2015-05-06T20:12:39.070
-3

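I don't think I have enough points to post this as a comment, so I am posting it as an answer.

I don't know how PostgreSQL works behind the scenes. I think I would know if it were Oracle, so here is how Oracle would handle it.

Your UNION ALL view is slower because, behind the scenes, the records from both SELECT #1 and #2 are first combined in a temporary table that is created on the fly, and then your SELECT ... FROM v WHERE time >= ... AND time < ... is executed against this temporary table. Since both #1 and #2 are indexed, they run faster individually, as expected. The temporary table, however, is not indexed (of course), and the final records are selected from it, which results in a slower response.

Now, at least, I don't see any way to have all three at once: faster, a view, and non-materialized.

One way, other than explicitly running SELECT #1 and #2 and UNIONing them, would be to use a stored procedure or a function in your application programming language (if that is the case), make separate calls to each indexed table inside that procedure, and then combine the results. That is not as simple as SELECT ... FROM v WHERE time >= ... AND time < ... :(

answered 2012-01-30T17:52:22.147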