performance - Straightforward query is much slower than a subquery with a join

Question

I have 2 tables. Their structure is roughly as follows; I have changed the names, though.

CREATE TABLE overlay_polygon
(
  overlay_polygon_id SERIAL PRIMARY KEY,
  some_other_polygon_id INTEGER REFERENCES some_other_polygon (some_other_polygon_id),
  dollar_value NUMERIC,
  geom GEOMETRY(Polygon,26915)
);

CREATE TABLE point
(
  point_id SERIAL PRIMARY KEY,
  some_other_polygon_id INTEGER REFERENCES some_other_polygon (some_other_polygon_id),
  -- A bunch of other fields that this query won't touch
  geom GEOMETRY(Point,26915)
);

point has a spatial index on its geom column, named spix_point, and also has an index on its some_other_polygon_id column.

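There are about 500,000 rows in point, and almost all of them intersect some row in overlay_polygon. Originally, my overlay_polygon table contained several rows with very small areas (most of them less than 1 square meter) that did not spatially intersect any row in point. After deleting the small rows that do not intersect any row in point, 38 rows remain.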

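As the name implies, overlay_polygon is a table of polygons generated by overlaying polygons from 3 other tables (including some_other_polygon). In particular, I need to use dollar_value and some columns on point to do some calculations. When I set out to delete the rows that intersect no points, to speed up future processing, I ended up querying for the COUNT of points per row. The most obvious query seemed to be the following.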

SELECT op.*, COUNT(point_id) AS num_points
FROM overlay_polygon op
LEFT JOIN point ON op.some_other_polygon_id = point.some_other_polygon_id AND ST_Intersects(op.geom, point.geom)
GROUP BY op.overlay_polygon_id
ORDER BY op.overlay_polygon_id
;

Here is its EXPLAIN (ANALYZE, BUFFERS).

GroupAggregate  (cost=544.45..545.12 rows=38 width=8049) (actual time=284962.944..540959.914 rows=38 loops=1)
  Buffers: shared hit=58694 read=17119, temp read=189483 written=189483
  I/O Timings: read=39171.525
  ->  Sort  (cost=544.45..544.55 rows=38 width=8049) (actual time=271754.952..534154.573 rows=415224 loops=1)
        Sort Key: op.overlay_polygon_id
        Sort Method: external merge  Disk: 897016kB
        Buffers: shared hit=58694 read=17119, temp read=189483 written=189483
        I/O Timings: read=39171.525
        ->  Nested Loop Left Join  (cost=0.00..543.46 rows=38 width=8049) (actual time=0.110..46755.284 rows=415224 loops=1)
              Buffers: shared hit=58694 read=17119
              I/O Timings: read=39171.525
              ->  Seq Scan on overlay_polygon op  (cost=0.00..11.38 rows=38 width=8045) (actual time=0.043..153.255 rows=38 loops=1)
                    Buffers: shared hit=1 read=10
                    I/O Timings: read=152.866
              ->  Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) (actual time=50.229..1139.868 rows=10927 loops=38)
                    Index Cond: (op.geom && geom)
                    Filter: ((op.some_other_polygon_id = some_other_polygon_id) AND _st_intersects(op.geom, geom))
                    Rows Removed by Filter: 13353
                    Buffers: shared hit=58693 read=17109
                    I/O Timings: read=39018.660
Total runtime: 542172.156 ms

However, I found that this query runs much, much faster:

SELECT *
FROM overlay_polygon
JOIN (SELECT op.overlay_polygon_id, COUNT(point_id) AS num_points
      FROM overlay_polygon op
      LEFT JOIN point ON op.some_other_polygon_id = point.some_other_polygon_id AND ST_Intersects(op.geom, point.geom)
      GROUP BY op.overlay_polygon_id
     ) x ON x.overlay_polygon_id = overlay_polygon.overlay_polygon_id
ORDER BY overlay_polygon.overlay_polygon_id
;

Its EXPLAIN (ANALYZE, BUFFERS) is below.

Sort  (cost=557.78..557.88 rows=38 width=8057) (actual time=18904.661..18904.748 rows=38 loops=1)
  Sort Key: overlay_polygon.overlay_polygon_id
  Sort Method: quicksort  Memory: 126kB
  Buffers: shared hit=58690 read=17134
  I/O Timings: read=9924.328
  ->  Hash Join  (cost=544.88..556.78 rows=38 width=8057) (actual time=18903.697..18904.210 rows=38 loops=1)
        Hash Cond: (overlay_polygon.overlay_polygon_id = op.overlay_polygon_id)
        Buffers: shared hit=58690 read=17134
        I/O Timings: read=9924.328
        ->  Seq Scan on overlay_polygon  (cost=0.00..11.38 rows=38 width=8045) (actual time=0.127..0.411 rows=38 loops=1)
              Buffers: shared hit=2 read=9
              I/O Timings: read=0.173
        ->  Hash  (cost=544.41..544.41 rows=38 width=12) (actual time=18903.500..18903.500 rows=38 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 2kB
              Buffers: shared hit=58688 read=17125
              I/O Timings: read=9924.154
              ->  HashAggregate  (cost=543.65..544.03 rows=38 width=8) (actual time=18903.276..18903.379 rows=38 loops=1)
                    Buffers: shared hit=58688 read=17125
                    I/O Timings: read=9924.154
                    ->  Nested Loop Left Join  (cost=0.00..543.46 rows=38 width=8) (actual time=0.052..17169.606 rows=415224 loops=1)
                          Buffers: shared hit=58688 read=17125
                          I/O Timings: read=9924.154
                          ->  Seq Scan on overlay_polygon op  (cost=0.00..11.38 rows=38 width=8038) (actual time=0.004..0.537 rows=38 loops=1)
                                Buffers: shared hit=1 read=10
                                I/O Timings: read=0.279
                          ->  Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) (actual time=4.422..381.991 rows=10927 loops=38)
                                Index Cond: (op.geom && geom)
                                Filter: ((op.some_other_polygon_id = some_other_polygon_id) AND _st_intersects(op.geom, geom))
                                Rows Removed by Filter: 13353
                                Buffers: shared hit=58687 read=17115
                                I/O Timings: read=9923.875
Total runtime: 18905.293 ms

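As you can see, they have comparable cost estimates, although I am not sure how accurate those estimates are. I am dubious about cost estimates that involve PostGIS functions. Both tables have had VACUUM FULL ANALYZE run on them since they were last modified and before the queries were run.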

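Perhaps I simply cannot read my EXPLAIN ANALYZE outputs, but I do not see why these queries have such drastically different run times. Can anyone identify anything? The only possibility I can think of is related to the LEFT JOIN.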

EDIT 1

Per the suggestion of @ChrisTravers, I increased work_mem and reran the first query. I do not believe this represents a significant improvement.

Executed

SET work_mem='4MB';

(It was 1 MB.)

Then executing the first query gave these results.

GroupAggregate  (cost=544.45..545.12 rows=38 width=8049) (actual time=339910.046..495775.478 rows=38 loops=1)
  Buffers: shared hit=58552 read=17261, temp read=112133 written=112133
  ->  Sort  (cost=544.45..544.55 rows=38 width=8049) (actual time=325391.923..491329.208 rows=415224 loops=1)
        Sort Key: op.overlay_polygon_id
        Sort Method: external merge  Disk: 896904kB
        Buffers: shared hit=58552 read=17261, temp read=112133 written=112133
        ->  Nested Loop Left Join  (cost=0.00..543.46 rows=38 width=8049) (actual time=14.698..234266.573 rows=415224 loops=1)
              Buffers: shared hit=58552 read=17261
              ->  Seq Scan on overlay_polygon op  (cost=0.00..11.38 rows=38 width=8045) (actual time=14.612..15.384 rows=38 loops=1)
                    Buffers: shared read=11
              ->  Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) (actual time=95.262..5451.636 rows=10927 loops=38)
                    Index Cond: (op.geom && geom)
                    Filter: ((op.some_other_polygon_id = some_other_polygon_id) AND _st_intersects(op.geom, geom))
                    Rows Removed by Filter: 13353
                    Buffers: shared hit=58552 read=17250
Total runtime: 496936.775 ms

EDIT 2

Well, here is a nice big smell that I had not noticed before (mostly because I have trouble reading the ANALYZE output). Sorry I did not notice it sooner.

Sort  (cost=544.45..544.55 rows=38 width=8049) (actual time=271754.952..534154.573 rows=415224 loops=1)

Estimated rows: 38. Actual rows: over 400K. Ideas, anyone?

score 2 · Accepted Answer

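My immediate thought is that this may have to do with work_mem limits. The difference between the plans is that in the first you join and then aggregate, while in the second you aggregate and then join. This means that the set being aggregated is narrower, which means that the operation uses less memory.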
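
The width difference is visible in the posted plans: the first sorts 415,224 rows at an estimated width of 8049, while the second aggregates rows of width 8. One way to keep the wide `op.*` columns out of the aggregation altogether is a correlated scalar subquery. This is a minimal sketch that was not run against the asker's data; it should return the same counts as the `LEFT JOIN` form (a polygon with no matching points gets 0), but the plan chosen will depend on the PostgreSQL and PostGIS versions.

```sql
-- Sketch: count matching points per polygon without sorting wide rows.
-- COUNT(*) over an empty match set yields 0, matching the LEFT JOIN form.
SELECT op.*,
       (SELECT COUNT(*)
          FROM point p
         WHERE p.some_other_polygon_id = op.some_other_polygon_id
           AND ST_Intersects(op.geom, p.geom)) AS num_points
  FROM overlay_polygon op
 ORDER BY op.overlay_polygon_id;
```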

It would be interesting to see what changes if you double work_mem and try again.
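
As a sketch of how to try this without touching the server configuration: `work_mem` can be changed for the current session and reset afterwards. The value `'2MB'` is simply double the 1 MB setting mentioned in the question. Note that the sort in the first plan spilled roughly 897 MB to disk, so a small increase may not be enough to keep that sort in memory.

```sql
SHOW work_mem;          -- inspect the current setting
SET work_mem = '2MB';   -- affects only this session
-- rerun the first query here with EXPLAIN (ANALYZE, BUFFERS)
-- and check whether "Sort Method" still reports "external merge"
RESET work_mem;         -- restore the configured value
```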

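Edit: Now that we know that increasing work_mem brings only a modest improvement, the next issue is the row estimate for the sort. I suspect that the sort really is exceeding work_mem here: the planner expects it to be easy because it expects only 38 rows, but it gets a great many rows instead. It is not clear to me where the planner gets this figure, because it is quite clear that the planner (correctly) estimates 38 rows as the number we expect out of the aggregate. This part is starting to look like a planner bug to me, but I am having a hard time putting my finger on it. It might be worth writing up and raising on the pgsql-general mailing list. It almost looks to me as if the planner is confusing the memory required for the sort with the memory required for the aggregate.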

score 1 · Accepted Answer

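As you outlined in EDIT 2, there is indeed a big mismatch between the estimated and actual number of rows returned. But the root of the problem is further down the tree, here: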

Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) 
   (actual time=95.262..5451.636 rows=10927 loops=38)

This affects all nodes further up the tree, namely the Nested Loop and the Sort.
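
To look at this misestimate at its source, the inner side can be examined on its own. The following is a sketch that assumes a row with `overlay_polygon_id = 1` exists (a placeholder; substitute any real id). Compare the planner's `rows=` estimate on the index scan with the actual row count it reports.

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
  FROM overlay_polygon op
  JOIN point p ON op.geom && p.geom   -- bounding-box test, as in the Index Cond
 WHERE op.overlay_polygon_id = 1;     -- hypothetical id
```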

I would try the following:

First, make sure the statistics are up to date:

VACUUM ANALYZE point;
VACUUM ANALYZE overlay_polygon;
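
To confirm when the statistics were last gathered, the standard `pg_stat_user_tables` view can be queried. A small sketch, assuming both tables live in a schema on the current `search_path`:

```sql
SELECT relname, n_live_tup, last_analyze, last_autoanalyze
  FROM pg_stat_user_tables
 WHERE relname IN ('point', 'overlay_polygon');
```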

If that does not help, increase the statistics target for the geometry columns:

ALTER TABLE point ALTER geom SET STATISTICS 500;
ALTER TABLE overlay_polygon ALTER geom SET STATISTICS 1500;

Then analyze the tables again.
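
A sketch of that step, followed by a catalog query to confirm the per-column targets took effect. `attstattarget` is a column of the system catalog `pg_attribute`; a column with no explicit target shows the "use default" marker instead of a number.

```sql
ANALYZE point;
ANALYZE overlay_polygon;

-- Confirm the per-column statistics targets that were set
SELECT attrelid::regclass AS table_name, attname, attstattarget
  FROM pg_attribute
 WHERE attrelid IN ('point'::regclass, 'overlay_polygon'::regclass)
   AND attname = 'geom';
```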

IMHO, Nested Loops is not a good fit here; a Hash join would be more appropriate. Try issuing:
```
SET enable_nestloop TO off;
```
at the session level and see whether it helps.
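
To keep the setting from leaking into later statements, it can be scoped to a single transaction with `SET LOCAL`. This is a sketch using the first query from the question. Keep in mind that `enable_nestloop = off` only penalizes nested loops heavily rather than forbidding them, and that a hash join can only use the `some_other_polygon_id` equality, with `ST_Intersects` applied as a join filter.

```sql
BEGIN;
SET LOCAL enable_nestloop TO off;   -- reverts at COMMIT or ROLLBACK
EXPLAIN (ANALYZE, BUFFERS)
SELECT op.*, COUNT(point_id) AS num_points
  FROM overlay_polygon op
  LEFT JOIN point ON op.some_other_polygon_id = point.some_other_polygon_id
       AND ST_Intersects(op.geom, point.geom)
 GROUP BY op.overlay_polygon_id
 ORDER BY op.overlay_polygon_id;
ROLLBACK;
```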

After looking at the query some more, I think it is also worth raising the statistics target for the some_other_polygon_id column:

ALTER TABLE point ALTER some_other_polygon_id SET STATISTICS 5000;
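
The new target only takes effect once the table is analyzed again. As a sketch, the standard `pg_stats` view then shows what the planner knows about the column's value distribution:

```sql
ANALYZE point;

SELECT n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE tablename = 'point'
   AND attname = 'some_other_polygon_id';
```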

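Also, I see no reason why your second query should be so much faster than the first. Am I right in saying that both queries were executed only once, and on a "cold" database? It really feels as if the second query benefited from the OS file system cache and therefore ran much faster.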

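Using spix_point here is a wrong decision by the planner, since point will be scanned in full to fulfill the LEFT JOIN. One way to improve the query might therefore be to force a Seq Scan on this table. This can be done with the help of a CTE: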

WITH p AS (SELECT point_id, some_other_polygon_id, geom FROM point)
SELECT op.*, COUNT(p.point_id) AS num_points
  FROM overlay_polygon op
  LEFT JOIN p ON op.some_other_polygon_id = p.some_other_polygon_id
       AND ST_Intersects(op.geom, p.geom)
 GROUP BY op.overlay_polygon_id
 ORDER BY op.overlay_polygon_id;

But this will slow things down at the materialization step. Still, give it a try.
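
Whether the CTE actually acts as a barrier depends on the server version. Up to PostgreSQL 11 a CTE is always materialized. From PostgreSQL 12 on, a non-recursive, side-effect-free CTE that is referenced only once is inlined into the outer query by default, in which case the planner is again free to pick spix_point. On those versions the `MATERIALIZED` keyword requests the older behavior. A sketch of the same query for PostgreSQL 12 or later:

```sql
WITH p AS MATERIALIZED (
    SELECT point_id, some_other_polygon_id, geom FROM point
)
SELECT op.*, COUNT(p.point_id) AS num_points
  FROM overlay_polygon op
  LEFT JOIN p ON op.some_other_polygon_id = p.some_other_polygon_id
       AND ST_Intersects(op.geom, p.geom)
 GROUP BY op.overlay_polygon_id
 ORDER BY op.overlay_polygon_id;
```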
