postgresql - Postgres 9.4: How to fix the query planner's choice of a hash join that runs 10x slower in an ANY(ARRAY) lookup

Question

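I realize that tracking down problems like this can be complicated and needs a lot of information, but I'm hoping there is a known issue or workaround for this particular case. I have narrowed down the change to the query that causes the suboptimal query plan (this is running Postgres 9.4).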

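The following query runs in about 50 ms. The tag_device table is a junction table with about 2 million entries, the devices table has about 1.5 million entries, and the tags table has about 500,000 entries (note: the actual IP values are made up).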

WITH inner_query AS (
  SELECT * FROM tag_device
  INNER JOIN tags  ON tag_device.tag_id = tags.id
  INNER JOIN devices ON tag_device.device_id = devices.id
  WHERE devices.device_ip <<= ANY(ARRAY[
    '10.0.0.1', '10.0.0.2', '10.0.0.5', '11.1.1.1', '12.2.2.35','13.0.0.1', '15.0.0.8', '1.160.0.1', '17.1.1.24', '18.2.2.1',
    '10.0.0.6', '10.0.0.21', '10.0.0.52', '11.1.1.2', '12.2.2.34','13.0.0.2', '15.0.0.7', '1.160.0.2', '17.1.1.23', '18.2.2.2',
    '10.0.0.7', '10.0.0.22', '10.0.0.53', '11.1.1.3', '12.2.2.33','13.0.0.3', '15.0.0.6', '1.160.0.3', '17.1.1.22', '18.2.2.3'
    ]::iprange[])
 )
 SELECT * FROM inner_query LIMIT 100 OFFSET 0;

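A few things to note. device_ip uses the ip4r module ( https://github.com/RhodiumToad/ip4r ) to provide IP range lookups, and there is a gist index on this column. The query above runs in about 50 ms with the following query plan: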

Limit  (cost=140367.19..140369.19 rows=100 width=239)
  CTE inner_query
    ->  Nested Loop  (cost=40147.63..140367.19 rows=56193 width=431)
          ->  Merge Join  (cost=40147.20..113345.15 rows=56193 width=261)
                Merge Cond: (tag_device.device_id = devices.id)
                ->  Index Scan using tag_device_device_id_idx on tag_device  (cost=0.43..67481.36 rows=1900408 width=51)
                ->  Materialize  (cost=40136.82..40402.96 rows=53228 width=210)
                      ->  Sort  (cost=40136.82..40269.89 rows=53228 width=210)
                            Sort Key: devices.id
                            ->  Bitmap Heap Scan on devices  (cost=1489.12..30498.45 rows=53228 width=210)
                                  Recheck Cond: (device_ip <<= ANY ('{10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.2,13.0.0.1,15.0.0.2,1.160.0.5,17.1.1.1,18.2.2.2,10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.2,13.0.0.1,15.0.0.2,1.160.0.5,17.1.1.1,18.2.2.2 (...)
                                  ->  Bitmap Index Scan on devices_iprange_idx  (cost=0.00..1475.81 rows=53228 width=0)
                                        Index Cond: (device_ip <<= ANY ('{10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.2,13.0.0.1,15.0.0.2,1.160.0.5,17.1.1.1,18.2.2.2,10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.2,13.0.0.1,15.0.0.2,1.160.0.5,17.1.1.1,18.2 (...)
          ->  Index Scan using tags_id_pkey on tags  (cost=0.42..0.47 rows=1 width=170)
                Index Cond: (id = tag_device.tag_id)
  ->  CTE Scan on inner_query  (cost=0.00..1123.86 rows=56193 width=239)

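If I increase the number of IP addresses in the ARRAY being looked up, the query plan changes and the query becomes very slow. The fast version of the query has 30 items in the array. If I increase that to 80 items, the plan changes and the query becomes significantly slower (more than 10x). The query is otherwise identical. The new plan uses hash joins instead of the merge join and nested loop. This is the new (much slower) query plan with 80 items in the array instead of 30: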

Limit  (cost=204482.39..204484.39 rows=100 width=239)
  CTE inner_query
    ->  Hash Join  (cost=85839.13..204482.39 rows=146180 width=431)
          Hash Cond: (tag_device.tag_id = tags.id)
          ->  Hash Join  (cost=51368.40..145023.34 rows=146180 width=261)
                Hash Cond: (tag_device.device_id = devices.id)
                ->  Seq Scan on tag_device  (cost=0.00..36765.08 rows=1900408 width=51)
                ->  Hash  (cost=45580.57..45580.57 rows=138466 width=210)
                      ->  Bitmap Heap Scan on devices  (cost=3868.31..45580.57 rows=138466 width=210)
                            Recheck Cond: (device_ip <<= ANY ('{10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.35,13.0.0.1,15.0.0.8,1.160.0.1,17.1.1.24,18.2.2.1,10.0.0.6,10.0.0.21,10.0.0.52,11.1.1.2,12.2.2.34,13.0.0.2,15.0.0.7,1.160.0.2,17.1.1.23,18.2.2.2 (...)
                            ->  Bitmap Index Scan on devices_iprange_idx  (cost=0.00..3833.70 rows=138466 width=0)
                                  Index Cond: (device_ip <<= ANY ('{10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.35,13.0.0.1,15.0.0.8,1.160.0.1,17.1.1.24,18.2.2.1,10.0.0.6,10.0.0.21,10.0.0.52,11.1.1.2,12.2.2.34,13.0.0.2,15.0.0.7,1.160.0.2,17.1.1.23,18.2 (...)
          ->  Hash  (cost=16928.88..16928.88 rows=475188 width=170)
                ->  Seq Scan on tags  (cost=0.00..16928.88 rows=475188 width=170)
  ->  CTE Scan on inner_query  (cost=0.00..2923.60 rows=146180 width=239)

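With this default plan, the query above takes about 500 ms (more than 10x slower). If I turn off hash joins with SET enable_hashjoin = OFF; the planner goes back to using a merge join, and the query again runs in about 50 ms with 80 items in the array.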
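A minimal sketch of how that workaround could be scoped to a single query instead of the whole session, assuming the query can be run inside an explicit transaction. SET LOCAL reverts at COMMIT or ROLLBACK, so other queries on the same connection keep the default planner settings. The array is shortened to two of the made-up addresses from the question purely for illustration.

```sql
BEGIN;

-- Applies only until the end of this transaction.
SET LOCAL enable_hashjoin = off;

WITH inner_query AS (
  SELECT * FROM tag_device
  INNER JOIN tags    ON tag_device.tag_id = tags.id
  INNER JOIN devices ON tag_device.device_id = devices.id
  -- Illustrative short list; the real query passes 80 items here.
  WHERE devices.device_ip <<= ANY(ARRAY['10.0.0.1', '10.0.0.2']::iprange[])
)
SELECT * FROM inner_query LIMIT 100 OFFSET 0;

COMMIT;
```

This only hides the symptom for this one statement. It does not change how the planner estimates costs for the ANY(ARRAY) lookup.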

Again, the only change here is the number of items in the ARRAY being looked up.

Does anyone have any idea why the planner makes the wrong choice here, causing such a large slowdown?

The database fits entirely in memory and is on SSDs.

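I'd also like to point out that I'm using a CTE because I ran into a problem where the planner would not use the index on the tag_device table once I added a LIMIT to the query. It is basically the problem described here: http://thebuild.com/blog/2014/11/18/when-limit-attacks/.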

Thanks!

score 0 · Accepted Answer

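I see that there is a sort as part of the merge join. Once a certain threshold is exceeded, the sort needed to perform the merge join is considered too expensive, and the hash join is estimated to be cheaper. Running the query this way may be more expensive in terms of time but cheaper in terms of CPU consumption.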
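One way to check this explanation might be to compare the planner's estimates with what actually happens at run time. The plans in the question show only estimated costs and row counts; EXPLAIN (ANALYZE, BUFFERS) also executes the query and reports actual rows and timings per node. The following is a sketch under the assumption that running the query for real is acceptable; the array is shortened to two of the question's made-up addresses for illustration.

```sql
-- Executes the query and prints estimated vs. actual rows and time per plan node.
EXPLAIN (ANALYZE, BUFFERS)
WITH inner_query AS (
  SELECT * FROM tag_device
  INNER JOIN tags    ON tag_device.tag_id = tags.id
  INNER JOIN devices ON tag_device.device_id = devices.id
  -- Illustrative short list; substitute the full 30-item or 80-item array.
  WHERE devices.device_ip <<= ANY(ARRAY['10.0.0.1', '10.0.0.2']::iprange[])
)
SELECT * FROM inner_query LIMIT 100 OFFSET 0;
```

If the actual row count on the Bitmap Heap Scan on devices turns out far below the estimate (rows=138466 in the slow plan), that would suggest the row estimate for the array lookup, rather than the sort cost alone, is what pushes the planner toward the hash join.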
