postgresql - PostGIS ST_Intersects query does not use the existing spatial index

Question

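I have a table of suburbs, where each suburb has a geom value representing its multipolygon on the map. There is another table of houses, where each house has a geom value for its point on the map.

Both geom columns are indexed with GiST, and the suburbs table also has an index on its name column. The suburbs table has 8k+ records, while the houses table has 300k+ records.

My task is to find all houses within the suburb named 'FOO'.

Query #1: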

SELECT * FROM houses WHERE ST_INTERSECTS((SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO'), geom);

Query plan:

Seq Scan on houses  (cost=8.29..86327.26 rows=102365 width=136)
  Filter: st_intersects($0, geom)
  InitPlan 1 (returns $0)
    ->  Index Scan using suburbs_suburb_name on suburbs  (cost=0.28..8.29 rows=1 width=32)
          Index Cond: ((suburb_name)::text = 'FOO'::text)

The query takes about 3.5 seconds to run and returns 486 records.

Query #2: (prefixing the ST_INTERSECTS function with _ to explicitly ask it not to use the index)

SELECT * FROM houses WHERE _ST_INTERSECTS((SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO'), geom);

Query plan: (exactly the same as Query #1)

Seq Scan on houses  (cost=8.29..86327.26 rows=102365 width=136)
  Filter: st_intersects($0, geom)
  InitPlan 1 (returns $0)
    ->  Index Scan using suburbs_suburb_name on suburbs  (cost=0.28..8.29 rows=1 width=32)
          Index Cond: ((suburb_name)::text = 'FOO'::text)

The query takes about 1.7 seconds to run and returns 486 records.

Query #3: (using the && operator to add a bounding-box overlap check before the ST_Intersects function)

SELECT * FROM houses WHERE (geom && (SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO')) AND ST_INTERSECTS((SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO'), geom);

Query plan:

Bitmap Heap Scan on houses  (cost=21.11..146.81 rows=10 width=136)
  Recheck Cond: (geom && $0)
  Filter: st_intersects($1, geom)
  InitPlan 1 (returns $0)
    ->  Index Scan using suburbs_suburb_name on suburbs  (cost=0.28..8.29 rows=1 width=32)
          Index Cond: ((suburb_name)::text = 'FOO'::text)
  InitPlan 2 (returns $1)
    ->  Index Scan using suburbs_suburb_name on suburbs suburbs_1  (cost=0.28..8.29 rows=1 width=32)
          Index Cond: ((suburb_name)::text = 'FOO'::text)
  ->  Bitmap Index Scan on houses_geom_gist  (cost=0.00..4.51 rows=31 width=0)
        Index Cond: (geom && $0)

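The query takes 0.15 seconds to run and returns 486 records.

Clearly, only Query #3 benefits from the spatial index, which improves performance significantly. However, the syntax is ugly and somewhat repetitive. My questions are:

Why is PostGIS not smart enough to use the spatial index in Query #1?
Why does Query #2 perform (much) better than Query #1, given that neither of them uses the index?
Any suggestions for making Query #3 prettier? Or is there a better way to construct a query that does the same thing?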

score 3 · Accepted Answer

Try flattening it into a single query, without the unnecessary subqueries:

SELECT houses.*
FROM houses, suburbs
WHERE suburbs.suburb_name = 'FOO' AND ST_Intersects(houses.geom, suburbs.geom);
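
As a minimal sketch, the same flattened query can be written with explicit JOIN syntax and wrapped in EXPLAIN ANALYZE, so the plan can be checked for a scan on houses_geom_gist (the index name that appears in the Query #3 plan). A likely reason the flattened form helps, assuming a PostGIS version before 3.0: ST_Intersects there is an SQL wrapper around a && bounding-box test plus _ST_Intersects, and the planner generally does not inline that wrapper when one of its arguments is a scalar subquery, so the indexable && operator never becomes visible. The aliases h and s are introduced here for illustration only.

```sql
-- Flattened query with explicit JOIN syntax.
-- EXPLAIN ANALYZE runs the query and prints the actual plan and timings,
-- which shows whether the GiST index on houses.geom is used.
EXPLAIN ANALYZE
SELECT h.*
FROM suburbs AS s
JOIN houses AS h
  ON ST_Intersects(h.geom, s.geom)  -- both arguments are plain column references
WHERE s.suburb_name = 'FOO';
```

If the plan still shows a sequential scan on houses, running ANALYZE on both tables to refresh planner statistics is a reasonable next check.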
