postgresql - Trivial ORDER BY with a double precision column: performance collapse

Question

Table columns:

id BIGINT
geo_point POINT (PostGIS)
stroke_when TIMESTAMPTZ (indexed!)
stroke_when_second DOUBLE PRECISION

PostgreSQL 9.1, PostGIS 2.0.

1. Query:

SELECT ST_AsText(geo_point) 
FROM lightnings 
ORDER BY stroke_when DESC, stroke_when_second DESC 
LIMIT 1

Total runtime: 31100.911 ms!

EXPLAIN (ANALYZE on, VERBOSE off, COSTS on, BUFFERS on):

Limit  (cost=169529.67..169529.67 rows=1 width=144) (actual time=31100.869..31100.869 rows=1 loops=1)
  Buffers: shared hit=3343 read=120342
  ->  Sort  (cost=169529.67..176079.48 rows=2619924 width=144) (actual time=31100.865..31100.865 rows=1 loops=1)
        Sort Key: stroke_when, stroke_when_second
        Sort Method: top-N heapsort  Memory: 17kB
        Buffers: shared hit=3343 read=120342
        ->  Seq Scan on lightnings  (cost=0.00..156430.05 rows=2619924 width=144) (actual time=1.589..29983.410 rows=2619924 loops=1)
              Buffers: shared hit=3339 read=120342

2. Selecting a different column:

SELECT id 
FROM lightnings 
ORDER BY stroke_when DESC, stroke_when_second DESC 
LIMIT 1

Total runtime: 2144.057 ms.

EXPLAIN (ANALYZE on, VERBOSE off, COSTS on, BUFFERS on):

Limit  (cost=162979.86..162979.86 rows=1 width=24) (actual time=2144.013..2144.014 rows=1 loops=1)
  Buffers: shared hit=3513 read=120172
  ->  Sort  (cost=162979.86..169529.67 rows=2619924 width=24) (actual time=2144.011..2144.011 rows=1 loops=1)
        Sort Key: stroke_when, stroke_when_second
        Sort Method: top-N heapsort  Memory: 17kB
        Buffers: shared hit=3513 read=120172
        ->  Seq Scan on lightnings  (cost=0.00..149880.24 rows=2619924 width=24) (actual time=0.056..1464.904 rows=2619924 loops=1)
              Buffers: shared hit=3509 read=120172

3. Properly optimized:

SELECT id 
FROM lightnings 
ORDER BY stroke_when DESC 
LIMIT 1

Total runtime: 0.044 ms

EXPLAIN (ANALYZE on, VERBOSE off, COSTS on, BUFFERS on):

Limit  (cost=0.00..3.52 rows=1 width=16) (actual time=0.020..0.020 rows=1 loops=1)
  Buffers: shared hit=5
  ->  Index Scan Backward using lightnings_idx on lightnings  (cost=0.00..9233232.80 rows=2619924 width=16) (actual time=0.018..0.018 rows=1 loops=1)
        Buffers: shared hit=5

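As you can see, the query is trivial and fast when the optimizer uses the index, yet there are two serious and quite distinct problems:

Even if the optimizer does not use the index, why does selecting ST_AsText(geo_point) instead of id take so much longer? The result is only one row!
The index on the first sort column cannot be used as soon as an unindexed column appears in ORDER BY. Note that in practice only a few rows in the DB share the same second.

Of course, the above is a simplified query extracted from a more complex one. Usually I select rows by date range and apply complex filters.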

score 2 · Accepted Answer

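PostgreSQL cannot use your index to produce rows in the required order for the first two queries. When two or more rows have the same stroke_when, the index scan returns them in arbitrary order, so putting them in the correct order would require a secondary sort. Because the executor in this PostgreSQL version has no facility for a secondary sort on top of an already ordered scan, it falls back to a full sort.

If you frequently need to query the table in that order, replace the current index with a composite index containing both columns.

Alternatively, you can rewrite the current query so that the secondary sort is applied explicitly, and only to the rows that have the maximum stroke_when:

 SELECT ST_AsText(geo_point) FROM lightnings
 WHERE stroke_when = (SELECT max(stroke_when) FROM lightnings)
 ORDER BY stroke_when_second DESC LIMIT 1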
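A minimal sketch of that replacement follows. The index name lightnings_when_second_idx is hypothetical, and the sketch assumes that lightnings_idx (the name shown in the question's third plan) covers only stroke_when. Since both sort columns are DESC, a backward scan of an ordinary ascending two-column b-tree index should be able to return rows in the requested order.

```sql
-- Hypothetical index name; the column order must match the ORDER BY clause.
CREATE INDEX lightnings_when_second_idx
    ON lightnings (stroke_when, stroke_when_second);

-- Re-check the plan: the hope is an "Index Scan Backward" on the new index
-- instead of "Seq Scan" + "Sort".
EXPLAIN (ANALYZE, BUFFERS)
SELECT ST_AsText(geo_point)
FROM lightnings
ORDER BY stroke_when DESC, stroke_when_second DESC
LIMIT 1;

-- Only after the new index is confirmed in the plan: the single-column index
-- becomes redundant (assumption: lightnings_idx is on stroke_when alone).
DROP INDEX lightnings_idx;
```

Queries that sort or filter on stroke_when alone can still use the composite index, because stroke_when is its leading column.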
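
To check that the rewrite avoids the sequential scan, the plan can be inspected the same way as in the question. This is only a sketch: with the existing index on stroke_when one would expect max() to be answered from the index and the outer filter to become an index lookup, but the actual plan depends on the data and statistics.

```sql
-- Same rewritten query, wrapped in EXPLAIN to inspect the chosen plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ST_AsText(geo_point)
FROM lightnings
WHERE stroke_when = (SELECT max(stroke_when) FROM lightnings)
ORDER BY stroke_when_second DESC
LIMIT 1;
```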

score 1 · Accepted Answer

A first step could be: create a composite index on {stroke_when, stroke_when_second}

answered 2012-09-17T13:27:43.257
