sql - Postgres query takes longer than expected, even with indexes on the fields

Question

I am optimizing a Postgres table that stores information from log files.

Here is the query:

SELECT c_ip as ip
     , x_ctx as file_name
     , date_time
     , live
     , c_user_agent as user_agent 
FROM events 
WHERE x_event = 'play' 
  AND date = '2012-12-01' 
  AND username = 'testing'

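There are b-tree indexes on x_event, date, and username. The table has about 25 million rows. The query currently takes about 20-25 seconds (correction: more like 40) and returns 143,000 rows.

Is that time expected? With the indexes in place, I would have thought it would be faster. Perhaps it is because of the sheer amount of data it has to go through?

Edit: here is the EXPLAIN ANALYZE output: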

Bitmap Heap Scan on events  (cost=251347.32..373829.74 rows=35190 width=56) (actual time=5768.409..6124.313 rows=143061 loops=1)
  Recheck Cond: ((date = '2012-12-01'::date) AND (username = 'testing'::text) AND (x_event = 'play'::text))
  ->  BitmapAnd  (cost=251347.32..251347.32 rows=35190 width=0) (actual time=5762.083..5762.083 rows=0 loops=1)
        ->  Bitmap Index Scan on index_events_fresh_date  (cost=0.00..10247.04 rows=554137 width=0) (actual time=57.568..57.568 rows=572221 loops=1)
              Index Cond: (date = '2012-12-01'::date)
        ->  Bitmap Index Scan on index_events_fresh_username  (cost=0.00..116960.55 rows=6328206 width=0) (actual time=3184.053..3184.053 rows=6245831 loops=1)
              Index Cond: (username = 'testing'::text)
        ->  Bitmap Index Scan on index_events_fresh_x_event  (cost=0.00..124112.84 rows=6328206 width=0) (actual time=2478.919..2478.919 rows=6245841 loops=1)
              Index Cond: (x_event = 'play'::text)
Total runtime: 6148.313 ms

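I have a few questions about this:

Am I reading it correctly that there are 554137 rows in the date index? There should be fewer than 50 dates.
How can I tell whether it is using the three indexes listed?
The total runtime listed is about 6 seconds, but when I run the query without EXPLAIN ANALYZE it takes about 40 seconds.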

score 1 · Accepted Answer

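If 5.7 seconds is not good enough, you could try a multi-column index:

create index index_name on events(username, date, x_event)

I put username first because I am guessing it is the column with the highest cardinality.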
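
A minimal sketch of trying this out, assuming the column is named username as in the question's query. The index name index_events_username_date_x_event is a placeholder, and CONCURRENTLY is an optional choice that avoids blocking writes while the index builds:

```sql
-- Hypothetical index name; column names are taken from the query in the question.
-- CONCURRENTLY cannot be used inside a transaction block.
CREATE INDEX CONCURRENTLY index_events_username_date_x_event
    ON events (username, date, x_event);

-- Refresh the planner statistics for the table.
ANALYZE events;

-- Re-run the original query under EXPLAIN ANALYZE to compare plans and timings.
EXPLAIN ANALYZE
SELECT c_ip AS ip
     , x_ctx AS file_name
     , date_time
     , live
     , c_user_agent AS user_agent
FROM events
WHERE x_event = 'play'
  AND date = '2012-12-01'
  AND username = 'testing';
```

If the planner chooses the new index, the plan would be expected to show a single scan on it instead of a BitmapAnd over three separate bitmap index scans.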

score 1 · Accepted Answer

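First, as Scott Marlowe said, the query itself takes only about 6 seconds to run; the rest is transfer time. It appears slower without EXPLAIN ANALYZE because the result set is far larger than the ten or so lines of EXPLAIN ANALYZE output, so it takes longer to transfer. If you turn on query logging and run this query, you will probably find in the log that the query without EXPLAIN ANALYZE actually runs faster (EXPLAIN ANALYZE adds overhead). By the way, if you are using pgAdmin, it is slow in its own right.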
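
One way to separate execution time from transfer time is sketched below. It assumes you have superuser rights (needed to change log_min_duration_statement) and can read the server log:

```sql
-- Log every statement in this session together with its server-side duration.
-- Changing this setting requires superuser privileges.
SET log_min_duration_statement = 0;

-- Run the original query without EXPLAIN ANALYZE.
SELECT c_ip AS ip
     , x_ctx AS file_name
     , date_time
     , live
     , c_user_agent AS user_agent
FROM events
WHERE x_event = 'play'
  AND date = '2012-12-01'
  AND username = 'testing';

-- Go back to the configured default.
RESET log_min_duration_statement;
```

In psql, \timing reports the elapsed time as seen by the client, which includes transfer, so comparing it with the duration in the server log gives a rough idea of how much of the 40 seconds is spent moving the 143,000 rows.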

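As for the number of rows in the date index, Postgres is right. Even if you have only 50 distinct values, every row is in the index. Of course, the b-tree part itself contains only 50 distinct values, but under each leaf value it holds a list of all the rows with that value. There is the special case of an index with a WHERE clause (a partial index), which contains only the rows that match that clause, but I don't suppose you are using one?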
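
To see why the scan on the date index reports hundreds of thousands of rows, you can count the rows per date. The sketch below also shows what a partial index looks like and how to check whether one already exists; the name index_events_play_date is made up for illustration:

```sql
-- Rows per distinct date: few dates, but many rows for each one,
-- and every one of those rows has an entry in the index on date.
SELECT date, count(*) AS row_count
FROM events
GROUP BY date
ORDER BY date;

-- A partial index: only rows with x_event = 'play' get index entries.
-- Hypothetical name, shown only to illustrate the special case.
CREATE INDEX index_events_play_date
    ON events (date)
    WHERE x_event = 'play';

-- List the index definitions on the table; a partial index
-- shows its WHERE clause in indexdef.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'events';
```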

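It is using all of the indexes listed in the EXPLAIN ANALYZE output. In this case it turns each index scan into a bitmap in which a bit is set for every row that matches that scan's condition. The three bitmaps can then be combined very quickly into a single bitmap holding the rows that satisfy all the conditions together.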
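
The row counts on each Bitmap Index Scan node can be cross-checked with a plain aggregate. This is only a sketch: it reads the whole table, so it will be slow on 25 million rows, and the column aliases are made up:

```sql
-- Count the rows matched by each condition alone and by all three together.
-- Rows where a column is NULL fall into the ELSE branch and count as 0.
SELECT sum(CASE WHEN date = '2012-12-01' THEN 1 ELSE 0 END) AS date_matches
     , sum(CASE WHEN username = 'testing' THEN 1 ELSE 0 END) AS username_matches
     , sum(CASE WHEN x_event = 'play' THEN 1 ELSE 0 END) AS x_event_matches
     , sum(CASE WHEN date = '2012-12-01'
                 AND username = 'testing'
                 AND x_event = 'play' THEN 1 ELSE 0 END) AS all_three
FROM events;
```

If the data has not changed since the plan was captured, the first three numbers should be close to the actual row counts on the three Bitmap Index Scan nodes (572221, 6245831, and 6245841), and the last one to the 143061 rows returned by the Bitmap Heap Scan.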
