What makes bad row estimates such a pain point for SQL query performance? I would love to know the internal reasons.
Often the bad row estimate will actually pick the correct plan, and the only difference between the good query and the bad query is the estimated row counts.
Why is there so often such a huge performance difference?
Is it because Postgres uses the row estimates to allocate memory?
The PostgreSQL optimizer is a cost-based optimizer (CBO): a query is executed with the plan that has the lowest estimated cost, and that cost is calculated from the table's statistics.
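Those statistics live in the ordinary system catalogs, so you can inspect exactly what the planner sees. A minimal sketch (the catalogs are standard; 'your_table' is just a placeholder):

-- Per-table statistics: the row and page counts the planner starts from.
SELECT relname, reltuples::bigint AS estimated_rows, relpages
FROM pg_class
WHERE relname = 'your_table';

-- Per-column statistics used for selectivity estimates.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'your_table';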
Why are bad row estimates slow in Postgres?
Because wrong statistics can make the planner choose the wrong execution plan. Here is an example.
There are two tables: T1 with 20,000,000 rows and T2 with 1,000,000 rows.
CREATE TABLE T1 (
ID INT NOT NULL PRIMARY KEY,
val INT NOT NULL,
col1 UUID NOT NULL,
col2 UUID NOT NULL,
col3 UUID NOT NULL,
col4 UUID NOT NULL,
col5 UUID NOT NULL,
col6 UUID NOT NULL
);
INSERT INTO T1
SELECT i,
RANDOM() * 1000000,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid
FROM generate_series(1,20000000) i;
CREATE TABLE T2 (
ID INT NOT NULL PRIMARY KEY,
val INT NOT NULL,
col1 UUID NOT NULL,
col2 UUID NOT NULL,
col3 UUID NOT NULL,
col4 UUID NOT NULL,
col5 UUID NOT NULL,
col6 UUID NOT NULL
);
INSERT INTO T2
SELECT i,
RANDOM() * 1000000,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid,
md5(random()::text || clock_timestamp()::text)::uuid
FROM generate_series(1,1000000) i;
When we join the two tables, we get an execution plan that uses a Merge Join:
EXPLAIN (ANALYZE,TIMING ON,BUFFERS ON)
SELECT t1.*
FROM T1
INNER JOIN T2 ON t1.id = t2.id
WHERE t1.id < 1000000
"Gather (cost=1016.37..30569.85 rows=53968 width=104) (actual time=0.278..837.297 rows=999999 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" Buffers: shared hit=38273 read=21841"
" -> Merge Join (cost=16.37..24173.05 rows=22487 width=104) (actual time=11.993..662.770 rows=333333 loops=3)"
" Merge Cond: (t2.id = t1.id)"
" Buffers: shared hit=38273 read=21841"
" -> Parallel Index Only Scan using t2_pkey on t2 (cost=0.42..20147.09 rows=416667 width=4) (actual time=0.041..69.947 rows=333333 loops=3)"
" Heap Fetches: 0"
" Buffers: shared hit=6 read=2732"
" -> Index Scan using t1_pkey on t1 (cost=0.44..48427.24 rows=1079360 width=104) (actual time=0.041..329.874 rows=999819 loops=3)"
" Index Cond: (id < 1000000)"
" Buffers: shared hit=38267 read=19109"
"Planning:"
" Buffers: shared hit=4 read=8"
"Planning Time: 0.228 ms"
"Execution Time: 906.760 ms"
But when we update a lot of rows as below, adding 100000000 to every id that is smaller than 1000000:
UPDATE T1
SET id = id + 100000000
WHERE id < 1000000;
and run the same query again, it will still use a Merge Join, even though there should now be a better option than a Merge Join.
This happens when we have not yet reached the autoanalyze threshold: by default autovacuum_analyze_scale_factor is 0.1, which means roughly 10% of a table's rows must be inserted, updated, or deleted before PostgreSQL updates the statistics automatically.
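Both knobs are ordinary settings you can inspect; a small sketch against the standard pg_settings view:

-- Autoanalyze fires when the number of rows changed since the last
-- ANALYZE exceeds:
--   autovacuum_analyze_threshold
--   + autovacuum_analyze_scale_factor * reltuples
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_analyze_threshold',
               'autovacuum_analyze_scale_factor');

Here only 1,000,000 of the 20,000,000 rows changed (5%), which is below the 10% default, so the planner still sees the old statistics when we run the query again: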
EXPLAIN (ANALYZE,TIMING ON,BUFFERS ON)
SELECT t1.*
FROM T1
INNER JOIN T2 ON t1.id = t2.id
WHERE t1.id < 1000000
"Gather (cost=1016.37..30707.83 rows=53968 width=104) (actual time=51.403..55.517 rows=0 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" Buffers: shared hit=8215"
" -> Merge Join (cost=16.37..24311.03 rows=22487 width=104) (actual time=6.736..6.738 rows=0 loops=3)"
" Merge Cond: (t2.id = t1.id)"
" Buffers: shared hit=8215"
" -> Parallel Index Only Scan using t2_pkey on t2 (cost=0.42..20147.09 rows=416667 width=4) (actual time=0.024..0.024 rows=1 loops=3)"
" Heap Fetches: 0"
" Buffers: shared hit=8"
" -> Index Scan using t1_pkey on t1 (cost=0.44..50848.71 rows=1133330 width=104) (actual time=6.710..6.710 rows=0 loops=3)"
" Index Cond: (id < 1000000)"
" Buffers: shared hit=8207"
"Planning:"
" Buffers: shared hit=2745"
"Planning Time: 3.938 ms"
"Execution Time: 55.550 ms"
When we manually run ANALYZE T1; to refresh the statistics of T1 and issue the query again, we get a Nested Loop, which is cheaper than the Merge Join:
"QUERY PLAN"
"Nested Loop (cost=0.86..8.90 rows=1 width=104) (actual time=0.004..0.004 rows=0 loops=1)"
" Buffers: shared hit=3"
" -> Index Scan using t1_pkey on t1 (cost=0.44..4.46 rows=1 width=104) (actual time=0.003..0.003 rows=0 loops=1)"
" Index Cond: (id < 1000000)"
" Buffers: shared hit=3"
" -> Index Only Scan using t2_pkey on t2 (cost=0.42..4.44 rows=1 width=4) (never executed)"
" Index Cond: (id = t1.id)"
" Heap Fetches: 0"
"Planning:"
" Buffers: shared hit=20"
"Planning Time: 0.232 ms"
"Execution Time: 0.027 ms"
A short conclusion:
Accurate statistics on a table help the optimizer compute accurate costs and therefore choose the correct execution plan.
Here is a script that helps us find the last_analyze and last_vacuum times of a table:
SELECT schemaname, relname,
       last_vacuum, last_autovacuum,
       vacuum_count, autovacuum_count,
       last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'tablename';
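If the statistics keep going stale before autovacuum catches up, you can refresh them manually or make autoanalyze fire earlier for that table. A hedged sketch (the syntax is standard; the 0.01 value is just illustrative):

-- Refresh the statistics of one table immediately.
ANALYZE t1;

-- Or let autoanalyze react sooner for this table: fire after about 1%
-- of its rows change instead of the 10% default.
ALTER TABLE t1 SET (autovacuum_analyze_scale_factor = 0.01);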
Row count estimates influence all further decisions of the optimizer, so bad estimates lead to bad plans.
In my experience, the problem usually occurs while deciding on the correct join strategy:
When the row count is underestimated, PostgreSQL may choose a nested loop join instead of a hash or merge join, and then end up scanning the inner table far more often than it expected, which degrades performance.
Conversely, if PostgreSQL overestimates the row count, it may choose a hash or merge join and scan both tables completely, which can be much slower than a few index scans on the inner table.
The row count estimates feed into the cost calculated for each candidate plan. When the estimates are off, the final costs are off as well, and the planner ends up with the wrong plan: for example, it sequentially scans a table because it believes it needs a significant portion of it, when in reality only a few rows are needed and an index could retrieve them much faster.
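To see for yourself how much the join choice matters, you can temporarily disable one join method in a session and compare the plans; a sketch using the standard planner settings (for experimenting, not for production):

-- Turn the merge join path off for this session only, then compare
-- the resulting plan and timings against the default plan.
SET enable_mergejoin = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT t1.*
FROM t1
INNER JOIN t2 ON t1.id = t2.id
WHERE t1.id < 1000000;

RESET enable_mergejoin;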