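I'm new to Postgres and have a question about using a partial index on frequently updated columns.
I have a huge table, job, with the following columns. It has nearly 50 million rows.
Table and index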
CREATE TABLE job
(
    id uuid,
    assigned_at timestamp with time zone,
    completed_at timestamp with time zone
);
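The assigned_at column is updated when someone takes the job, and the completed_at column is updated when the job is completed, so the table is updated frequently.
I tried to create a partial index as follows: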
CREATE INDEX idx ON job (c_id) WHERE ((assigned_at IS NOT NULL) AND (completed_at IS NULL));
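Update query
Now I want to clear the assignment of jobs that were assigned more than 10 days ago. This is my query, which takes a long time to execute: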
update job set assigned_at = null where completed_at is null and (now() - assigned_at) > INTERVAL '10 days'
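The index works well in the test environment, but it is not used in the production environment. Could the frequent writes in production be preventing the partial index from being used? And how can I speed up the update query?
Any ideas would be greatly appreciated. Thanks.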
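A minimal sketch of one rewrite that may be worth testing, not a confirmed fix. It repeats the partial index's WHERE clause verbatim, so the planner does not have to infer it, and compares assigned_at directly against a computed cutoff. It assumes the target table is job. The cutoff form should select the same rows as (now() - assigned_at) > INTERVAL '10 days', except possibly around daylight-saving transitions. Rolling the transaction back allows the plan to be inspected without keeping the changes.

```sql
BEGIN;

-- EXPLAIN ANALYZE really executes the UPDATE (and takes row locks),
-- so it is wrapped in a transaction that is rolled back afterwards.
EXPLAIN (ANALYZE, BUFFERS)
UPDATE job
SET assigned_at = NULL
WHERE assigned_at IS NOT NULL                   -- same predicate as the partial index idx
  AND completed_at IS NULL                      -- same predicate as the partial index idx
  AND assigned_at < now() - INTERVAL '10 days'; -- cutoff computed once, column left bare

ROLLBACK;
```

If the plan still shows a Seq Scan with this form, the query text is probably not the cause and the planner's row estimates are the next thing to look at.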
EXPLAIN ANALYZE:
- In the test environment:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Update on job  (cost=1592.47..1790362.54 rows=6083164 width=156) (actual time=31.447..31.448 rows=0 loops=1)
   ->  Bitmap Heap Scan on job  (cost=1592.47..1790362.54 rows=6083164 width=156) (actual time=1.180..6.174 rows=494 loops=1)
         Recheck Cond: ((assigned_at IS NOT NULL) AND (completed_at IS NULL))
         Filter: (assigned_at < (now() - '10 days'::interval))
         Rows Removed by Filter: 2585
         Heap Blocks: exact=2698
         ->  Bitmap Index Scan on idx  (cost=0.00..71.67 rows=7446475 width=0) (actual time=0.839..0.839 rows=3079 loops=1)
 Planning Time: 0.238 ms
 Execution Time: 31.487 ms
(9 rows)
- In the production environment:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Update on job  (cost=0.00..1773961.63 rows=2275384 width=156) (actual time=56346.519..56346.521 rows=0 loops=1)
   ->  Seq Scan on job  (cost=0.00..1773961.63 rows=2275384 width=156) (actual time=0.212..55583.427 rows=693 loops=1)
         Filter: ((assigned_at IS NOT NULL) AND (completed_at IS NULL) AND ((now() - assigned_at) > '10 days'::interval))
         Rows Removed by Filter: 47839353
 Planning Time: 0.640 ms
 Execution Time: 56346.582 ms
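
In both plans the planner expects millions of matching rows (rows=7446475 from idx in test, rows=2275384 in production), while the actual counts are 3079 and 693. Estimates that far off may point to stale statistics on job, which could make a sequential scan look cheaper than it is. A hedged sketch of how this could be checked, using only the standard pg_stat_user_tables and pg_stats views; whether it changes the production plan is an assumption to verify, not a given:

```sql
-- When were statistics last gathered, and how many rows changed since then?
SELECT relname, last_analyze, last_autoanalyze,
       n_live_tup, n_dead_tup, n_mod_since_analyze
FROM pg_stat_user_tables
WHERE relname = 'job';

-- What fraction of each column does the planner currently believe is NULL?
SELECT attname, null_frac
FROM pg_stats
WHERE tablename = 'job'
  AND attname IN ('assigned_at', 'completed_at');

-- Refresh the statistics, then re-run EXPLAIN on the update to compare estimates.
ANALYZE job;
```

If null_frac for completed_at is far from the real share of uncompleted jobs, the estimate for the partial index predicate will be off by a similar factor.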