我有一个带有外键的表和最近更新行的时间戳。具有相同外键值的行在大致相同的时间更新,正负一个小时。我在(foreign_key,timestamp)上有一个索引。这是在 postgresql 11 上。
当我进行如下查询时:
select * from table where foreign_key = $1 and timestamp > $2 order by primary_key;
如果时间戳查询在整个表中是选择性的,它将使用我的索引。但是如果时间戳在过去足够远以至于大多数行都匹配,它将扫描 primary_key 索引,假设它会更快。如果我删除订单,这个问题就会消失。
我查看了 Postgresql 的CREATE STATISTICS
,但在相关性超过一系列值(如时间戳加或减五分钟)而不是特定值的情况下,它似乎没有帮助。
解决此问题的最佳方法是什么?我可以删除订单,但这会使业务逻辑复杂化。我可以根据外键 id 对表进行分区,但这也是一个非常昂贵的更改。
规格:
Table "public.property_home_attributes"
Column | Type | Collation | Nullable | Default
----------------------+-----------------------------+-----------+----------+------------------------------------------------------
id | integer | | not null | nextval('property_home_attributes_id_seq'::regclass)
mls_id | integer | | not null |
property_id | integer | | not null |
formatted_attributes | jsonb | | not null |
created_at | timestamp without time zone | | |
updated_at | timestamp without time zone | | |
Indexes:
"property_home_attributes_pkey" PRIMARY KEY, btree (id)
"index_property_home_attributes_on_property_id" UNIQUE, btree (property_id)
"index_property_home_attributes_on_updated_at" btree (updated_at)
"property_home_attributes_mls_id_updated_at_idx" btree (mls_id, updated_at)
该表有大约 1600 万行。
psql=# EXPLAIN ANALYZE SELECT * FROM property_home_attributes WHERE mls_id = 46 AND (property_home_attributes.updated_at < '2019-10-30 16:52:06.326774') ORDER BY id ASC LIMIT 1000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..10147.83 rows=1000 width=880) (actual time=1519.718..22310.674 rows=1000 loops=1)
-> Index Scan using property_home_attributes_pkey on property_home_attributes (cost=0.56..6094202.57 rows=600576 width=880) (actual time=1519.716..22310.398 rows=1000 loops=1)
Filter: ((updated_at < '2019-10-30 16:52:06.326774'::timestamp without time zone) AND (mls_id = 46))
Rows Removed by Filter: 358834
Planning Time: 0.110 ms
Execution Time: 22310.842 ms
(6 rows)
然后没有订单:
psql=# EXPLAIN ANALYZE SELECT * FROM property_home_attributes WHERE mls_id = 46 AND (property_home_attributes.updated_at < '2019-10-30 16:52:06.326774') LIMIT 1000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..1049.38 rows=1000 width=880) (actual time=0.053..162.081 rows=1000 loops=1)
-> Index Scan using foo on property_home_attributes (cost=0.56..629893.60 rows=600576 width=880) (actual time=0.053..161.992 rows=1000 loops=1)
Index Cond: ((mls_id = 46) AND (updated_at < '2019-10-30 16:52:06.326774'::timestamp without time zone))
Planning Time: 0.100 ms
Execution Time: 162.140 ms
(5 rows)