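I have a table with a foreign key and a timestamp recording when the row was most recently updated. Rows with the same foreign key value are updated at roughly the same time, plus or minus an hour. I have an index on (foreign_key, timestamp). This is on PostgreSQL 11.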

When I run a query like:

select * from table where foreign_key = $1 and timestamp > $2 order by primary_key;

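If the timestamp condition is selective across the entire table, the planner uses my index. But if the timestamp is far enough in the past that most rows match, it scans the primary_key index instead, assuming that will be faster. The problem goes away if I remove the ORDER BY.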

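I looked at PostgreSQL's CREATE STATISTICS, but it does not seem to help when the correlation is over a range of values (such as a timestamp plus or minus five minutes) rather than a specific value.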
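For reference, extended statistics on the two columns would be declared roughly as in the sketch below. It uses the table and column names from the specs further down; the statistics name is made up for illustration. As far as the PostgreSQL documentation describes it, functional-dependency statistics are only applied to simple equality conditions against constants, not to range clauses, which matches the observation that they do not help with the timestamp range here.

```sql
-- Sketch only: the statistics name is hypothetical.
-- Declares functional-dependency statistics between the two columns.
CREATE STATISTICS property_home_attributes_mls_updated_stats (dependencies)
    ON mls_id, updated_at
    FROM property_home_attributes;

-- Extended statistics are only populated by a subsequent ANALYZE.
ANALYZE property_home_attributes;
```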

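What is the best way to work around this? I can remove the ORDER BY, but that complicates the business logic. I could partition the table on the foreign key id, but that is also a very expensive change.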

Specs:

                                            Table "public.property_home_attributes"
        Column        |            Type             | Collation | Nullable |                       Default
----------------------+-----------------------------+-----------+----------+------------------------------------------------------
 id                   | integer                     |           | not null | nextval('property_home_attributes_id_seq'::regclass)
 mls_id               | integer                     |           | not null |
 property_id          | integer                     |           | not null |
 formatted_attributes | jsonb                       |           | not null |
 created_at           | timestamp without time zone |           |          |
 updated_at           | timestamp without time zone |           |          |
Indexes:
    "property_home_attributes_pkey" PRIMARY KEY, btree (id)
    "index_property_home_attributes_on_property_id" UNIQUE, btree (property_id)
    "index_property_home_attributes_on_updated_at" btree (updated_at)
    "property_home_attributes_mls_id_updated_at_idx" btree (mls_id, updated_at)

The table has about 16 million rows.

psql=# EXPLAIN ANALYZE SELECT * FROM property_home_attributes WHERE mls_id = 46 AND (property_home_attributes.updated_at < '2019-10-30 16:52:06.326774') ORDER BY id ASC LIMIT 1000;
                                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..10147.83 rows=1000 width=880) (actual time=1519.718..22310.674 rows=1000 loops=1)
   ->  Index Scan using property_home_attributes_pkey on property_home_attributes  (cost=0.56..6094202.57 rows=600576 width=880) (actual time=1519.716..22310.398 rows=1000 loops=1)
         Filter: ((updated_at < '2019-10-30 16:52:06.326774'::timestamp without time zone) AND (mls_id = 46))
         Rows Removed by Filter: 358834
 Planning Time: 0.110 ms
 Execution Time: 22310.842 ms
(6 rows)

And then without the ORDER BY:

psql=# EXPLAIN ANALYZE SELECT * FROM property_home_attributes WHERE mls_id = 46 AND (property_home_attributes.updated_at < '2019-10-30 16:52:06.326774')  LIMIT 1000;
                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..1049.38 rows=1000 width=880) (actual time=0.053..162.081 rows=1000 loops=1)
   ->  Index Scan using foo on property_home_attributes  (cost=0.56..629893.60 rows=600576 width=880) (actual time=0.053..161.992 rows=1000 loops=1)
         Index Cond: ((mls_id = 46) AND (updated_at < '2019-10-30 16:52:06.326774'::timestamp without time zone))
 Planning Time: 0.100 ms
 Execution Time: 162.140 ms
(5 rows)

1 Answer

If you want to keep PostgreSQL from using an index scan on property_home_attributes_pkey to support the ORDER BY, you can simply use

ORDER BY primary_key + 0
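
Applied to the query from the question, where the primary key column is id, that would look roughly like the sketch below. Because the sort expression id + 0 no longer matches the btree index on (id), the planner cannot use that index to produce the ordering and has to sort the matching rows itself. Which index it picks for the WHERE clause instead is still the planner's decision, so it is worth confirming with EXPLAIN ANALYZE.

```sql
-- Sketch only: the question's query with the "+ 0" trick applied.
SELECT *
FROM property_home_attributes
WHERE mls_id = 46
  AND updated_at < '2019-10-30 16:52:06.326774'
ORDER BY id + 0   -- expression, so the index on (id) cannot supply the order
LIMIT 1000;
```

The rows come back in the same order as with ORDER BY id, since adding 0 does not change the values being compared.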
answered 2019-12-06T08:07:31.903