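I created a table in PostgreSQL with a composite primary key (3 columns). When a query filters on a subset of those columns that does not include the leading column, the default primary key index is not used. That is not the case when we create an index explicitly: that index is used for any subset.
By default, Postgres creates an index on the primary key. But as the Postgres documentation says: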
A multicolumn B-tree index can be used with query conditions that involve any subset of
the index's columns, but the index is most efficient when there are constraints on the leading (leftmost) columns.
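So if a query does not include the leading column, the index should still be used, and it is when we create the index explicitly. But when we query a subset of the columns covered by the default primary key index, the index is not used.
Below are the schema and the queries for which the index is not used with a subset.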
# \d client_data
         Table "public.client_data"
  Column  |         Type          | Modifiers
----------+-----------------------+-----------
 macaddr  | character varying(64) | not null
 ts       | bigint                | not null
 interval | smallint              | not null
 snr      | smallint              | not null
 rx_rate  | bigint                |
 tx_rate  | bigint                |
 rx_data  | bigint                |
 tx_data  | bigint                |
Indexes:
    "client_data_pkey" PRIMARY KEY, btree (macaddr, ts, interval)
If we specify all of the primary key columns, the query planner uses the index:
# explain analyze select count(*) from client_data where macaddr='a:b:c' and ts=346783556 and interval=5;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=8.60..8.61 rows=1 width=0) (actual time=0.040..0.041 rows=1 loops=1)
-> Index Scan using client_data_pkey on client_data (cost=0.00..8.59 rows=1 width=0) (actual time=0.037..0.037 rows=0 loops=1)
Index Cond: (((macaddr)::text = 'a:b:c'::text) AND (ts = 346783556) AND ("interval" = 5))
Total runtime: 0.096 ms
(4 rows)
But if we specify a subset that leaves out the leading column, the query planner does not use the index:
# explain analyze select count(*) from client_data where ts=346783556;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Aggregate (cost=16176.01..16176.02 rows=1 width=0) (actual time=78.937..78.938 rows=1 loops=1)
-> Seq Scan on client_data (cost=0.00..16175.92 rows=36 width=0) (actual time=78.932..78.932 rows=0 loops=1)
Filter: (ts = 346783556)
Total runtime: 78.975 ms
(4 rows)
# explain analyze select count(*) from client_data where ts=346783556 and interval=5;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Aggregate (cost=17639.11..17639.12 rows=1 width=0) (actual time=78.815..78.815 rows=1 loops=1)
-> Seq Scan on client_data (cost=0.00..17639.11 rows=1 width=0) (actual time=78.810..78.810 rows=0 loops=1)
Filter: ((ts = 346783556) AND ("interval" = 5))
Total runtime: 78.853 ms
(4 rows)
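One way to check whether the planner cannot use client_data_pkey for these two queries, or simply estimates the sequential scan to be cheaper, is to discourage sequential scans for the session and look at the plan again. This is only a sketch that I have not run here; it assumes the session is allowed to change planner settings, and `enable_seqscan = off` penalizes sequential scans rather than forbidding them.

```sql
-- Refresh planner statistics so the row estimates reflect the current table contents.
ANALYZE client_data;

-- Diagnostic only: make sequential scans look very expensive for this session.
SET enable_seqscan = off;

-- Same query as above; the plan shows whether an index scan is possible at all.
EXPLAIN ANALYZE SELECT count(*) FROM client_data WHERE ts = 346783556;

-- Restore the default planner behaviour afterwards.
RESET enable_seqscan;
```

If the plan switches to client_data_pkey with this setting, the index is usable and the planner was making a cost-based choice.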
However, if we use the leading column (macaddr) together with ts or interval, the index is used:
# explain analyze select count(*) from client_data where macaddr='a' and ts=346783556;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=8.59..8.60 rows=1 width=0) (actual time=0.055..0.056 rows=1 loops=1)
-> Index Scan using client_data_pkey on client_data (cost=0.00..8.59 rows=1 width=0) (actual time=0.051..0.051 rows=0 loops=1)
Index Cond: (((macaddr)::text = 'a'::text) AND (ts = 346783556))
Total runtime: 0.103 ms
(4 rows)
# explain analyze select count(*) from client_data where macaddr='a' and interval=56;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=56.15..56.16 rows=1 width=0) (actual time=0.021..0.022 rows=1 loops=1)
-> Index Scan using client_data_pkey on client_data (cost=0.00..56.15 rows=1 width=0) (actual time=0.017..0.017 rows=0 loops=1)
Index Cond: (((macaddr)::text = 'a'::text) AND ("interval" = 56))
Total runtime: 0.055 ms
(4 rows)
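For comparison, the explicitly created index mentioned at the top could be set up roughly as follows. This is a sketch under assumptions: the index name client_data_explicit_idx is made up for illustration, and the column order is assumed to match the primary key.

```sql
-- Hypothetical index name; the column order is assumed to be the same as client_data_pkey.
CREATE INDEX client_data_explicit_idx
    ON client_data (macaddr, ts, "interval");

-- Update statistics so the planner can cost the new index.
ANALYZE client_data;

-- Re-run the subset queries that fell back to a sequential scan above.
EXPLAIN ANALYZE SELECT count(*) FROM client_data WHERE ts = 346783556;
EXPLAIN ANALYZE SELECT count(*) FROM client_data WHERE ts = 346783556 AND "interval" = 5;
```

Comparing these plans with the ones above shows whether the explicit index really is treated differently from the primary key index on the same data.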