
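I want to query for a value (or a list of values) or NULL, but without using OR. The reason for avoiding OR is that I need the index on that column to be used so the query runs fast.

A simple example to illustrate my problem: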

CREATE TABLE fruits
(
  name text,
  quantity integer
);

(The real table has many additional integer columns.)

The query I am not happy with is

SELECT * FROM fruits WHERE quantity IN (1,2,3,4) OR quantity IS NULL;

The query I am hoping for would look something like

SELECT * FROM fruits WHERE quantity MAGIC (1,2,3,4,NULL);

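I am using PostgreSQL 9.1.

As far as I can tell from the documentation (e.g. http://www.postgresql.org/docs/9.1/static/functions-comparisons.html) and from testing, there is no way to do this. But I am hoping one of you has some magic insight.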
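
For reference, simply putting NULL inside the IN list does not help, because IN expands to a chain of equality comparisons and a comparison with NULL yields NULL rather than true. A minimal sketch of that behavior (the column aliases are only illustrative):

```sql
-- x IN (1,2,3,4,NULL) behaves like x = 1 OR x = 2 OR x = 3 OR x = 4 OR x = NULL.
-- A comparison with NULL is never true, so a NULL quantity cannot match the list.
SELECT 3 IN (1,2,3,4,NULL)             AS three_matches,  -- true
       NULL::integer IN (1,2,3,4,NULL) AS null_matches;   -- NULL (unknown), so the row is filtered out
```

A WHERE clause only keeps rows for which the condition is true, which is why the explicit IS NULL test is needed.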

3 Answers

An ugly hack using COALESCE:

SELECT * 
FROM fruits
 WHERE COALESCE(quantity,1) IN (1,2,3,4)
   ;

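Please check the generated plan. IIRC, the optimizer knows about COALESCE() in this case.

Update: an alternative is the EXISTS (NOT EXISTS (NOT IN)) trick, which produces a different plan here. It assumes the table has a unique id column, which the sample table in the question does not define.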

-- EXPLAIN ANALYZE
SELECT *
FROM fruits fr
WHERE EXISTS (
        SELECT * FROM fruits ex
        WHERE ex.id = fr.id
        AND NOT EXISTS (
                SELECT * FROM fruits nx
                WHERE nx.id = ex.id
                AND nx.quantity NOT IN (1,2,3,4)
                )
        )
   ;

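By the way: in my tests (up to 1 million rows, of which only 4 plus a few qualify), the first query (which does not use the index) was always faster than the second (which uses the index and a hash anti join). YMMV.

Update 2: the original IS NULL OR IN() query is the clear winner here: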

-- EXPLAIN ANALYZE
SELECT *
FROM fruits
 WHERE quantity IS NULL
    OR quantity IN (1,2,3,4)
   ;
answered 2013-05-09T12:39:18.920

A test table with 10k rows:

create table fruits (name text, quantity integer);
insert into fruits (name, quantity)
select left(md5(i::text), 6), i
from generate_series(1, 10000) s(i);
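
Note that this data set contains no NULL quantities, so the IS NULL branch of the plans below matches zero rows. To exercise that branch, one could add some NULL rows before creating the index. The following is only a sketch: the row count of 100 is arbitrary, and the plans quoted in this answer were produced without these rows.

```sql
-- Hypothetical extra rows (not part of the original test):
-- 100 fruits whose quantity is unknown.
INSERT INTO fruits (name, quantity)
SELECT left(md5(i::text), 6), NULL::integer
FROM generate_series(10001, 10100) s(i);
```

With these rows in place the row counts and timings would differ from the output shown here.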

With a plain index on quantity:

create index fruits_index on fruits(quantity);
analyze fruits;

The query with OR:

explain analyze
SELECT * FROM fruits WHERE quantity IN (1,2,3,4) OR quantity IS NULL;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on fruits  (cost=21.29..34.12 rows=4 width=11) (actual time=0.032..0.032 rows=4 loops=1)
   Recheck Cond: ((quantity = ANY ('{1,2,3,4}'::integer[])) OR (quantity IS NULL))
   ->  BitmapOr  (cost=21.29..21.29 rows=4 width=0) (actual time=0.025..0.025 rows=0 loops=1)
         ->  Bitmap Index Scan on fruits_index  (cost=0.00..17.03 rows=4 width=0) (actual time=0.019..0.019 rows=4 loops=1)
               Index Cond: (quantity = ANY ('{1,2,3,4}'::integer[]))
         ->  Bitmap Index Scan on fruits_index  (cost=0.00..4.26 rows=1 width=0) (actual time=0.004..0.004 rows=0 loops=1)
               Index Cond: (quantity IS NULL)
 Total runtime: 0.089 ms

Without OR:

explain analyze
SELECT * FROM fruits WHERE quantity IN (1,2,3,4);
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using fruits_index on fruits  (cost=0.00..21.07 rows=4 width=11) (actual time=0.026..0.038 rows=4 loops=1)
   Index Cond: (quantity = ANY ('{1,2,3,4}'::integer[]))
 Total runtime: 0.085 ms

The COALESCE version suggested by wildplasser results in a sequential scan:

explain analyze
SELECT * 
FROM fruits
WHERE COALESCE(quantity, -1) IN (-1,1,2,3,4);
                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Seq Scan on fruits  (cost=0.00..217.50 rows=250 width=11) (actual time=0.023..4.358 rows=4 loops=1)
   Filter: (COALESCE(quantity, (-1)) = ANY ('{-1,1,2,3,4}'::integer[]))
   Rows Removed by Filter: 9996
 Total runtime: 4.395 ms

Unless an expression index on the COALESCE expression is created:

create index fruits_coalesce_index on fruits(coalesce(quantity, -1));
analyze fruits;

explain analyze
SELECT * 
FROM fruits
WHERE COALESCE(quantity, -1) IN (-1,1,2,3,4);
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Index Scan using fruits_coalesce_index on fruits  (cost=0.00..25.34 rows=5 width=11) (actual time=0.112..0.124 rows=4 loops=1)
   Index Cond: (COALESCE(quantity, (-1)) = ANY ('{-1,1,2,3,4}'::integer[]))
 Total runtime: 0.172 ms

But it is still worse than the plain OR query using the simple index on quantity.

answered 2013-05-09T13:01:44.330

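This is not an answer to your exact question, but you could build a partial index for your query:

CREATE INDEX idx_partial ON fruits (quantity)
WHERE quantity IN (1,2,3,4) OR quantity IS NULL;

From the documentation: http://www.postgresql.org/docs/current/interactive/indexes-partial.html

Your query should then use this index and run faster.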
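
Whether the planner actually picks the partial index can be checked with EXPLAIN. The sketch below assumes the fruits table from the question. It recreates the index so that it can be run on its own, and it makes no claim about what the resulting plan will look like.

```sql
-- Recreate the partial index so this snippet is self-contained.
DROP INDEX IF EXISTS idx_partial;
CREATE INDEX idx_partial ON fruits (quantity)
WHERE quantity IN (1,2,3,4) OR quantity IS NULL;

-- Refresh statistics so the planner has current row estimates.
ANALYZE fruits;

-- The WHERE clause repeats the index predicate exactly.
EXPLAIN
SELECT * FROM fruits
WHERE quantity IN (1,2,3,4) OR quantity IS NULL;
```

The planner only considers a partial index when it can prove that the query's WHERE clause implies the index predicate, so a query with a different value list may not be able to use it.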

answered 2013-05-09T12:49:14.447