
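I have a fairly large SQLite3 database table with an indexed numeric field, on which I have to search for a list of value ranges. Because the values are huge 64-bit numbers, an IN clause is not an option. A typical query looks like this: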

SELECT * FROM sometable WHERE ID BETWEEN 10 AND 11 
                           OR ID BETWEEN 20 AND 21 
                           OR ID BETWEEN 30 AND 31;

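I have run into a strange performance limit. With up to 9 BETWEEN terms the query is very fast (the ID field is indexed). From 10 terms onward, however, the query becomes orders of magnitude slower! I have not found any explanation of this limit in the documentation.

I found that the EXPLAIN QUERY PLAN command can be used to see the change in behavior. In case it matters, I ran my experiments with SQLite 3.7.12.

To demonstrate, let's create a very simple, empty table: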

CREATE TABLE sometable(name TEXT, ID INTEGER);
CREATE INDEX id_idx ON sometable (ID ASC);

This query:

EXPLAIN QUERY PLAN SELECT * FROM sometable WHERE ID BETWEEN 10 AND 11 
 OR ID BETWEEN 20 AND 21 OR ID BETWEEN 30 AND 31 OR ID BETWEEN 40 AND 41
 OR ID BETWEEN 50 AND 51 OR ID BETWEEN 60 AND 61 OR ID BETWEEN 70 AND 71
 OR ID BETWEEN 80 AND 81 OR ID BETWEEN 90 AND 91;     

produces this result:

0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)

whereas this query:

EXPLAIN QUERY PLAN SELECT * FROM sometable WHERE ID BETWEEN 10 AND 11 
 OR ID BETWEEN 20 AND 21 OR ID BETWEEN 30 AND 31 OR ID BETWEEN 40 AND 41
 OR ID BETWEEN 50 AND 51 OR ID BETWEEN 60 AND 61 OR ID BETWEEN 70 AND 71
 OR ID BETWEEN 80 AND 81 OR ID BETWEEN 90 AND 91 OR ID BETWEEN 100 AND 101;

produces this result:

0|0|0|SCAN TABLE sometable (~500000 rows)

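SCAN TABLE means that the index is not used and the whole raw table is searched, which results in poor performance.

Is there any way (pragma, compile-time switch, trick) to avoid this limit?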
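
The comparison above can be reproduced with a short script. This is a minimal sketch using Python's standard sqlite3 module; the helper names between_query and query_plan are made up for illustration. The switch from SEARCH to SCAN was observed on SQLite 3.7.12, so other versions may print different plans, and no particular output is assumed here.

```python
import sqlite3

def between_query(n_terms):
    """Build the OR-of-BETWEEN query from the question with n_terms ranges."""
    terms = [f"ID BETWEEN {10 * i} AND {10 * i + 1}" for i in range(1, n_terms + 1)]
    return "SELECT * FROM sometable WHERE " + " OR ".join(terms)

def query_plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # detail is the last column

# Same empty table and index as in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable(name TEXT, ID INTEGER)")
conn.execute("CREATE INDEX id_idx ON sometable (ID ASC)")

# The plan depends on the library version, so print it alongside the result.
print("SQLite version:", sqlite3.sqlite_version)
for n in (9, 10):
    print(f"--- {n} BETWEEN terms ---")
    for line in query_plan(conn, between_query(n)):
        print(line)
```

Running it against the SQLite build you actually deploy shows whether the 9-versus-10 threshold applies to your setup.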


2 Answers


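As you can see, SQLite tries to split the query into multiple subqueries so that each range can be looked up individually in the index.

However, when there are too many ranges, the query optimizer assumes that the summed cost of all the individual subqueries is greater than the cost of going through the table once.

If your ranges contain fewer than 31250 rows, or if your table has more than 1000000 rows, you can try to improve the cost estimates with the ANALYZE command.

As a last resort, you can split the query by hand to force individual lookups: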
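
As a rough illustration of the ANALYZE suggestion, the sketch below fills the table with placeholder data (50000 rows with distinct IDs, an assumption made only for this demo), runs ANALYZE, and prints the gathered statistics together with the plan for the 10-term query. Whether the plan changes depends on the data and on the SQLite version, so no expected output is given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable(name TEXT, ID INTEGER)")
conn.execute("CREATE INDEX id_idx ON sometable (ID ASC)")

# Placeholder data (assumption): 50000 rows, one row per ID.
conn.executemany(
    "INSERT INTO sometable(name, ID) VALUES (?, ?)",
    ((f"row{i}", i) for i in range(50000)),
)

# ANALYZE stores per-index statistics in the sqlite_stat1 table,
# which the query planner reads when estimating costs.
conn.execute("ANALYZE")
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
print(stats)

# Re-check the plan for the 10-term query from the question.
terms = " OR ".join(f"ID BETWEEN {10 * i} AND {10 * i + 1}" for i in range(1, 11))
plan = [row[-1] for row in
        conn.execute("EXPLAIN QUERY PLAN SELECT * FROM sometable WHERE " + terms)]
for line in plan:
    print(line)
```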

SELECT * FROM sometable WHERE ID BETWEEN 10 AND 11
UNION ALL
SELECT * FROM sometable WHERE ID BETWEEN 20 AND 21
UNION ALL
SELECT * FROM sometable WHERE ID BETWEEN 30 AND 31 
...
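
When the list of ranges is only known at run time, the UNION ALL statement above can be generated rather than typed out. The following is a minimal sketch using Python's sqlite3 module with bound parameters; union_all_query and fetch_ranges are hypothetical helper names, and the 200 rows of test data are placeholders.

```python
import sqlite3

def union_all_query(n_ranges):
    """One range lookup per SELECT, combined with UNION ALL."""
    part = "SELECT * FROM sometable WHERE ID BETWEEN ? AND ?"
    return "\nUNION ALL\n".join([part] * n_ranges)

def fetch_ranges(conn, ranges):
    """ranges is a list of (low, high) pairs. They should not overlap,
    otherwise UNION ALL returns the shared rows more than once."""
    params = [bound for pair in ranges for bound in pair]
    return conn.execute(union_all_query(len(ranges)), params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable(name TEXT, ID INTEGER)")
conn.execute("CREATE INDEX id_idx ON sometable (ID ASC)")
conn.executemany(
    "INSERT INTO sometable(name, ID) VALUES (?, ?)",
    ((f"row{i}", i) for i in range(200)),  # placeholder data
)

# The ten ranges from the question: 10-11, 20-21, ..., 100-101.
ranges = [(10 * i, 10 * i + 1) for i in range(1, 11)]
rows = fetch_ranges(conn, ranges)
print(len(rows))
```

Note that SQLite caps the number of terms in a compound SELECT (500 by default) and the number of bound parameters per statement, so a very long list of ranges may have to be processed in batches.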
answered 2013-02-12T11:35:44.780

I did not load any data into the table... I could load a sequence and try it, I guess.

SELECT * FROM sometable WHERE ID%10 BETWEEN 0 AND 1;
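
One way to check this variant before relying on it is to look at its query plan and at the rows it returns. The sketch below uses Python's sqlite3 module with placeholder data (IDs 0 to 199). Two points are worth verifying: the condition is on the expression ID%10 rather than on the indexed column itself, so a plain index on ID would not be expected to narrow the search, and the condition matches every ID ending in 0 or 1, not only the ranges listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable(name TEXT, ID INTEGER)")
conn.execute("CREATE INDEX id_idx ON sometable (ID ASC)")
conn.executemany(
    "INSERT INTO sometable(name, ID) VALUES (?, ?)",
    ((f"row{i}", i) for i in range(200)),  # placeholder data
)

sql = "SELECT * FROM sometable WHERE ID%10 BETWEEN 0 AND 1"

# Show how SQLite plans the modulo query (detail is the last column).
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
print(plan)

# Show which IDs the condition actually selects.
ids = sorted(row[1] for row in conn.execute(sql))
print(ids)
```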

answered 2018-08-03T17:41:44.810