sql - Is count(*) constant time in SQLite, and if not what are alternatives?

Question

I'm looking for the best way to count how many rows there are in a large (15 million+ rows) table. The naive way of select count(*) from table; is apparently O(n) according to a few older posts I've found on the matter, e.g. http://osdir.com/ml/sqlite-users/2010-07/msg00437.html.

Is there a constant time mechanism to get this information, or failing that are there preferred alternatives to the straightforward select count(*) query?

score 5 · Accepted Answer

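SQLite has a special optimization for COUNT(*) without a WHERE clause: it walks the table's B-tree pages and counts the entries without actually loading the records. However, this still requires visiting all of the table's data (except the overflow pages of large records), so the runtime is still O(n).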

SQLite does not store a separate record count in the database, because that would make every change slower.
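If the extra write cost is acceptable for a particular table, one common workaround is to keep such a count yourself. The following is only a sketch of that idea using triggers, written with Python's sqlite3 module; the row_counts table and the trigger names are hypothetical and not something SQLite provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (a);

    -- Hypothetical side table: one counter row per tracked table.
    CREATE TABLE row_counts (tbl TEXT PRIMARY KEY, n INTEGER NOT NULL);
    INSERT INTO row_counts VALUES ('test', 0);

    -- Keep the counter in step with every insert and delete.
    CREATE TRIGGER test_count_ins AFTER INSERT ON test
    BEGIN
        UPDATE row_counts SET n = n + 1 WHERE tbl = 'test';
    END;
    CREATE TRIGGER test_count_del AFTER DELETE ON test
    BEGIN
        UPDATE row_counts SET n = n - 1 WHERE tbl = 'test';
    END;
""")

conn.executemany("INSERT INTO test VALUES (?)", [(i,) for i in range(1000)])
conn.execute("DELETE FROM test WHERE a < 100")

# Reading the counter touches a single row instead of scanning the table.
cached = conn.execute("SELECT n FROM row_counts WHERE tbl = 'test'").fetchone()[0]
actual = conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]
print(cached, actual)
```

Every insert and delete now pays for an extra UPDATE, which is exactly the slowdown described above, so this only makes sense when counts are read far more often than rows are written.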

score 3 · Accepted Answer

No, it is not constant time.

sqlite> CREATE TABLE test ( a );
sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM test;
0|0|0|SCAN TABLE test (~1000000 rows)
sqlite> EXPLAIN QUERY PLAN SELECT COUNT(1) FROM test;
0|0|0|SCAN TABLE test (~1000000 rows)

You can use EXPLAIN QUERY PLAN SELECT ... to get an idea of how a query will perform.
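The same check can be run from code. A minimal sketch with Python's sqlite3 module follows; the plan() helper is just an illustrative name, and the exact wording of the plan text differs between SQLite versions (newer releases print "SCAN test" rather than "SCAN TABLE test").

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a)")

def plan(sql):
    """Return the human-readable detail text of each query plan row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The detail string is the last column of every plan row.
    return [row[-1] for row in rows]

count_star = plan("SELECT COUNT(*) FROM test")
count_one = plan("SELECT COUNT(1) FROM test")
print(count_star)
print(count_one)
```

Both statements report a full scan of the table, which is what makes them O(n).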

score 2 · Accepted Answer

As a workaround, you could query the ROWID. If you never delete from the table it will be accurate; otherwise it will be too high.

select max(rowid) from table
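A small sketch of how this behaves, assuming a table whose rowids are assigned automatically starting from 1 (no explicit rowid values are inserted). Finding the maximum rowid should only need a seek to the end of the table B-tree rather than a full scan, but the result drifts from the true count as soon as rows are deleted:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a)")
conn.executemany("INSERT INTO test VALUES (?)", [(i,) for i in range(10)])

# With no deletes, max(rowid) matches the real row count.
before = conn.execute("SELECT max(rowid), count(*) FROM test").fetchone()

# Delete two rows from the middle: max(rowid) is unchanged, count(*) drops.
conn.execute("DELETE FROM test WHERE a IN (2, 5)")
after = conn.execute("SELECT max(rowid), count(*) FROM test").fetchone()

print(before, after)
```

Note that max(rowid) returns NULL (None in Python) for an empty table, so callers would have to treat that case as zero.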

score 1 · Accepted Answer

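While I agree with the other answers that it is not constant time, an interesting and non-obvious performance improvement for select count(*) is to add an index if the table doesn't have one. The index can be on an arbitrary column, and on my system it reduced the query time by roughly 75%.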

sqlite> select count(*) from TestTable;
15035000
CPU Time: user 0.468003 sys 4.368028

sqlite> select count(*) from TestTable;
15035000
CPU Time: user 0.561604 sys 4.290027

sqlite> select count(*) from TestTable;
15035000
CPU Time: user 0.483603 sys 4.368028

sqlite> explain query plan select count(*) from TestTable;
0|0|0|SCAN TABLE TestTable (~1000000 rows)

sqlite> create index test_index on TestTable(Pointer);

sqlite> select count(*) from TestTable;
15035000
CPU Time: user 0.062400 sys 0.780005

sqlite> select count(*) from TestTable;
15035000
CPU Time: user 0.187201 sys 0.655204

sqlite> select count(*) from TestTable;
15035000
CPU Time: user 0.140401 sys 0.748805

sqlite> explain query plan select count(*) from TestTable;
0|0|0|SCAN TABLE TestTable USING COVERING INDEX test_index(~1000000 rows)
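A likely explanation is that the index B-tree is much smaller than the table itself, so counting its entries reads fewer pages. The sketch below reproduces the change in query plan with Python's sqlite3 module; the Payload column and the row contents are assumptions made up for the demo (only the Pointer column appears in the transcript), and it checks the plan rather than timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Payload is a hypothetical wide column so that the index is smaller than the table.
conn.execute("CREATE TABLE TestTable (Pointer INTEGER, Payload TEXT)")
conn.executemany(
    "INSERT INTO TestTable VALUES (?, ?)",
    [(i, "x" * 100) for i in range(5000)],
)

def plan_detail(sql):
    """Join the detail text of all EXPLAIN QUERY PLAN rows into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT count(*) FROM TestTable"
plan_before = plan_detail(query)

conn.execute("CREATE INDEX test_index ON TestTable(Pointer)")
plan_after = plan_detail(query)

total = conn.execute(query).fetchone()[0]
print(plan_before)
print(plan_after)
print(total)
```

The count is still O(n) in the number of index entries; the index only shrinks the constant factor.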

score 1 · Accepted Answer

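This is a good question. I wish SQLite had a catalog containing the row counts of tables. select count(*) from table; is your best option, at O(n). You could compare the performance of select count(1) from table; against count(*); I would guess that count(1) and count(*) give you similar speed. Unfortunately, SQLite offers no alternative to count(*) for getting the number of rows.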

SQL Server, on the other hand, has the very useful sys.dm_db_partition_stats.

score 0 · Accepted Answer

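I believe you can use the sqlite_stat1 table to retrieve the number of rows in a table after running ANALYZE on it. The documentation for its stat column says: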

The first integer in this list is the number of rows in the index and in the table.

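The statistics in this table are not updated along with your data, so they become less accurate as the table changes. How useful this is depends on your use case.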

ANALYZE probably takes about as long as COUNT(*), but it caches the result (and some other statistics) for you.
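A sketch of what this looks like in practice with Python's sqlite3 module. The test_a index and the stat_row_count() helper are hypothetical names for the demo; the index is there so that the stat string describes an index as in the quoted description.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode for simplicity.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE test (a)")
conn.execute("CREATE INDEX test_a ON test(a)")
conn.executemany("INSERT INTO test VALUES (?)", [(i,) for i in range(5)])

def stat_row_count():
    """Read the first integer of the stat string recorded for table 'test'."""
    row = conn.execute(
        "SELECT stat FROM sqlite_stat1 WHERE tbl = 'test'"
    ).fetchone()
    return int(row[0].split()[0])

conn.execute("ANALYZE test")
fresh = stat_row_count()

# New rows do not update the statistics until ANALYZE runs again.
conn.executemany("INSERT INTO test VALUES (?)", [(i,) for i in range(5, 8)])
stale = stat_row_count()
actual = conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]

conn.execute("ANALYZE test")
refreshed = stat_row_count()
print(fresh, stale, actual, refreshed)
```

The sqlite_stat1 table does not exist until ANALYZE has been run at least once, so a query against it on a fresh database fails.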
