cassandra - Cassandra query on a secondary index with paging gets slower as data grows

Question

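When I query a secondary index with paging, the query gets slower as the data grows.
I thought that with paging, fetching one page would take the same time no matter how large the data grows. Is that true? Why is my query getting slower?

My simplified table is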

CREATE TABLE closed_executions (
  domain_id            uuid,
  workflow_id          text,
  start_time           timestamp,
  workflow_type_name   text,
  PRIMARY KEY  ((domain_id), start_time)
) WITH CLUSTERING ORDER BY (start_time DESC)
  AND COMPACTION = {
    'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  }
  AND GC_GRACE_SECONDS = 172800;

I created a secondary index

CREATE INDEX closed_by_type ON closed_executions (workflow_type_name);

I query with the following CQL

SELECT workflow_id, start_time, workflow_type_name 
FROM closed_executions 
WHERE domain_id = ? 
AND start_time >= ? 
AND start_time <= ? 
AND workflow_type_name = ?

and the code

query := v.session.Query(templateGetClosedWorkflowExecutionsByType,
        request.DomainUUID,
        common.UnixNanoToCQLTimestamp(request.EarliestStartTime),
        common.UnixNanoToCQLTimestamp(request.LatestStartTime),
        request.WorkflowTypeName).Consistency(gocql.One)
iter := query.PageSize(request.PageSize).PageState(request.NextPageToken).Iter()
// PageSize is 10 here, but could be in the thousands

Environment:

MacBook Pro
Cassandra: 3.11.0
GoCql: github.com/gocql/gocql master

Observations:
10K rows: under a second
100K rows: ~3 seconds
1M rows: ~17 seconds

Debug log:

INFO  [ScheduledTasks:1] 2018-09-11 16:29:48,349 NoSpamLogger.java:91 - Some operations were slow, details available at debug level (debug.log)
DEBUG [ScheduledTasks:1] 2018-09-11 16:29:48,357 MonitoringTask.java:173 - 1 operations were slow in the last 5005 msecs:
<SELECT * FROM cadence_visibility.closed_executions WHERE workflow_type_name = code.uber.internal/devexp/cadence-bench/load/basic.stressWorkflowExecute AND token(domain_id, domain_partition) >= token(d3138e78-abe7-48a0-adb9-8c466a9bb3fa, 0) AND token(domain_id, domain_partition) <= token(d3138e78-abe7-48a0-adb9-8c466a9bb3fa, 0) AND start_time >= 2018-09-11 16:29-0700 AND start_time <= 1969-12-31 16:00-0800 LIMIT 10>, time 2747 msec - slow timeout 500 msec
DEBUG [COMMIT-LOG-ALLOCATOR] 2018-09-11 16:31:47,774 AbstractCommitLogSegmentManager.java:107 - No segments in reserve; creating a fresh one
DEBUG [ScheduledTasks:1] 2018-09-11 16:40:22,922 ColumnFamilyStore.java:899 - Enqueuing flush of size_estimates: 23.997MiB (2%) on-heap, 0.000KiB (0%) off-heap

Related references (none of them answer my question):

-- Edit: tablestats returns

Total number of tables: 105

----------------
Keyspace : cadence_visibility
    Read Count: 19
    Read Latency: 0.5125263157894736 ms.
    Write Count: 3220964
    Write Latency: 0.04900822269357869 ms.
    Pending Flushes: 0
        Table: closed_executions
        SSTable count: 1
        SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
        Space used (live): 20.3 MiB
        Space used (total): 20.3 MiB
        Space used by snapshots (total): 0 bytes
        Off heap memory used (total): 6.35 KiB
        SSTable Compression Ratio: 0.40192660515179696
        Number of keys (estimate): 3
        Memtable cell count: 28667
        Memtable data size: 7.35 MiB
        Memtable off heap memory used: 0 bytes
        Memtable switch count: 9
        Local read count: 9
        Local read latency: NaN ms
        Local write count: 327024
        Local write latency: NaN ms
        Pending flushes: 0
        Percent repaired: 0.0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 16 bytes
        Bloom filter off heap memory used: 8 bytes
        Index summary off heap memory used: 38 bytes
        Compression metadata off heap memory used: 6.3 KiB
        Compacted partition minimum bytes: 150
        Compacted partition maximum bytes: 62479625
        Compacted partition mean bytes: 31239902
        Average live cells per slice (last five minutes): NaN
        Maximum live cells per slice (last five minutes): 0
        Average tombstones per slice (last five minutes): NaN
        Maximum tombstones per slice (last five minutes): 0
        Dropped Mutations: 0 bytes

----------------

score 0 · Accepted Answer

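Why doesn't paging scale the way it does on the main table?
The data in your secondary index is scattered. Paging only applies its logic until the page size is reached, but because the index entries are not clustered by time, Cassandra still has to sift through a large number of rows before it finds the first 10.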

The query trace does show that paging comes into play at a very late stage.
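
To make the point about scattered data concrete, here is a toy model in Go. It is not Cassandra code: it assumes, purely for illustration, that exactly 100 rows fall inside the requested time window and that they are spread evenly among n entries that are not ordered by time. The names scanUntilPage, scattered and clustered are made up for this sketch.

```go
package main

import "fmt"

// scanUntilPage walks entries in their stored order and counts how many
// must be examined before pageSize of them pass the filter.
func scanUntilPage(inWindow []bool, pageSize int) int {
	examined, found := 0, 0
	for _, ok := range inWindow {
		examined++
		if ok {
			found++
			if found == pageSize {
				break
			}
		}
	}
	return examined
}

// scattered models the assumption stated above: the 100 rows inside the
// time window are spread evenly among n entries not ordered by time.
func scattered(n int) []bool {
	gap := n / 100
	entries := make([]bool, n)
	for i := gap - 1; i < n; i += gap {
		entries[i] = true
	}
	return entries
}

// clustered models rows stored in time order, so the same 100 rows sit
// next to each other and a read can start directly at the first one.
func clustered() []bool {
	entries := make([]bool, 100)
	for i := range entries {
		entries[i] = true
	}
	return entries
}

func main() {
	const pageSize = 10
	for _, n := range []int{10000, 100000, 1000000} {
		fmt.Printf("n=%d scattered=%d clustered=%d\n",
			n, scanUntilPage(scattered(n), pageSize), scanUntilPage(clustered(), pageSize))
	}
}
```

Under these assumptions the clustered read always examines 10 entries, while the scattered read examines n/10 entries (1000, 10000 and 100000 for the three sizes), so the cost of the first page grows with the data even though the page size stays at 10.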

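Why is the secondary index slow?
Cassandra first reads the index table to retrieve the primary keys of all matching rows, and then, for each of them, reads the original table to fetch the data. This is a known anti-pattern with low-cardinality indexes. (Reference: https://www.datastax.com/dev/blog/cassandra-native-secondary-index-deep-dive)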
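
The two-step read described above can be sketched with plain Go maps. This is a simplified model of the idea, not Cassandra's storage engine: the table type, its index map and the fill helper are hypothetical, and a map lookup stands in for one base-table read.

```go
package main

import "fmt"

// primaryKey identifies one row of the toy base table.
type primaryKey struct {
	domainID  string
	startTime int
}

// table holds base rows plus an index from workflow type name to the
// primary keys of the rows that carry that type.
type table struct {
	rows  map[primaryKey]string
	index map[string][]primaryKey
}

func newTable() *table {
	return &table{rows: map[primaryKey]string{}, index: map[string][]primaryKey{}}
}

func (t *table) insert(pk primaryKey, typeName string) {
	t.rows[pk] = typeName
	t.index[typeName] = append(t.index[typeName], pk)
}

// lookup reads the index entry for typeName, then performs one base-table
// read per primary key found there. It returns the number of base reads.
func (t *table) lookup(typeName string) int {
	baseReads := 0
	for _, pk := range t.index[typeName] {
		if _, ok := t.rows[pk]; ok {
			baseReads++
		}
	}
	return baseReads
}

// fill inserts n rows whose type names cycle through `distinct` values.
func fill(n, distinct int) *table {
	t := newTable()
	for i := 0; i < n; i++ {
		t.insert(primaryKey{domainID: "d", startTime: i}, fmt.Sprintf("type-%d", i%distinct))
	}
	return t
}

func main() {
	const n = 100000
	for _, distinct := range []int{2, 1000} {
		fmt.Printf("distinct=%d baseReads=%d\n", distinct, fill(n, distinct).lookup("type-0"))
	}
}
```

With 100000 rows, an index over 2 distinct values leads to 50000 base reads for one value, while an index over 1000 distinct values leads to 100. In this model the work per lookup is roughly the row count divided by the cardinality, which is why a low-cardinality index gets more expensive as the table grows.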
