cassandra - Cassandra: high CPU usage and unresponsive database, probably due to a stuck secondary index build - how to stop the index build process?

Question

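On a single-node installation of Cassandra 3.7 in a VM running Debian, I have a table with about 20 million rows. To be able to select the data inserted during the last few days, I used Datastax DevCenter 1.6.0 to execute a statement that creates a secondary index on the column holding the insert date: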

CREATE INDEX srsdata_datetimeinserted ON ccp.srsdata(datetimeinserted);

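The statement itself returned quickly and then, as I understand it, the index build started in the background, with the CPU load on one of the cores close to 100%. The problem is that this CPU load has now persisted for more than 24 hours, and it resumes even after several restarts of the virtual machine.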

To check the progress of the index build, I ran

nodetool compactionstats

but it seems to have been stuck at 5.78% almost from the start and has not changed at all over the last 24 hours:

pending tasks: 1
- ccp.srsdata: 1

id                                   compaction type       keyspace table   completed total    unit  progress
2616e5d0-c217-11e6-bbed-073889a74ba2 Secondary index build ccp      srsdata 655814    11350989 bytes 5.78%
Active compaction remaining time :   0h00m00s
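
For anyone wanting to confirm that a build like this is really stalled rather than just slow, here is a minimal Python sketch that parses one task row of the output above and compares two samples taken some time apart. It assumes the column layout shown in this post (id, compaction type, keyspace, table, completed, total, unit, progress); other Cassandra versions may print different columns, and the names parse_compaction_row, progress_percent and is_stalled are made up for illustration.

```python
from typing import NamedTuple


class CompactionRow(NamedTuple):
    task_id: str
    compaction_type: str
    keyspace: str
    table: str
    completed: int
    total: int
    unit: str


def parse_compaction_row(line: str) -> CompactionRow:
    """Parse one task row, assuming the column layout shown in the post."""
    parts = line.split()
    if len(parts) < 8:
        raise ValueError(f"unexpected compactionstats row: {line!r}")
    # The compaction type may contain spaces ("Secondary index build"),
    # so the fixed columns are taken from both ends of the row.
    return CompactionRow(
        task_id=parts[0],
        compaction_type=" ".join(parts[1:-6]),
        keyspace=parts[-6],
        table=parts[-5],
        completed=int(parts[-4]),
        total=int(parts[-3]),
        unit=parts[-2],
    )


def progress_percent(row: CompactionRow) -> float:
    """Recompute the progress column from completed / total."""
    return 100.0 * row.completed / row.total


def is_stalled(previous: CompactionRow, current: CompactionRow) -> bool:
    """True if the same task made no progress between two samples."""
    return (
        previous.task_id == current.task_id
        and current.completed <= previous.completed
    )


row = parse_compaction_row(
    "2616e5d0-c217-11e6-bbed-073889a74ba2 Secondary index build "
    "ccp      srsdata 655814    11350989 bytes 5.78%"
)
print(row.compaction_type, round(progress_percent(row), 2))
```

Two samples with the same id and the same completed value, as in my case over 24 hours, would make is_stalled return True.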

I can select from the table but cannot insert data, not even into other tables. When I try, I get

"Cassandra timeout during write query at consistency ONE 
(1 replica were required but only 0 acknowledged the write)"

If I try to drop the index,

DROP INDEX srsdata_datetimeinserted;

I get

"Timed out waiting for server response".

I tried to stop the index build with

nodetool stop INDEX_BUILD

but that made no difference.

What can I do to stop and restart the index creation? Or is something else running that I have not thought of?

Update 2017-01-12

I never managed to stop the index build, so I ended up restoring the virtual server from a backup taken before the index was created.

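I also discovered the new SASI indexes introduced in Cassandra 3.4 (http://www.doanduyhai.com/blog/?p=2058), in particular the SPARSE index mode, which is meant for data that is close to unique, such as millisecond timestamps. In fact, at most 5 identical values are allowed. So I created a SASI index with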

CREATE CUSTOM INDEX srsdata_datetimeinserted ON ccp.srsdata (datetimeinserted) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'mode': 'SPARSE' };

Creating it took about 20 minutes and it seems to work fine. I can now run queries such as

select * from ccp.srsdata where datetimeinserted >= '2017-01-01 00:00:00+0000' AND datetimeinserted < '2017-01-01 15:00:00+0000';
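
Since the goal was to select data from a recent time window, here is a rough Python sketch that builds the same kind of half-open range query for arbitrary start and end times. The helper names cql_timestamp and build_range_query are hypothetical, and the sketch only formats a string for pasting into a CQL client; an application using a driver would normally bind the timestamps as parameters instead.

```python
from datetime import datetime, timezone


def cql_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime like '2017-01-01 00:00:00+0000'."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S%z")


def build_range_query(start: datetime, end: datetime) -> str:
    """Build a half-open [start, end) range query on datetimeinserted."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    return (
        "SELECT * FROM ccp.srsdata "
        f"WHERE datetimeinserted >= '{cql_timestamp(start)}' "
        f"AND datetimeinserted < '{cql_timestamp(end)}';"
    )


query = build_range_query(
    datetime(2017, 1, 1, 0, 0, tzinfo=timezone.utc),
    datetime(2017, 1, 1, 15, 0, tzinfo=timezone.utc),
)
print(query)
```

The lower bound is inclusive and the upper bound exclusive, so consecutive windows do not overlap.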
