cassandra - Cassandra 1.2: merging data from memtables and sstables takes a long time

Question

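This is a trace from a 4-node Cassandra cluster running 1.2.6. With the cluster under no load, I am seeing a simple select time out, and I need some help getting to the bottom of it.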

 activity                                                                | timestamp    | source        | source_elapsed
-------------------------------------------------------------------------+--------------+---------------+----------------
                                                      execute_cql3_query | 05:21:00,848 | 100.69.176.51 |              0
 Parsing select * from user_scores where user_id='26257166' LIMIT 10000; | 05:21:00,848 | 100.69.176.51 |             77
                                                      Peparing statement | 05:21:00,848 | 100.69.176.51 |            225
                         Executing single-partition query on user_scores | 05:21:00,849 | 100.69.176.51 |            589
                                            Acquiring sstable references | 05:21:00,849 | 100.69.176.51 |            626
                                             Merging memtable tombstones | 05:21:00,849 | 100.69.176.51 |            676
                                            Key cache hit for sstable 34 | 05:21:00,849 | 100.69.176.51 |            817
                             Seeking to partition beginning in data file | 05:21:00,849 | 100.69.176.51 |            836
                                            Key cache hit for sstable 32 | 05:21:00,849 | 100.69.176.51 |           1135
                             Seeking to partition beginning in data file | 05:21:00,849 | 100.69.176.51 |           1153
                              Merging data from memtables and 2 sstables | 05:21:00,850 | 100.69.176.51 |           1394
                                                        Request complete | 05:21:20,881 | 100.69.176.51 |       20033807
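
To see where the time goes in a trace like this, it can help to compute the gap between consecutive events on each node. Below is a minimal Python sketch, assuming the trace is pasted as plain text in the cqlsh table layout and that source_elapsed is in microseconds. The names parse_trace and largest_gap are made up for illustration and are not part of any Cassandra tooling.

```python
# Sample rows copied from the trace above (shortened column widths).
TRACE = """
 Merging memtable tombstones                 | 05:21:00,849 | 100.69.176.51 |      676
 Key cache hit for sstable 34                | 05:21:00,849 | 100.69.176.51 |      817
 Seeking to partition beginning in data file | 05:21:00,849 | 100.69.176.51 |      836
 Merging data from memtables and 2 sstables  | 05:21:00,850 | 100.69.176.51 |     1394
 Request complete                            | 05:21:20,881 | 100.69.176.51 | 20033807
"""


def parse_trace(text):
    """Parse cqlsh trace rows into (activity, source, elapsed_us) tuples."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Data rows have four columns and a numeric source_elapsed;
        # the header and separator lines are skipped by this check.
        if len(parts) != 4 or not parts[3].isdigit():
            continue
        rows.append((parts[0], parts[2], int(parts[3])))
    return rows


def largest_gap(rows):
    """Return (gap_us, before, after) for the biggest jump on a single source."""
    by_source = {}
    for activity, source, elapsed in rows:
        by_source.setdefault(source, []).append((elapsed, activity))
    best = None
    for events in by_source.values():
        events.sort()  # source_elapsed is only comparable within one node
        for (t0, a0), (t1, a1) in zip(events, events[1:]):
            gap = t1 - t0
            if best is None or gap > best[0]:
                best = (gap, a0, a1)
    return best


if __name__ == "__main__":
    gap, before, after = largest_gap(parse_trace(TRACE))
    print(f"{gap / 1e6:.2f} s between '{before}' and '{after}'")
```

On the rows above, the largest gap is the roughly 20 seconds between "Merging data from memtables and 2 sstables" and "Request complete", so the merge step is the last thing traced before the timeout.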

Here is the schema. You can see that it contains a few collections.

create table user_scores
(
    user_id varchar,
    post_type varchar,
    score double,
    team_to_score_map map<varchar, double>,
    affiliation_to_score_map map<varchar, double>,
    campaign_to_score_map map<varchar, double>,
    person_to_score_map map<varchar, double>,
    primary key(user_id, post_type)
)
with compaction =
{
  'class' : 'LeveledCompactionStrategy',
  'sstable_size_in_mb' : 10
};

I added the leveled compaction strategy because it is supposed to help reduce read latency.

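I want to understand what could be causing the cluster to time out during the merge phase. Not every query times out. It seems to happen more often for rows whose maps have a large number of entries.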

Here is another trace of a failure, for good measure. It is very repeatable:

 activity                                                                | timestamp    | source         | source_elapsed
-------------------------------------------------------------------------+--------------+----------------+----------------
                                                      execute_cql3_query | 05:51:34,557 |  100.69.176.51 |              0
                                    Message received from /100.69.176.51 | 05:51:34,195 | 100.69.184.134 |            102
                         Executing single-partition query on user_scores | 05:51:34,199 | 100.69.184.134 |           3512
                                            Acquiring sstable references | 05:51:34,199 | 100.69.184.134 |           3741
                                             Merging memtable tombstones | 05:51:34,199 | 100.69.184.134 |           3890
                                             Key cache hit for sstable 5 | 05:51:34,199 | 100.69.184.134 |           4040
                             Seeking to partition beginning in data file | 05:51:34,199 | 100.69.184.134 |           4059
                              Merging data from memtables and 1 sstables | 05:51:34,200 | 100.69.184.134 |           4412
 Parsing select * from user_scores where user_id='26257166' LIMIT 10000; | 05:51:34,558 |  100.69.176.51 |             91
                                                      Peparing statement | 05:51:34,558 |  100.69.176.51 |            238
                               Enqueuing data request to /100.69.184.134 | 05:51:34,558 |  100.69.176.51 |            567
                                      Sending message to /100.69.184.134 | 05:51:34,558 |  100.69.176.51 |            979
                                                        Request complete | 05:51:54,562 |  100.69.176.51 |       20005209

And a trace of when it works:

 activity                                                                 | timestamp    | source         | source_elapsed
--------------------------------------------------------------------------+--------------+----------------+----------------
                                                       execute_cql3_query | 05:55:07,772 |  100.69.176.51 |              0
                                     Message received from /100.69.176.51 | 05:55:07,408 | 100.69.184.134 |             53
                          Executing single-partition query on user_scores | 05:55:07,409 | 100.69.184.134 |           1014
                                             Acquiring sstable references | 05:55:07,409 | 100.69.184.134 |           1087
                                              Merging memtable tombstones | 05:55:07,410 | 100.69.184.134 |           1209
                       Partition index with 0 entries found for sstable 5 | 05:55:07,410 | 100.69.184.134 |           1681
                              Seeking to partition beginning in data file | 05:55:07,410 | 100.69.184.134 |           1732
                               Merging data from memtables and 1 sstables | 05:55:07,411 | 100.69.184.134 |           2415
                                       Read 1 live and 0 tombstoned cells | 05:55:07,412 | 100.69.184.134 |           3274
                                     Enqueuing response to /100.69.176.51 | 05:55:07,412 | 100.69.184.134 |           3534
                                        Sending message to /100.69.176.51 | 05:55:07,412 | 100.69.184.134 |           3936
 Parsing select * from user_scores where user_id='305722020' LIMIT 10000; | 05:55:07,772 |  100.69.176.51 |             96
                                                       Peparing statement | 05:55:07,772 |  100.69.176.51 |            262
                                Enqueuing data request to /100.69.184.134 | 05:55:07,773 |  100.69.176.51 |            600
                                       Sending message to /100.69.184.134 | 05:55:07,773 |  100.69.176.51 |            847
                                    Message received from /100.69.184.134 | 05:55:07,778 |  100.69.176.51 |           6103
                                 Processing response from /100.69.184.134 | 05:55:07,778 |  100.69.176.51 |           6341
                                                         Request complete | 05:55:07,778 |  100.69.176.51 |           6780

score 1 · Accepted Answer

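It looks like I was hitting a performance issue in 1.2. Fortunately, a patch had just been applied to the 1.2 branch, so when I built from source my problem went away.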

For a detailed explanation, see https://issues.apache.org/jira/browse/CASSANDRA-5677.
