
I have a Python application using a Cassandra 1.2 cluster. The cluster has 7 physical nodes using virtual nodes, with a replication factor of 3 for one keyspace and 1 for another. The app uses the cql library to connect to Cassandra and run queries. The problem is that I've started getting errors when running selects against the database:

Request did not complete within rpc_timeout

When I check the status of the cluster, I can see one of my nodes with CPU usage over 100%, and the Cassandra system.log shows this popping up constantly:

 INFO [ScheduledTasks:1] 2013-06-07 02:02:01,640 StorageService.java (line 3565) Unable to reduce heap usage since there are no dirty column families
 INFO [ScheduledTasks:1] 2013-06-07 02:02:02,642 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 630 ms for 1 collections, 948849672 used; max is 958398464
 WARN [ScheduledTasks:1] 2013-06-07 02:02:02,643 GCInspector.java (line 142) Heap is 0.9900367202591844 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2013-06-07 02:02:02,685 StorageService.java (line 3565) Unable to reduce heap usage since there are no dirty column families
 INFO [ScheduledTasks:1] 2013-06-07 02:02:04,224 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 1222 ms for 2 collections, 931216176 used; max is 958398464
 WARN [ScheduledTasks:1] 2013-06-07 02:02:04,224 GCInspector.java (line 142) Heap is 0.9716378009554072 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2013-06-07 02:02:04,225 StorageService.java (line 3565) Unable to reduce heap usage since there are no dirty column families
 INFO [ScheduledTasks:1] 2013-06-07 02:02:05,226 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 709 ms for 1 collections, 942735576 used; max is 958398464
 WARN [ScheduledTasks:1] 2013-06-07 02:02:05,227 GCInspector.java (line 142) Heap is 0.9836572275641711 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2013-06-07 02:02:05,229 StorageService.java (line 3565) Unable to reduce heap usage since there are no dirty column families
 INFO [ScheduledTasks:1] 2013-06-07 02:02:06,946 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 1271 ms for 2 collections, 939532792 used; max is 958398464
 WARN [ScheduledTasks:1] 2013-06-07 02:02:06,946 GCInspector.java (line 142) Heap is 0.980315419203343 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically

Any ideas on how to solve this?

Thanks in advance!


2 Answers


It looks like your Cassandra JVM heap size may be too small, only about 1 GB:

max is 958398464

Assuming your nodes have memory to spare, I'd suggest increasing the heap to at least 2 GB.

See cassandra-env.sh for how the JVM heap allocation is calculated, or to set it to a specific value by hand.
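For example, the automatic calculation in conf/cassandra-env.sh can be overridden by setting both MAX_HEAP_SIZE and HEAP_NEWSIZE together; the 2G/400M figures below are illustrative, not a recommendation for every node:

```shell
# In conf/cassandra-env.sh -- if you override one of these you should
# set both. Pick sizes that fit the RAM actually free on your nodes.
MAX_HEAP_SIZE="2G"
HEAP_NEWSIZE="400M"
```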

answered 2013-09-04T17:38:18.900

What partitioner are you using, and what does your data model look like? How many records do you have, and how many should your query return? We'd need those details to pin down the right answer to your problem.

With Cassandra, data model design is very important. Unlike an RDBMS, where you can easily create an index on any column you need, a Cassandra column family must be defined so that data is distributed evenly across the cluster's nodes. Otherwise you get hot spots, or reads served from a single node, and I suspect that may be the cause of the rpc timeouts in your case.
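To illustrate the distribution point, here is a small Python sketch of how an MD5-based partitioner (roughly what Cassandra's RandomPartitioner does, simplified rather than byte-exact) spreads distinct row keys evenly over a 7-node cluster, while all traffic for a single key always lands on the same replica set:

```python
import hashlib
from collections import Counter

def token(key: bytes) -> int:
    # MD5-based token: a simplified sketch of RandomPartitioner-style
    # hashing, not Cassandra's exact byte layout.
    return int.from_bytes(hashlib.md5(key).digest(), "big")

NODES = 7

# 7000 distinct keys hash to the 7 "nodes" almost uniformly.
counts = Counter(token(f"row-{i}".encode()) % NODES for i in range(7000))
assert all(800 < c < 1200 for c in counts.values())

# A single key always maps to the same node: if one key dominates
# the workload, that node becomes a hot spot.
assert token(b"popular-key") % NODES == token(b"popular-key") % NODES
```

A schema whose partition key takes many distinct values behaves like the first case; one dominated by a handful of hot keys behaves like the second.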

If you want a more specific answer, please post more details. Thanks.

I hope this helps.

answered 2013-06-23T14:35:46.990