cassandra - Cassandra how many columns/row for optimal performance?

Question

I am writing a chat server and want to store my messages in Cassandra. I need range queries, and I expect about 100 messages/day with history kept for 6 months, so I will have about 18,000 messages for a user at any point.

Now, since I'll do range queries, I need my data to be on the same machine. Either I have to use ByteOrderedPartitioner, which I don't fully understand, or I can store all the messages for a user in the same row.

create table users_conversations(jid1 bigint, jid2 bigint, archiveid timeuuid, stanza text, primary key((jid1, jid2), archiveid)) with CLUSTERING ORDER BY (archiveid DESC );

So I'll have 18,000 columns. Do you think I'll have performance problems using this clustering key approach?

If yes, what alternative do I have?

Thanks

score 2 · Accepted Answer

Do not use the ByteOrderedPartitioner. I cannot stress that enough.

"Since I'll do range queries I need my data to be on the same machine."

With your PRIMARY KEY definition of:

primary key((jid1, jid2), archiveid)

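Your current partition key columns (jid1 and jid2) will be combined and hashed, so that all messages for a specific pair of jid1 and jid2 values are stored together in the same partition. The drawback is that every query needs both jid1 and jid2. Within the partition, however, the rows are sorted by archiveid, so you can run range queries on archiveid, and as long as you stay under the limit of 2 billion columns per partition it should perform well.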
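As a rough sketch of the kind of range query this key layout allows, the statement below reads one conversation's messages for a time window. The jid values and dates are made-up placeholders, not taken from the question; minTimeuuid() is a built-in CQL function that turns a timestamp into a timeuuid bound.

```cql
-- Hypothetical values: jid1 = 1001, jid2 = 1002, and the January 2014 window
-- are placeholders for illustration only.
SELECT archiveid, stanza
FROM users_conversations
WHERE jid1 = 1001 AND jid2 = 1002                          -- full partition key is required
  AND archiveid >= minTimeuuid('2014-01-01 00:00+0000')   -- start of range (inclusive)
  AND archiveid <  minTimeuuid('2014-02-01 00:00+0000')   -- end of range (exclusive)
LIMIT 50;
```

Because the table is declared with CLUSTERING ORDER BY (archiveid DESC), the rows should come back newest first without an explicit ORDER BY.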
