cassandra - Cassandra CQL schema best practices

Question

After getting a very good explanation of how secondary indexes work in Cassandra, I am back with a similar question.

CREATE TABLE update_audit (
  scopeid bigint,
  formid bigint,
  time timestamp,
  operation int,
  record_id bigint,
  ipaddress text,
  user_id bigint,
  value text,
  PRIMARY KEY ((scopeid), formid, time)
) WITH CLUSTERING ORDER BY (formid ASC, time DESC);

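FYI, the possible values of the operation column are 1, 2 and 3. Low cardinality.

record_link_id has high cardinality. Every entry can be unique.

According to "How do secondary indexes work in Cassandra?" and "The sweet spot for Cassandra secondary indexing", user_id is the best candidate for an index.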

Searches should work based on:

time, with limit 100.
operation and time, with limit 100.
user_id and time, with limit 100.
record_id and time, with limit 100.

Problem

Total records are more than 10,000M.

Which one is best? Either create indexes on operation, user_id and record_id and apply limit 100:

  1) Will the hidden column family for the index on operation return only 100 results?

  2) Will more seeks slow down the fetch operation?
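For reference, the index-based option could look roughly like the sketch below. The index names and the literal values are made up for illustration. The query restricts the partition key (scopeid) so that the index lookup stays within a single partition.

```cql
-- Hypothetical index names; one secondary index per filtered column.
CREATE INDEX update_audit_operation_idx ON update_audit (operation);
CREATE INDEX update_audit_user_id_idx ON update_audit (user_id);
CREATE INDEX update_audit_record_id_idx ON update_audit (record_id);

-- Up to 100 matching rows for one scope.
-- scopeid = 1 and operation = 2 are example values only.
SELECT * FROM update_audit WHERE scopeid = 1 AND operation = 2 LIMIT 100;
```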

Or create a new column family with a definition like the following:

CREATE TABLE audit_operation_idx (
  scopeid bigint,
  formid bigint,
  operation int,
  time timeuuid,
  PRIMARY KEY ((scopeid), formid, operation, time)
) WITH CLUSTERING ORDER BY (formid ASC, operation ASC, time DESC);

This requires two select queries for a single select operation.

So, if I create new column families for operation, user_id and record_id,

I will have to use a batch query to insert into all four column families.

   3) Will TCP problems occur while executing the batch query? The write volume will be huge.
   4) What else should I cover to avoid unnecessary problems?
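A batch that writes to all four column families might be sketched as follows. Only update_audit and audit_operation_idx are defined above. audit_user_idx and audit_record_idx are hypothetical tables, assumed here to mirror audit_operation_idx with user_id and record_id in place of operation. The statement is meant to be prepared once, with the application binding the values.

```cql
-- Logged batch: if any of the inserts is applied, all of them eventually are.
-- audit_user_idx and audit_record_idx are hypothetical (not defined in the question).
-- The application binds a timestamp for update_audit.time and a timeuuid
-- derived from the same instant for the time column of the index tables.
BEGIN BATCH
  INSERT INTO update_audit (scopeid, formid, time, operation, record_id, ipaddress, user_id, value)
    VALUES (?, ?, ?, ?, ?, ?, ?, ?);
  INSERT INTO audit_operation_idx (scopeid, formid, operation, time)
    VALUES (?, ?, ?, ?);
  INSERT INTO audit_user_idx (scopeid, formid, user_id, time)
    VALUES (?, ?, ?, ?);
  INSERT INTO audit_record_idx (scopeid, formid, record_id, time)
    VALUES (?, ?, ?, ?);
APPLY BATCH;
```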

score 0 · Accepted Answer

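There are three options.

Create a new table and use batch inserts. If the insert queries become large, you will have to configure the related parameters. Writes are cheap in Cassandra, so do not worry about the extra inserts.
Create a materialized view containing the columns needed by your where clause.
Create a secondary index if the cardinality is low. (Not recommended)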
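As an illustration of the second option, a materialized view for the operation search might look like the sketch below. The view name is made up, and this assumes a Cassandra version that supports materialized views (3.0 or later). A view's primary key must contain every primary key column of the base table, and it may add at most one other column, so a separate view would be needed for user_id and for record_id.

```cql
-- Hypothetical view name. Cassandra maintains the view on every write to update_audit.
CREATE MATERIALIZED VIEW update_audit_by_operation AS
  SELECT scopeid, formid, operation, time, record_id, ipaddress, user_id, value
  FROM update_audit
  WHERE scopeid IS NOT NULL AND formid IS NOT NULL
    AND operation IS NOT NULL AND time IS NOT NULL
  PRIMARY KEY ((scopeid), formid, operation, time)
  WITH CLUSTERING ORDER BY (formid ASC, operation ASC, time DESC);

-- Newest 100 rows for one scope, form and operation (example values only).
SELECT * FROM update_audit_by_operation
  WHERE scopeid = 1 AND formid = 10 AND operation = 2
  LIMIT 100;
```

With this layout the application issues a single read and no client-side batch, at the cost of extra work on the write path.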
