cassandra - Choosing the right schema for a cassandra "table" in CQL3

Question

We are trying to store many attributes for a particular profile_id in a table (using CQL3) and cannot work out which approach is best:

a. create table mytable (profile_id, a1 int, a2 int, a3 int, a4 int ... a3000 int) primary key (profile_id);

OR

b. create MANY tables, e.g. create table mytable_a1(profile_id, value int) primary key (profile_id); create table mytable_a2(profile_id, value int) primary key (profile_id); ... create table mytable_a3000(profile_id, value int) primary key (profile_id);

OR

c. create table mytable (profile_id, a_all text) primary key (profile_id); and store the 3000 "columns" inside a_all, like: insert into mytable (profile_id, a_all) values (1, "a1:1,a2:5,a3:55, ....a3000:5");

OR

d. none of the above

The type of query we would be running on this table: select * from mytable where profile_id in (1,2,3,4,5423,44)

We tried the first approach; the queries keep timing out and sometimes even kill cassandra nodes.

score 2 · Accepted Answer

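The answer is to use clustering columns. A clustering column lets you create dynamic columns that hold the attribute name (col name) and its value (col value).

The table would be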

create table mytable (
    profile_id text,   -- partition key: one partition per profile
    attr_name text,    -- clustering column: one row per attribute
    attr_value int,
    PRIMARY KEY(profile_id, attr_name)
);

This lets you insert rows such as

insert into mytable (profile_id, attr_name, attr_value) values ('131', 'a1', 3);
insert into mytable (profile_id, attr_name, attr_value) values ('131', 'a2', 1031);
.....
insert into mytable (profile_id, attr_name, attr_value) values ('131', 'an', 2);

This would be the optimal solution.
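As a rough sketch of how this layout is read back (the literal values are only illustrative), all attributes of a profile live in one partition and can be fetched together, singly, or as a slice of attribute names:

```cql
-- All attributes stored for one profile (a single-partition read).
SELECT attr_name, attr_value FROM mytable WHERE profile_id = '131';

-- One specific attribute of that profile.
SELECT attr_value FROM mytable WHERE profile_id = '131' AND attr_name = 'a2';

-- A slice of attributes; attr_name is text, so the order is lexical
-- ('a10' sorts before 'a2').
SELECT attr_name, attr_value FROM mytable
 WHERE profile_id = '131' AND attr_name >= 'a1' AND attr_name < 'a3';
```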

Because you then want to do the following: "The type of query we would be running on this table: select * from mytable where profile_id in (1,2,3,4,5423,44)"

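This requires 6 queries under the hood, one per profile_id, but cassandra should be able to do this in no time, especially if you have a multi-node cluster.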

Also, if you use the DataStax Java Driver, you can run these requests asynchronously and concurrently on your cluster.
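For illustration, and assuming profile_id stays text as in the table above (so the ids are quoted), the query from the question and the single-partition statements it corresponds to might look like this. The per-profile form is the one a client would send concurrently:

```cql
-- The question's query against the new table; profile_id is the partition key,
-- so IN is allowed here and one partition is read per listed id.
SELECT profile_id, attr_name, attr_value FROM mytable
 WHERE profile_id IN ('1', '2', '3', '4', '5423', '44');

-- Equivalent single-partition reads, one per profile_id.
SELECT attr_name, attr_value FROM mytable WHERE profile_id = '1';
SELECT attr_name, attr_value FROM mytable WHERE profile_id = '2';
-- ... likewise for '3', '4', '5423' and '44'
```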

For more on data modelling and the DataStax Java Driver, check out DataStax's free online training. It is worth a look: http://www.datastax.com/what-we-offer/products-services/training/virtual-training

Hope it helps.
