cassandra - Cassandra has a limit of 2 billion cells per partition, but what's a partition?

Question

The Cassandra Wiki says there is a limit of 2 billion cells (rows x columns) per partition, but it is unclear to me what a partition is.

Do we have one partition per node per column family, which would mean that the maximum size of a column family would be 2 billion cells * number of nodes in the cluster?

Or will Cassandra create as many partitions as required to store all the data of a column family?

I am starting a new project so I will use Cassandra 2.0.

score 66 · Accepted Answer

With the advent of CQL3, the terminology has changed slightly from the old Thrift terms.

Basically

CREATE TABLE foo (a int, b int, c int, d int, PRIMARY KEY ((a, b), c))

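will create a CQL3 table. The information in a and b is used to make the partition key, which determines which node the information will reside on. This is the "partition" referred to in the 2 billion cell limit.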

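Within that partition, the information is organized by c, known as the clustering key. Together, a, b and c define a unique value of d. In this case the number of cells in a partition would be c * d, that is, the number of c values times the d values stored under them. So in this example, for any given pair of a and b there can be at most 2 billion combinations of c and d.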
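As a rough illustration of the paragraph above, here is a minimal Python sketch. It is a toy in-memory model, not Cassandra code: the names `partitions` and `insert` are made up, and it counts one cell per stored d value, ignoring any internal bookkeeping Cassandra keeps.

```python
from collections import defaultdict

# Toy model of: PRIMARY KEY ((a, b), c) with one regular column d.
# partitions maps partition key (a, b) -> {clustering key c -> d}
partitions = defaultdict(dict)

def insert(a, b, c, d):
    """Upsert one row: (a, b) picks the partition, c picks the row inside it."""
    partitions[(a, b)][c] = d

insert(1, 1, 10, 100)
insert(1, 1, 20, 200)
insert(1, 1, 10, 999)   # same (a, b, c): overwrites d instead of adding a row
insert(2, 1, 10, 300)   # different (a, b): lands in a different partition

# Rows inside one partition are ordered by the clustering key c.
rows_in_partition = sorted(partitions[(1, 1)].items())

# Assumption: one cell per stored d value, so cells == rows here.
cells_per_partition = {key: len(rows) for key, rows in partitions.items()}
```

In this model the 2 billion limit applies to each entry of `cells_per_partition` separately, not to the table as a whole.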

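So as you model your data, you want to ensure that the primary key (specifically its partition key part) varies enough that your data is distributed randomly across Cassandra. Then use clustering keys to ensure that your data is available in the way you want it.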
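To make the modeling advice concrete, the following sketch compares two candidate partition keys for the same data. The workload (4 sensors, 30 days, 24 readings per day) and the column names are hypothetical and only serve to show how the choice of partition key changes partition sizes.

```python
from collections import Counter

# Hypothetical workload: 4 sensors, 30 days, 24 readings per day.
readings = [
    (sensor, day, hour)
    for sensor in range(4)
    for day in range(30)
    for hour in range(24)
]

# Candidate 1: partition key = (sensor). Few, large partitions.
by_sensor = Counter(sensor for sensor, day, hour in readings)

# Candidate 2: partition key = (sensor, day). Many, small partitions.
by_sensor_day = Counter((sensor, day) for sensor, day, hour in readings)

largest_coarse = max(by_sensor.values())    # 720 readings in one partition
largest_fine = max(by_sensor_day.values())  # 24 readings in one partition
```

The finer key yields 120 partitions instead of 4, so each one stays far from the per-partition limit and there are more keys to spread across nodes.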

Watch this video for more information on data modeling in Cassandra: The Data Model is Dead, Long Live the Data Model

Edit: Another example from the comments

CREATE TABLE foo (a int, b int, c int, d int, e int, f int, PRIMARY KEY ((a, b), c, d))

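The partition will be uniquely identified by the combination of a and b.

Within the partition, c and d are used to order the cells, so the layout will look a little like this: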

(a1,b1) --> [c1,d1 : e1], [c1,d1 : f1], [c1,d2 : e2] ....

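So in this example you can have 2 billion cells, with each cell containing:

A value of c
A value of d
Either a value of e or a value of f

So the 2 billion limit refers to the sum of the unique tuples of (c,d,e) and (c,d,f).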
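The layout and the cell count described above can be sketched as follows. This is a toy model of a single partition (a1, b1), not how Cassandra stores data on disk; the `write` helper and the string values are placeholders.

```python
# Toy model of PRIMARY KEY ((a, b), c, d) with regular columns e and f.
# Inside one partition, each cell is identified by (c, d, column_name).
partition = {}  # cells of the single partition (a1, b1)

def write(c, d, column, value):
    """Store one cell; writing the same (c, d, column) again overwrites it."""
    partition[(c, d, column)] = value

write("c1", "d1", "e", "e1")
write("c1", "d1", "f", "f1")
write("c1", "d2", "e", "e2")

# Cells sort by the clustering columns first, then by column name.
layout = sorted(partition.items())

# This count is what the per-partition limit bounds.
cell_count = len(partition)
LIMIT = 2_000_000_000
within_limit = cell_count <= LIMIT
```

Each (c, d) pair contributes one cell per non-key column that is actually written, which is why the limit is a sum over the (c,d,e) and (c,d,f) tuples.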

score 4 · Accepted Answer

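From: http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/create_table_r.html

Using a composite partition key

A composite partition key is a partition key consisting of multiple columns. You use an extra set of parentheses to enclose the columns that make up the composite partition key. The columns within the primary key definition but outside the nested parentheses are clustering columns. These columns form logical sets inside a partition to facilitate retrieval.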

CREATE TABLE Cats (
  block_id uuid,
  breed text,
  color text,
  short_hair boolean,
  PRIMARY KEY ((block_id, breed), color, short_hair)
);

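For example, the composite partition key consists of block_id and breed. The clustering columns, color and short_hair, determine the clustering order of the data. Generally, Cassandra will store columns having the same block_id but a different breed on different nodes, and columns having the same block_id and breed on the same node.

Implication

==> A partition is the smallest unit of replication (which by itself makes no sh** of sense. :) )

==> Every combination of block_id and breed is a partition.

==> On any given machine in the cluster, either all or none of the rows with the same partition key will exist.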
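The implications above can be illustrated with a small sketch. It is only a stand-in for Cassandra's partitioner: the MD5-based `node_for` function, the three node names, and the string block IDs (the table uses uuid) are all assumptions, and replication is ignored. The one property it shares with the real thing is that placement depends on the partition key alone.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical 3-node cluster

def node_for(block_id: str, breed: str) -> str:
    """Pick a node from the partition key only (toy stand-in for a partitioner)."""
    key = f"{block_id}:{breed}".encode("utf-8")
    token = int.from_bytes(hashlib.md5(key).digest()[:8], "big")
    return NODES[token % len(NODES)]

# Rows of the Cats table: (block_id, breed, color, short_hair)
rows = [
    ("block-1", "siamese", "white", True),
    ("block-1", "siamese", "black", False),
    ("block-1", "persian", "grey", False),
]

# Collect the nodes used by each partition key (block_id, breed).
placement = {}
for block_id, breed, color, short_hair in rows:
    placement.setdefault((block_id, breed), set()).add(node_for(block_id, breed))
```

Because color and short_hair never enter `node_for`, every partition key in `placement` maps to exactly one node, which matches the "all or none" statement. Whether two different partitions share a node depends on the hash.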
