database-design - Cassandra: Which schema for table-like mappings?

Question

I have tried different approaches but I can't find a solution to my problem: My data is table-like, meaning I have one data point (float) for each combination of inputs from a set of Strings:

(a mapping of S × S → ℝ )

I want to model the schema so that I can do the following lookups:

all pairs of strings with a value in a certain range
for a given input String, all Strings for which the mapped value is in a certain range
for a given combination of input Strings the mapped value

Since the mapping is symmetrical (m(x,y) == m(y,x) ), it would be great if I only had to store the
n*(n+1) / 2 unique values instead of the n^2 total mappings.

What I have tried so far:

S1+" "+S2 as row key and the value as column name
S1 as row key and a Composite key of [S2:value] as column name
S1 as row key, S2 as column name, value as column value.

but unfortunately, all these approaches don't let me do all the queries I need. Is this even possible in Cassandra?

score 0 · Accepted Answer

Cassandra does not support your first query (all pairs of strings with a value in a certain range), because it currently allows a range restriction only when the WHERE clause also contains at least one EQ restriction. However, your second and third queries are doable :)

Example

Consider the following example:

cqlsh:so> desc table string_mappings;
CREATE TABLE string_mappings (
  s1 ascii,
  s2 ascii,
  value float,
  PRIMARY KEY (s1, s2, value)
)

and we have the following tuples:

cqlsh:so> select * from string_mappings;

 s1    | s2    | value
-------+-------+-------
 hello | hello |     1
 hello | world |   0.2
 stack | hello |     0
 stack | stack |     1
 stack | world |     0
 world | world |     1

Your first query does not work, as Cassandra currently does not support range queries without an EQ in the WHERE clause:

cqlsh:so> select * from string_mappings where value>0.5;
Bad Request: PRIMARY KEY part value cannot be restricted (preceding part s2 is either not restricted or by a non-EQ relation)

However, the following range query (your second query) is fine since it has an EQ:

cqlsh:so> select * from string_mappings where value > 0.5 and s2='hello' allow filtering;

 s1    | s2    | value
-------+-------+-------
 hello | hello |     1

Remember to add the ALLOW FILTERING keyword, or you will get the following error:

cqlsh:so> select * from string_mappings where value > 0.5 and s2='hello';
Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING

Finally, your third query is also not a problem :)

cqlsh:so> select * from string_mappings where S1='hello' and S2='world';

 s1    | s2    | value
-------+-------+-------
 hello | world |   0.2
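
On the symmetry mentioned in the question (m(x,y) == m(y,x)): one possible way to store each pair only once is to order the two strings on the client before every write and every point lookup, so that (x,y) and (y,x) hit the same row. The following is only a rough sketch of that idea. It uses an in-memory dict as a stand-in for the string_mappings table rather than a real Cassandra driver, and the names canonical_pair and SymmetricMapping are made up for illustration.

```python
from typing import Dict, Tuple


def canonical_pair(a: str, b: str) -> Tuple[str, str]:
    """Order the two strings so that (a, b) and (b, a) yield the same key."""
    return (a, b) if a <= b else (b, a)


class SymmetricMapping:
    """In-memory stand-in for the string_mappings table (hypothetical helper,
    not a Cassandra client). Keys play the role of (s1, s2)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], float] = {}

    def put(self, a: str, b: str, value: float) -> None:
        # Always write under the canonical ordering, so each pair is stored once.
        self._rows[canonical_pair(a, b)] = value

    def get(self, a: str, b: str) -> float:
        # Apply the same ordering on reads; the argument order does not matter.
        return self._rows[canonical_pair(a, b)]

    def __len__(self) -> int:
        return len(self._rows)


if __name__ == "__main__":
    m = SymmetricMapping()
    m.put("world", "hello", 0.2)
    print(m.get("hello", "world"))
```

With n distinct strings this keeps at most n*(n+1)/2 rows. The trade-off is that a given string can then appear as either s1 or s2, so the second query would have to check both columns.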
