cassandra - get slice returning inconsistent results

Question

I'm using a wide column index to order records in a timeline fashion, a la:

"TimelineIndex" //CF name
  [CFName] //row key
    [TimeUUID]:[CFRowKey] //column name/value
    [TimeUUID]:[CFRowKey] //column name/value
    [TimeUUID]:[CFRowKey] //column name/value
    [TimeUUID]:[CFRowKey] //column name/value

Assume I have 10 records in the TimelineIndex CF with one column per day, ranging from '01/01/2013 12:00:00' to '10/01/2013 12:00:00' (as TimeUUIDs), and I run the following get_slice() command:

// Dates are dd/MM/yyyy: slice from 6 Jan 2013 12:00 to 11 Jan 2013 12:00.
var predicate = new SlicePredicate
{
  Slice_range = new SliceRange
  {
    Start = TimeGenerator.GetTimeUUID(new DateTime(2013, 1, 6, 12, 0, 0)),
    Finish = TimeGenerator.GetTimeUUID(new DateTime(2013, 1, 11, 12, 0, 0)),
    Count = 5,
    Reversed = false
  }
};
var results = client.get_slice([CFName], parent, predicate, ConsistencyLevel.ONE);

The columns returned by this query aren't always consistent. The majority of the time the column named '06/01/2013 12:00:00' is returned, but every so often (about 1 in 10 executions) that column is excluded from the results and I end up with only 4 columns returned.

I can't for the life of me figure out why I would be getting inconsistent results here. Can anyone suggest a reason for this?

And before anyone says it, I know it's not advisable to use Thrift directly - this is purely a proof of concept exercise!

score 4 · Accepted Answer

At the risk of belaboring the obvious, remember that TimeUUIDs (version 1 UUIDs) serve two purposes:

They have a time-based component
They are UUIDs

Thus, you can insert multiple time-based data values and get them back chronologically without worrying about losing data due to column name collisions.

Also remember that column names must be globally ordered for Cassandra to find your data correctly, and UUIDs are no exception. Thus, if you give Cassandra two TimeUUIDs with the same time component, it will order them based on their non-time components.

So, what's happening is a subtle interaction of the above two points: every time you run the query, you create a new TimeUUID for 06/01/2013 12:00:00 with random-ish non-time components. Sometimes that query UUID sorts before the one you inserted, and sometimes it sorts after it. When it sorts after, the stored column falls before the start of the slice, so the first column is not included.

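To fix this you'd need to deliberately construct the non-time components of the query UUID so that the Start value sorts as low as possible among UUIDs with the same timestamp (and, by the same logic, so that the Finish value sorts as high as possible). The pycassa library does this, for instance.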
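As a rough illustration of that idea, here is a minimal Python sketch using only the standard library. The helper names (`time_uuid`, `lowest_time_uuid`, `highest_time_uuid`, `sort_key`) are made up for this example and are not pycassa or Cassandra APIs. The `sort_key` function is only an assumed model of a "timestamp first, then the remaining fields" comparator; the exact tie-break Cassandra applies to the non-time bytes may differ (for example signed versus unsigned byte comparison), so check your version before relying on specific boundary values.

```python
import datetime
import random
import uuid

# 100-ns ticks between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def uuid_timestamp(dt):
    """Convert a naive UTC datetime to a version 1 UUID timestamp (100-ns ticks)."""
    delta = dt - datetime.datetime(1970, 1, 1)
    micros = (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds
    return micros * 10 + UUID_EPOCH_OFFSET


def time_uuid(dt, clock_seq, node):
    """Build a version 1 UUID from a datetime and explicit non-time components."""
    ts = uuid_timestamp(dt)
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000          # version 1
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80   # RFC 4122 variant
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def lowest_time_uuid(dt):
    """Slice start: smallest non-time components under the assumed ordering."""
    return time_uuid(dt, 0, 0)


def highest_time_uuid(dt):
    """Slice finish: largest non-time components under the assumed ordering."""
    return time_uuid(dt, 0x3FFF, 0xFFFFFFFFFFFF)


def random_time_uuid(dt):
    """What a typical generator does: random clock sequence and node."""
    return time_uuid(dt, random.getrandbits(14), random.getrandbits(48))


def sort_key(u):
    """ASSUMED comparator model: timestamp first, then clock sequence, then node."""
    return (u.time, u.clock_seq, u.node)


when = datetime.datetime(2013, 1, 6, 12, 0, 0)   # 06/01/2013 12:00:00 (dd/MM/yyyy)
stored = random_time_uuid(when)                   # the column that was inserted
start, finish = lowest_time_uuid(when), highest_time_uuid(when)
print(sort_key(start) <= sort_key(stored) <= sort_key(finish))
```

With a randomly generated start value, the comparison against `stored` can go either way from run to run, which matches the intermittent behaviour in the question; with the fixed lowest and highest bounds it cannot, at least under the ordering assumed here.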

score 0 · Answer

It appears that your issue may be related to your consistency level. You have 2 replicas, yet you are reading with consistency level of ONE. If you also write with ONE, you will run into issues as you describe. If you change your read level to QUORUM (or LOCAL_QUORUM), my guess is your data will never disappear. Sporadically disappearing data is almost always a consistency issue.

Why does this happen?

Using your setup of 3 nodes with RF=2, let's say you write column A with CL=ONE. Now you have one node (let's say N1) with column A, and the other node that would theoretically get the replica (let's say N2) does not yet have it. So you'll end up with this:

N1: has A
N2: does not have A
N3: will look to N1 or N2 for A

So now, let's see what would happen if, using CL=ONE, you ask each node for A:

N1: you get A
N2: you get nothing because it doesn't check with any other nodes
N3: you may get A or nothing, depending on whether the request gets handled by N1 or N2

If you read with CL=QUORUM:

N1: you get A, and N2 gets updated by read repair
N2: you get A, because it checks against N1 and repairs
N3: you get A, because both N1 and N2 will reliably return it

You can easily check to see if this is your issue by using QUORUM reads. If so, the problem will not reappear.
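The reasoning above is the usual replica-overlap rule: a read is guaranteed to touch at least one replica that holds the latest write only when the number of replicas read plus the number of replicas written exceeds the replication factor. Here is a small Python sketch of that arithmetic. The function names are invented for this example, only the ONE, QUORUM and ALL levels are modelled, and it deliberately ignores hinted handoff, read repair and anti-entropy repair, which can make stale reads rarer in practice without guaranteeing anything.

```python
def quorum(replication_factor):
    """Quorum size: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Number of replicas that must respond at a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: %s" % level)


def read_sees_latest_write(read_level, write_level, replication_factor):
    """True when the read and write replica sets are guaranteed to overlap."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


# RF=2, as in the scenario above.
print(read_sees_latest_write("ONE", "ONE", 2))      # 1 + 1 is not > 2
print(read_sees_latest_write("QUORUM", "ONE", 2))   # 2 + 1 > 2
```

Note that with RF=2 a quorum is both replicas, so QUORUM reads require every replica for the row to be up.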
