Summary
Our team inherited this sequence generator, which is implemented on top of Cassandra:
Table
CREATE TABLE IF NOT EXISTS sequences (
    id_name varchar,
    next_id bigint,
    instance_name varchar,
    PRIMARY KEY (id_name)
) WITH COMPRESSION = { ... };
GET_LOCK("UPDATE sequences USING TTL 10 set instance_name = ? where id_name = ? IF instance_name = null", ConsistencyLevel.LOCAL_QUORUM),
SELECT_SEQUENCE("SELECT next_id from sequences where id_name = ?", ConsistencyLevel.LOCAL_QUORUM),
UPDATE_SEQUENCE("UPDATE sequences SET next_id = ? where id_name = ? IF next_id = ?", ConsistencyLevel.LOCAL_QUORUM),
REMOVE_LOCK("UPDATE sequences set instance_name = null where id_name = ? IF instance_name = ?", ConsistencyLevel.LOCAL_QUORUM);
(note: ConsistencyLevel was set to LOCAL_SERIAL in Java)
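To make the intended flow concrete, here is a minimal in-memory sketch of how the four statements appear to fit together. It is not the inherited code. The class SequenceSketch and its method names are hypothetical, the single row is modeled as two Java fields, each conditional statement is modeled as a compare-and-set, and the 10-second TTL on the lock is not modeled at all.

```java
// Hypothetical in-memory stand-in for one row of the sequences table.
// Each method mirrors one of the statements above. Nothing here talks to Cassandra.
class SequenceSketch {
    private String instanceName = null; // instance_name column; null means no lock holder
    private long nextId;                // next_id column

    SequenceSketch(long start) {
        this.nextId = start;
    }

    // GET_LOCK: set instance_name only IF instance_name = null (TTL not modeled).
    public synchronized boolean getLock(String node) {
        if (instanceName != null) {
            return false; // condition failed, nothing applied
        }
        instanceName = node;
        return true;
    }

    // SELECT_SEQUENCE: plain read of next_id.
    public synchronized long selectSequence() {
        return nextId;
    }

    // UPDATE_SEQUENCE: set next_id only IF next_id = expected.
    public synchronized boolean updateSequence(long expected, long newValue) {
        if (nextId != expected) {
            return false; // condition failed, nothing applied
        }
        nextId = newValue;
        return true;
    }

    // REMOVE_LOCK: clear instance_name only IF instance_name = node.
    public synchronized boolean removeLock(String node) {
        if (instanceName == null || !instanceName.equals(node)) {
            return false; // condition failed, nothing applied
        }
        instanceName = null;
        return true;
    }

    // Assumed calling order: lock, read, conditional update, unlock.
    // Returns the allocated id, or -1 if the lock or the conditional update was not applied.
    public long allocate(String node) {
        if (!getLock(node)) {
            return -1;
        }
        try {
            long current = selectSequence();
            return updateSequence(current, current + 1) ? current : -1;
        } finally {
            removeLock(node);
        }
    }
}
```

The sketch treats a failed condition as a failed allocation at every step. With the DataStax Java driver, the equivalent signal for a conditional statement is ResultSet.wasApplied(). Whether the inherited code checks that result after GET_LOCK and UPDATE_SEQUENCE is not shown above.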
It worked fine until yesterday, when we found that two different Java app nodes had obtained the same sequence number.
Timestamps of the incident:
AppNode 1
getlock: 4:25:14.480
UpdateSequence: 4:25:14.486
AppNode 2
getlock: 4:25:14.489
UpdateSequence: 4:25:14.496
How could this happen? How can we find out what actually went wrong?