
I have a NoSQL database that we use for data processing, because my application can work with it faster than it can with SQL. I treat the NoSQL database almost like a cache: the SQL database is the authority for the data, and the NoSQL store is updated with changes. Right now the application does this itself, so when a request for a change comes in, it is written to both the SQL database and the NoSQL database. This sometimes fails: either the NoSQL update fails, or some other situation causes the NoSQL database to get out of sync.

I could run a batch update every X minutes, but the data stores hold a lot of information and it would take hours to make sure they are in sync. We have some timestamps that could be used to diff what has changed, but they are not always accurate.

What are some recommended strategies for keeping a data store (a secondary database used as a cache) in sync with my main store?


1 Answer


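I've done this in the past with messaging, specifically JMS and ActiveMQ. I would use a queue to send the updates to the NoSQL store (Mongo). That way messages can accumulate in the queue, and if the connection to the NoSQL store is cut, the consumer can pick up where it left off.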

It worked really well, because ActiveMQ is very stable and easy to use.
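To make the "pick up where it left off" behaviour concrete, here is a minimal sketch of the idea in Python. It is not JMS or ActiveMQ code: the in-memory `deque` stands in for a durable broker queue, and `FlakyStore`, `upsert`, and `drain` are hypothetical names invented for the illustration. The point it shows is that an update is removed from the queue only after the store has accepted it.

```python
from collections import deque


class FlakyStore:
    """Hypothetical stand-in for the NoSQL store; fails while `available` is False."""

    def __init__(self):
        self.docs = {}
        self.available = True

    def upsert(self, key, doc):
        if not self.available:
            raise ConnectionError("NoSQL store unreachable")
        self.docs[key] = doc


def drain(pending, store):
    """Apply queued updates in order and stop at the first failure.

    A message is removed only after the store accepts it, so a failed
    update stays at the head of the queue for the next attempt.
    Returns the number of updates applied.
    """
    applied = 0
    while pending:
        key, doc = pending[0]      # peek, do not remove yet
        try:
            store.upsert(key, doc)
        except ConnectionError:
            break                  # leave it queued and retry later
        pending.popleft()          # acknowledge only on success
        applied += 1
    return applied


pending = deque()                  # assumption: a real broker would persist this
store = FlakyStore()

pending.append(("user:1", {"name": "Ann"}))
pending.append(("user:2", {"name": "Bob"}))

store.available = False
first_attempt = drain(pending, store)    # store is down, both updates stay queued

store.available = True
second_attempt = drain(pending, store)   # resumes from the head of the queue
```

With a real broker, the same effect usually comes from persistent messages combined with acknowledging a message only after the write to the NoSQL store succeeds.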

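I've always seen this done with a diff like the one you mentioned. You introduce date fields across the board and then track the time of the latest sync. The nice thing about this approach is that it lets you easily replay transactions by moving the last sync date back.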
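A rough sketch of that last-sync bookkeeping, using SQLite from the Python standard library as the SQL side and a plain dict as a stand-in for the NoSQL store. The `items` table, its `updated_at` column, and the `sync_changes` function are assumptions made up for the example, and it only works if every write really does bump `updated_at`.

```python
import sqlite3


def sync_changes(conn, target, last_sync):
    """Copy rows modified after `last_sync` into `target`.

    Returns the new watermark and the number of rows copied.
    """
    rows = conn.execute(
        "SELECT id, payload, updated_at FROM items "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_sync,),
    ).fetchall()
    for row_id, payload, updated_at in rows:
        target[row_id] = payload
        last_sync = max(last_sync, updated_at)
    return last_sync, len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)", [(1, "a", 100), (2, "b", 200)]
)

nosql = {}                                   # stand-in for the NoSQL store
watermark, copied = sync_changes(conn, nosql, 0)

# A later change in SQL: only this row is picked up by the next sync.
conn.execute("UPDATE items SET payload = 'b2', updated_at = 300 WHERE id = 2")
watermark, copied_again = sync_changes(conn, nosql, watermark)

# Replay: pretend the NoSQL side was lost, then move the watermark back to 0.
nosql.clear()
_, replayed = sync_changes(conn, nosql, 0)
```

Moving the watermark back is safe here because each write is an overwrite by key, so replaying a row that was already applied changes nothing.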

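One last piece of advice: write good tooling around pumping data from point A to point B (in this case, SQL to NoSQL). At my last job I wrote several tools to bulk load the NoSQL store from SQL, and they made life easy whenever anything got really out of sync. Between the scripts and the bulk-load process, I could always recover.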
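As an illustration of what such a tool might look like, here is a small batch loader, again with SQLite and a dict standing in for the two stores. The `items` table and the `bulk_load` function are hypothetical. It overwrites entries by key and does not delete keys that no longer exist in SQL.

```python
import sqlite3


def bulk_load(conn, target, batch_size=1000):
    """Overwrite `target` with every row of the SQL table, in fixed-size batches.

    Returns the number of rows loaded.
    """
    cursor = conn.execute("SELECT id, payload FROM items ORDER BY id")
    loaded = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        target.update(batch)       # each row is an (id, payload) pair
        loaded += len(batch)
    return loaded


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)", [(i, f"p{i}") for i in range(1, 6)]
)

nosql = {1: "stale"}               # an out-of-sync entry in the secondary store
loaded = bulk_load(conn, nosql, batch_size=2)
```

Reading in batches keeps memory use bounded when the table is large, which matters if a full reload takes hours.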

Answered 2013-08-01T17:36:05.350