sql - Improving query performance on a database table with many columns and many rows (50 columns, 5 million rows)

Question

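We are building a caching solution for our user data. The data is currently stored in Sybase, spread across 5-6 tables, but the query service on top of it is built with Hibernate and performs very poorly. Loading the data into the cache takes 10 to 15 hours.

So we decided to create a denormalized table with 50-60 columns and 5 million rows in another relational database (UDB), populate that table first, and then fill the cache from the new denormalized table over JDBC, so that building the cache takes less time. This gave us better performance: we can now build the cache in about an hour, but that still does not meet our requirement of building the cache within 5 minutes. The denormalized table is queried with the following query: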

select * from users where user_id in (...)

Here the user id is the primary key. We also tried the query

select * from user where user_location in (...)

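and created a non-unique index on the location column, but that did not help either.

So is there a way to make the queries faster? If not, we are also open to considering a NoSQL solution.

Which NoSQL solution would suit our needs? Besides the table being large, we apply about 1 million updates to it every day.

I have read about MongoDB and it seems it might work, but nobody has posted any experience of using MongoDB with this many rows and this many daily updates.

Please let us know your thoughts.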

score 4 · Accepted Answer

The short answer, as far as MongoDB is concerned, is yes: it can be used this way, as a denormalized cache in front of an RDBMS. Others have used MongoDB to store datasets of similar (and larger) sizes to the one you describe, and have kept datasets of that size in RAM. Some details about your data are missing here, but this is certainly not beyond the capabilities of MongoDB, and it is one of the more frequently used implementations:

http://www.mongodb.org/display/DOCS/The+Database+and+Caching
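
To make the idea of a denormalized cache more concrete, here is a minimal Python sketch of how one row of the denormalized table might map to a MongoDB document, and how the `where user_id in (...)` lookup would translate into a query filter. The column name `user_id` and the `name` field are assumptions for illustration; `user_location` comes from the question. The code only builds plain dictionaries; with a driver such as PyMongo, the filter would be passed to a collection's `find` method.

```python
from typing import Any


def row_to_document(row: dict[str, Any]) -> dict[str, Any]:
    """Turn one denormalized row into a MongoDB-style document."""
    doc = dict(row)  # copy so the input row is left untouched
    # MongoDB treats the `_id` field as the document's primary key,
    # so the (assumed) `user_id` column is moved there.
    doc["_id"] = doc.pop("user_id")
    return doc


def ids_filter(user_ids: list[int]) -> dict[str, Any]:
    """Build the filter equivalent to `where user_id in (...)`."""
    return {"_id": {"$in": list(user_ids)}}


# Hypothetical row; a real one would carry all 50-60 columns.
row = {"user_id": 42, "user_location": "NYC", "name": "example"}
document = row_to_document(row)
query = ids_filter([42, 43])
```

Using the user id as `_id` means the primary-key lookup is served by the index MongoDB always keeps on `_id`; a lookup by `user_location` would still need its own index.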

The key will be the size of your working data set and therefore your available RAM (MongoDB maps data into memory). For larger deployments, write-heavy workloads, and similar scaling issues, there are several approaches (sharding, replica sets) that can be employed.
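
A rough way to check whether the working set fits in RAM is a back-of-envelope calculation like the Python sketch below. The row count, column count, and daily update count come from the question; the average of 20 bytes per value is purely an assumption and should be replaced with a figure measured on the real data. The estimate covers raw values only and ignores indexes and per-document overhead (MongoDB stores field names inside every document), so the real footprint will be larger.

```python
ROWS = 5_000_000              # from the question
COLUMNS = 50                  # from the question (50-60 columns)
AVG_BYTES_PER_VALUE = 20      # ASSUMPTION: measure this on real data
UPDATES_PER_DAY = 1_000_000   # from the question
SECONDS_PER_DAY = 24 * 60 * 60


def estimated_size_gib(rows: int, columns: int, avg_bytes: int) -> float:
    """Raw data size in GiB, without indexes or storage overhead."""
    return rows * columns * avg_bytes / 2**30


size_gib = estimated_size_gib(ROWS, COLUMNS, AVG_BYTES_PER_VALUE)
updates_per_second = UPDATES_PER_DAY / SECONDS_PER_DAY

print(f"estimated raw data size: {size_gib:.1f} GiB")
print(f"average update rate: {updates_per_second:.1f} updates/s")
```

Under these assumptions the raw data comes to roughly 4.7 GiB and the update load averages about 12 updates per second. Real traffic is rarely spread evenly across the day, so peak rates matter more than this average.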

With the level of detail given, it is hard to say for certain that MongoDB will meet all of your requirements. But others have already built similar implementations, and based on the information given there is no reason it would not work.
