We have a huge Redis database containing about 100 million keys, which map phone numbers to hashes of data.

Once in a while all this data needs to be aggregated and saved to an SQL database. During aggregation we need to iterate over all the stored keys and look at the hash stored under each one.

Using Redis.keys is not a good option because it retrieves the whole list of keys and stores it in memory, and it takes a very long time to complete. We need something that gives back an enumerator that can be used to iterate over all the keys, like so:

redis.keys_each { |k| agg(k, redis.hgetall(k)) }

Is this even possible with Redis?

This would prevent Ruby from constructing an array of 100 million elements in memory, and would probably be way faster. Profiling shows us that using the Redis.keys command makes Ruby hog the CPU at 100%, but the Redis process seems to be idle.

I know that using keys is discouraged in favor of maintaining a set of the keys, but even if we build such a set and retrieve it with smembers, we'll have the same problem.

1 Answer

Current Redis versions cannot enumerate all keys incrementally.

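Rather than trying to extract all the keys from a live Redis instance, you could simply dump the database (bgsave) and convert the resulting dump into a JSON file, which you can then process with any Ruby tool you like.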

https://github.com/sripathikrishnan/redis-rdb-tools
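As a rough sketch of the Ruby side, assuming the JSON dump is an array with one object per Redis database, each mapping key names to values (with hashes appearing as nested objects), the aggregation loop could look like the code below. The method name `each_hash_in_dump` and the sample data are made up for illustration. Note that `JSON.parse` reads the whole document into memory, so for 100 million keys you would probably need to split the dump or use a streaming parser instead.

```ruby
require "json"

# Yields [key, fields] for every Redis hash found in a JSON dump.
# Assumption: the dump is an array with one object per database, and each
# object maps key names to values (Redis hashes become nested JSON objects).
def each_hash_in_dump(json_text)
  # Without a block, return an Enumerator so callers can chain or iterate lazily.
  return enum_for(:each_hash_in_dump, json_text) unless block_given?

  JSON.parse(json_text).each do |database|
    database.each do |key, value|
      # Skip strings, lists, sets and so on; only hashes matter here.
      next unless value.is_a?(Hash)

      yield key, value
    end
  end
end

# Hypothetical sample standing in for a real dump file.
sample = '[{"15551230001":{"calls":"3","sms":"7"},"some_list":["a","b"]}]'
each_hash_in_dump(sample) { |phone, fields| puts "#{phone}: #{fields}" }
```

In the question's terms, the block body would call `agg(phone, fields)` in place of `puts`.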

Alternatively, you can use the redis-rdb-tools API to write a parser directly in Python and extract the data you need (without generating a JSON file).

answered 2013-07-24T12:29:58.967