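If you have spare computational capacity and know in advance which keys you need, you can certainly use Riak's MapReduce, but often fetching those keys and running your processing on the client will be just as fast (and won't strain your cluster).
Some general thoughts: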
- Roll up your data into larger blocks
  - If you're worried about losing data when a client crashes while buffering it, you can always store the data as it arrives
  - Similar idea: store the data as it arrives, then retrieve it and roll it up at certain intervals
  - You can automatically expire data once you're confident it is being reliably stored in larger blocks, using either the Bitcask or Memory backends
  - The Memory backend is quite useful (RAM permitting) for any data that only needs to be stored for a limited period of time
- Related: don't be afraid to store multiple copies of your data to make reading/reporting easier later
  - Multiple chunks of time (5- and 15-minute blocks, for example)
  - Multiple report formats
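The roll-up idea above can be sketched roughly as follows. This is only an illustration: the key scheme (`<metric>:<block size>m:<block start>`), the function names, and the plain `dict` standing in for the key/value store are all assumptions, not part of any Riak client API.

```python
from collections import defaultdict
from datetime import datetime, timezone


def block_key(metric, ts, minutes):
    """Hypothetical key scheme: <metric>:<block size>m:<block start in UTC>."""
    size = minutes * 60
    start = int(ts.timestamp()) // size * size  # floor to the block boundary
    stamp = datetime.fromtimestamp(start, tz=timezone.utc).strftime("%Y%m%dT%H%M")
    return f"{metric}:{minutes}m:{stamp}"


def roll_up(metric, points, block_sizes=(5, 15)):
    """Group (timestamp, value) points into one block per key, for each block size.

    Every point is deliberately stored once per block size, i.e. multiple copies.
    """
    blocks = defaultdict(list)
    for ts, value in points:
        for minutes in block_sizes:
            blocks[block_key(metric, ts, minutes)].append((ts.isoformat(), value))
    return dict(blocks)


points = [
    (datetime(2024, 1, 1, 12, 1, tzinfo=timezone.utc), 10),
    (datetime(2024, 1, 1, 12, 4, tzinfo=timezone.utc), 12),
    (datetime(2024, 1, 1, 12, 7, tzinfo=timezone.utc), 9),
]

store = {}  # stand-in for the key/value store (assumption)
store.update(roll_up("cpu", points))
```

Here the three raw points end up in two 5-minute blocks and one 15-minute block, so a report at either granularity needs only a handful of reads.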
Having said all that, if you're doing straight key/value requests (it's ideal to always be able to compute the keys you need, rather than doing indexing or searching), Riak can support very heavy traffic loads, so I wouldn't recommend spending too much time creating alternative storage mechanisms unless you know you're going to face latency problems.