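I would create 2 buckets: news and products. Then I would prefix the keys in each bucket with the client name. I would probably also include the date in the news keys so that date-range queries are easy.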
news/acme_2011-02-23_01
news/acme_2011-02-23_02
news/bigcorp_2011-02-21_01
And optionally prefix the product name with the category name:
products/acme_blacksmithing_anvil
products/bigcorp_databases_oracle
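To keep the keys consistent, it can help to build them in one place. Here is a minimal sketch in Python; the helper names `news_key` and `product_key` are hypothetical, and the two-digit sequence number is only inferred from the example keys above. It also assumes client and category names never contain an underscore, since the underscore is the field separator.

```python
from datetime import date


def news_key(client: str, day: date, seq: int) -> str:
    """Build a news key such as 'acme_2011-02-23_01' (hypothetical helper)."""
    if "_" in client:
        # The underscore separates fields, so it must not appear inside one.
        raise ValueError("client name must not contain '_'")
    # ISO dates (YYYY-MM-DD) sort lexicographically, which keeps range checks simple.
    return f"{client}_{day.isoformat()}_{seq:02d}"


def product_key(client: str, category: str, name: str) -> str:
    """Build a product key such as 'acme_blacksmithing_anvil' (hypothetical helper)."""
    if "_" in client or "_" in category:
        raise ValueError("client and category must not contain '_'")
    return f"{client}_{category}_{name}"


print(news_key("acme", date(2011, 2, 23), 1))
print(product_key("bigcorp", "databases", "oracle"))
```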
Then in your map/reduce, you can use key filters:
// BigCorp News items
{
  "inputs": {
    "bucket": "news",
    "key_filters": [["starts_with", "bigcorp"]]
  }
  // ... rest of mapreduce job
}
// Acme Blacksmithing items
{
  "inputs": {
    "bucket": "products",
    "key_filters": [["starts_with", "acme_blacksmithing"]]
  }
  // ... rest of mapreduce job
}
// News for all clients from Feb 12th to 19th
{
  "inputs": {
    "bucket": "news",
    "key_filters": [["tokenize", "_", 2],
                    ["between", "2011-02-12", "2011-02-19"]]
  }
  // ... rest of mapreduce job
}
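If you want to sanity-check which keys a filter would select before running the job, you can emulate the filters locally. The sketch below is plain Python, not a Riak client call: the three functions only mimic what I understand `starts_with`, `tokenize` (1-based token index) and `between` (inclusive bounds) to do, and the fourth key in the list is a made-up example that falls inside the date range.

```python
def starts_with(value: str, prefix: str) -> bool:
    """Local stand-in for the starts_with predicate."""
    return value.startswith(prefix)


def tokenize(value: str, sep: str, n: int) -> str:
    """Local stand-in for the tokenize transform: return the n-th token, 1-based."""
    return value.split(sep)[n - 1]


def between(value: str, low: str, high: str) -> bool:
    """Local stand-in for the between predicate, assuming inclusive bounds."""
    # Plain string comparison works because the dates are in ISO format.
    return low <= value <= high


news_keys = [
    "acme_2011-02-23_01",
    "acme_2011-02-23_02",
    "bigcorp_2011-02-21_01",
    "acme_2011-02-15_01",  # hypothetical key, added so the range filter matches something
]

# Equivalent of [["starts_with", "bigcorp"]]
bigcorp_news = [k for k in news_keys if starts_with(k, "bigcorp")]

# Equivalent of [["tokenize", "_", 2], ["between", "2011-02-12", "2011-02-19"]]
in_range = [
    k for k in news_keys
    if between(tokenize(k, "_", 2), "2011-02-12", "2011-02-19")
]

print(bigcorp_news)
print(in_range)
```

Note that the date filter only works because the client name never contains an underscore; otherwise the date would no longer be the second token.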