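I'm new to the map-reduce concept, and although I'm making slow progress, I've run into a problem I need help with.
I have a simple collection made up of an id, a city, and a destination, like this: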
{ "_id" : "5230e7e00000000000000000", "city" : "Boston", "to" : "Chicago" },
{ "_id" : "523fe7e00000000000000000", "city" : "New York", "to" : "Miami" },
{ "_id" : "5240e1e00000000000000000", "city" : "Boston", "to" : "Miami" },
{ "_id" : "536fe4e00000000000000000", "city" : "Washington D.C.", "to" : "Boston" },
{ "_id" : "53ffe7e00000000000000000", "city" : "New York", "to" : "Boston" },
{ "_id" : "5740e1e00000000000000000", "city" : "Boston", "to" : "Miami" },
...
(Note that this data is for illustration only.)
I want to group the destinations by city, including a count for each destination:
{ "city" : "Boston", values : [{"Chicago",1}, {"Miami",2}] }
{ "city" : "New York", values : [{"Miami",1}, {"Boston",1}] }
{ "city" : "Washington D.C.", values : [{"Boston", 1}] }
To do this, I started with this map function:
function() {
    emit(this.city, this.to);
}
It performs the grouping as expected. My reduce function looks like this:
function(key, values) {
    var reduced = {"to":[]};
    for (var i in values) {
        var item = values[i];
        reduced.to.push(item);
    }
    return reduced;
}
This gives output that is mostly what I expected:
{ "_id" : ObjectId("522f8a9181f01e671a853adb"), "value" : { "to" : [ "Boston", "Miami" ] } }
{ "_id" : ObjectId("522f933a81f01e671a853ade"), "value" : { "to" : [ "Chicago", "Miami", "Miami" ] } }
{ "_id" : ObjectId("5231f0ed81f01e671a853ae0"), "value" : "Boston" }
As you can see, I'm not counting the repeated cities yet, but for some reason the last result in the output above doesn't look right. I expected it to be:
{ "_id" : ObjectId("5231f0ed81f01e671a853ae0"), "value" : { "to" : ["Boston"] } }
Is this because that key has only a single item? Is there a way to get the output I want?
Thank you.
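A likely explanation, offered as a sketch rather than a definitive answer: MongoDB does not call the reduce function for a key that has only one emitted value, so the raw string from the map function is stored unchanged. One common way around this is to have map emit values that already have the final shape, and to have reduce return that same shape. The plain-JavaScript example below emulates this without a server. The `runMapReduce` driver, the module-level `emit` variable, and the `{ to: { <destination>: <count> } }` value shape are assumptions made for illustration; they are not part of the original code or of MongoDB's API.

```javascript
// Sample documents copied from the question (ids omitted).
const docs = [
  { city: "Boston", to: "Chicago" },
  { city: "New York", to: "Miami" },
  { city: "Boston", to: "Miami" },
  { city: "Washington D.C.", to: "Boston" },
  { city: "New York", to: "Boston" },
  { city: "Boston", to: "Miami" },
];

// Assigned by the hypothetical driver below; MongoDB supplies its own emit().
let emit;

// Map: emit the value already wrapped in the final shape, so a key with a
// single value looks right even though reduce never runs for it.
function mapFn() {
  const counts = {};
  counts[this.to] = 1;
  emit(this.city, { to: counts });
}

// Reduce: merge the per-destination counts. The return value has the same
// shape as the emitted values, so reducing already-reduced values also works.
function reduceFn(key, values) {
  const reduced = { to: {} };
  for (const value of values) {
    for (const dest of Object.keys(value.to)) {
      reduced.to[dest] = (reduced.to[dest] || 0) + value.to[dest];
    }
  }
  return reduced;
}

// Hypothetical stand-in for the server: groups emitted values by key and,
// like MongoDB, skips reduce when a key has only one value.
function runMapReduce(documents, map, reduce) {
  const groups = new Map();
  emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  documents.forEach((doc) => map.call(doc));
  const results = [];
  for (const [key, values] of groups) {
    const value = values.length === 1 ? values[0] : reduce(key, values);
    results.push({ _id: key, value });
  }
  return results;
}

const results = runMapReduce(docs, mapFn, reduceFn);
console.log(JSON.stringify(results, null, 2));
```

With this shape, the Washington D.C. entry comes out as `{ "to": { "Boston": 1 } }` even though reduce is never invoked for it, and the Boston entry carries a count of 2 for Miami. If the array-of-pairs layout from the question is required, it could be produced in a finalize step, which is left out here.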