
Given this MongoDB collection:

[
  { character: 'broquaint', race: 'Halfling', class: 'Hunter' },
  { character: 'broquaint', race: 'Halfling', class: 'Hunter' },
  { character: 'broquaint', race: 'Halfling', class: 'Rogue' },
  { character: 'broquaint', race: 'Naga',     class: 'Fighter' },
  { character: 'broquaint', race: 'Naga',     class: 'Hunter' }
]

I'd like to get a count of each race and each class, i.e.

{
  race:  { 'Halfling': 3, 'Naga': 2 },
  class: { 'Hunter': 3, 'Rogue': 1, 'Fighter': 1 }
}

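I've been trying to use the aggregation framework (to replace an existing map/reduce) to do this, but have only managed to get counts for the combinations, i.e.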

{ '_id': { race: 'Halfling', class: 'Hunter' },  count: 2 }
{ '_id': { race: 'Halfling', class: 'Rogue' },   count: 1 }
{ '_id': { race: 'Naga',     class: 'Fighter' }, count: 1 }
{ '_id': { race: 'Naga',     class: 'Hunter' },  count: 1 }

That is simple enough to reduce programmatically to the desired result, but I was hoping to leave that to MongoDB.
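For illustration, the client-side reduction mentioned above might look like the following sketch. It runs in plain JavaScript against the combination counts shown earlier; `tallyByField` is a hypothetical helper name, not part of any MongoDB API.

```javascript
// Combination counts, copied from the aggregation output shown above.
const combos = [
  { _id: { race: 'Halfling', class: 'Hunter' },  count: 2 },
  { _id: { race: 'Halfling', class: 'Rogue' },   count: 1 },
  { _id: { race: 'Naga',     class: 'Fighter' }, count: 1 },
  { _id: { race: 'Naga',     class: 'Hunter' },  count: 1 },
];

// Hypothetical helper: add each combination's count to a per-field total.
function tallyByField(rows, fields) {
  const totals = {};
  for (const field of fields) totals[field] = {};
  for (const row of rows) {
    for (const field of fields) {
      const value = row._id[field];
      totals[field][value] = (totals[field][value] || 0) + row.count;
    }
  }
  return totals;
}

const totals = tallyByField(combos, ['race', 'class']);
console.log(totals);
```

This is the step I would like the aggregation framework to perform for me.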

For reference, this is the code I have so far:

db.games.aggregate([
  { '$match': { character: 'broquaint' } },
  {
    '$group': {
      // Group by each distinct (race, class) combination.
      _id:   { race: '$race', class: '$class' },
      count: { '$sum': 1 }
    }
  }
])

So the question is: given the example collection, can I arrive at my desired output purely through MongoDB's aggregation framework?

Thanks in advance for any help you can offer!


2 Answers


As of MongoDB 3.4, this can be achieved a bit more simply by using multiple aggregation pipelines with $facet.

Taken from the docs:

$facet

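Processes multiple aggregation pipelines within a single stage on the same set of input documents. Each sub-pipeline has its own field in the output document, where its results are stored as an array of documents.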
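To make that description concrete, here is a rough plain-JavaScript model of the behaviour. It is not MongoDB code: `facetModel` and `sortByCountModel` are hypothetical stand-ins that only imitate what the docs describe, using the sample documents from the question.

```javascript
// Model of $facet: every sub-pipeline receives the same input documents,
// and its result array is stored under the sub-pipeline's own field name
// in a single output document.
function facetModel(docs, subPipelines) {
  const output = {};
  for (const [name, run] of Object.entries(subPipelines)) {
    output[name] = run(docs);
  }
  return [output];
}

// Hypothetical stand-in for a { $sortByCount: '$<field>' } sub-pipeline:
// count the documents per distinct value, then sort by count, descending.
const sortByCountModel = (field) => (docs) => {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts]
    .map(([_id, count]) => ({ _id, count }))
    .sort((a, b) => b.count - a.count);
};

// Sample documents from the question.
const docs = [
  { character: 'broquaint', race: 'Halfling', class: 'Hunter' },
  { character: 'broquaint', race: 'Halfling', class: 'Hunter' },
  { character: 'broquaint', race: 'Halfling', class: 'Rogue' },
  { character: 'broquaint', race: 'Naga',     class: 'Fighter' },
  { character: 'broquaint', race: 'Naga',     class: 'Hunter' },
];

const result = facetModel(docs, {
  race: sortByCountModel('race'),
  class: sortByCountModel('class'),
});
console.log(JSON.stringify(result, null, 2));
```

The point of the model is only that both counts come from one pass over the same matched documents, each landing in its own array.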

So, for your use case, this can be achieved as follows:

const aggregatorOpts = [
    { $match: { character: 'broquaint' } }, // Match the character
    {
        // Separate into 2 or more sub-pipelines that will count class and
        // race separately
        $facet: {
            race: [
                // Group by race and get the count:
                // [
                //   {
                //     _id: 'Halfling',
                //     count: 3
                //   },
                //   {
                //     _id: 'Naga',
                //     count: 2
                //   }
                // ]

                // $sortByCount is the same as
                // { $group: { _id: <expression>, count: { $sum: 1 } } },
                // { $sort: { count: -1 } }

                { $sortByCount: '$race' },

                // Now we want to transform the array into a single document,
                // where the '_id' field is the key and the 'count' is the value.
                // To achieve this we will use $arrayToObject. According to the
                // docs, we first have to rename the fields to 'k' for the key
                // and 'v' for the value. We use $project for this:
                {
                    $project: {
                        _id: 0,
                        k: '$_id',
                        v: '$count',
                    },
                },
            ],
            // Same as above but for class instead
            class: [
                { $sortByCount: '$class' },
                {
                    $project: {
                        _id: 0,
                        k: '$_id',
                        v: '$count',
                    },
                },
            ],
        },
    },
    {
        // Now apply the $arrayToObject for both class and race.
        $addFields: {
            // Will override the existing class and race arrays
            // with their respective object representation instead.
            class: { $arrayToObject: '$class' },
            race: { $arrayToObject: '$race' },
        },
    },
];

db.races.aggregate(aggregatorOpts)

This produces the following:

[
  {
    "race": {
      "Halfling": 3,
      "Naga": 2
    },
    "class": {
      "Hunter": 3,
      "Rogue": 1,
      "Fighter": 1
    }
  }
]

If you are happy with the output format provided by @Asya, you can remove the $project and $addFields stages and leave just the $sortByCount part in each sub-pipeline.
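A minimal sketch of that reduced pipeline, assuming the same collection and fields as above (the variable name `simplifiedPipeline` is only for illustration):

```javascript
// Same $match and $facet as before, but each sub-pipeline stops after
// $sortByCount, so no k/v renaming and no $arrayToObject step is needed.
const simplifiedPipeline = [
  { $match: { character: 'broquaint' } },
  {
    $facet: {
      race: [{ $sortByCount: '$race' }],
      class: [{ $sortByCount: '$class' }],
    },
  },
];

console.log(JSON.stringify(simplifiedPipeline, null, 2));
```

In the shell this would be run as db.races.aggregate(simplifiedPipeline). Each field of the single result document should then hold an array of { _id, count } documents rather than a key/value object, so the field names differ from those in @Asya's output even though the counts are the same.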

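With these new features, the aggregation is much easier to extend with additional counts: just add another aggregation pipeline in $facet. Counting subgroups is even a bit easier, but that would be a separate question.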
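As a sketch of what "just add another pipeline" could look like, the pipeline can be generated from a list of field names. The helpers `countFacet` and `buildCountPipeline` are hypothetical names introduced here, not MongoDB functions, and the sketch assumes every counted field holds a string, since $arrayToObject needs string keys.

```javascript
// Hypothetical helper: sub-pipeline that counts one field and renames the
// result to the k/v shape expected by $arrayToObject.
function countFacet(field) {
  return [
    { $sortByCount: `$${field}` },
    { $project: { _id: 0, k: '$_id', v: '$count' } },
  ];
}

// Hypothetical helper: build the full pipeline for any list of fields.
function buildCountPipeline(match, fields) {
  const facets = {};
  const asObjects = {};
  for (const field of fields) {
    facets[field] = countFacet(field);
    asObjects[field] = { $arrayToObject: `$${field}` };
  }
  return [{ $match: match }, { $facet: facets }, { $addFields: asObjects }];
}

const pipeline = buildCountPipeline({ character: 'broquaint' }, ['race', 'class']);
console.log(JSON.stringify(pipeline, null, 2));
```

Counting a further field would then only mean appending its name to the array, after which the pipeline can be passed to db.races.aggregate(pipeline) as before.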

answered 2017-12-03T13:41:32.137

Yes, you can do this with the aggregation framework. It won't be pretty, but it will still be much faster than doing it with mapreduce...

Here it is in a nutshell (the output is in a different format from the one you gave, but has the same content):

> group1 = {
    "$group" : {
        "_id" : "$race",
        "class" : {
            "$push" : "$class"
        },
        "count" : {
            "$sum" : 1
        }
    }
};
> unwind = { "$unwind" : "$class" };
> group2 = {
    "$group" : {
        "_id" : "$class",
        "classCount" : {
            "$sum" : 1
        },
        "races" : {
            "$push" : {
                "race" : "$_id",
                "raceCount" : "$count"
            }
        }
    }
};
> unwind2 = { "$unwind" : "$races" };
> group3 = {
    "$group" : {
        "_id" : 1,
        "classes" : {
            "$addToSet" : {
                "class" : "$_id",
                "classCount" : "$classCount"
            }
        },
        "races" : {
            "$addToSet" : "$races"
        }
    }
};
> db.races.aggregate(group1, unwind, group2, unwind2, group3);
{
    "result" : [
        {
            "_id" : 1,
            "classes" : [
                {
                    "class" : "Fighter",
                    "classCount" : 1
                },
                {
                    "class" : "Hunter",
                    "classCount" : 3
                },
                {
                    "class" : "Rogue",
                    "classCount" : 1
                }
            ],
            "races" : [
                {
                    "race" : "Naga",
                    "raceCount" : 2
                },
                {
                    "race" : "Halfling",
                    "raceCount" : 3
                }
            ]
        }
    ],
    "ok" : 1
}
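If the exact shape from the question is needed, the result document above can be reshaped on the client. The following is a small sketch in plain JavaScript; `toCountMap` is a hypothetical helper, and the input is simply the result document copied from the output above.

```javascript
// The single result document, copied from the aggregation output above.
const doc = {
  _id: 1,
  classes: [
    { class: 'Fighter', classCount: 1 },
    { class: 'Hunter', classCount: 3 },
    { class: 'Rogue', classCount: 1 },
  ],
  races: [
    { race: 'Naga', raceCount: 2 },
    { race: 'Halfling', raceCount: 3 },
  ],
};

// Hypothetical helper: turn [{ <keyField>: k, <countField>: n }, ...]
// into a plain object of the form { k: n, ... }.
function toCountMap(items, keyField, countField) {
  return Object.fromEntries(
    items.map((item) => [item[keyField], item[countField]])
  );
}

const reshaped = {
  race: toCountMap(doc.races, 'race', 'raceCount'),
  class: toCountMap(doc.classes, 'class', 'classCount'),
};
console.log(reshaped);
```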
answered 2013-05-20T04:42:30.193