mongodb - Aggregation with update in MongoDB

Question

I have a collection with many similarly structured documents. Two of them look like this:

Input:

{ 
    "_id": ObjectId("525c22348771ebd7b179add8"), 
    "cust_id": "A1234", 
    "score": 500, 
    "status": "A",
    "clear": "No"
}

{ 
    "_id": ObjectId("525c22348771ebd7b179add9"), 
    "cust_id": "A1234", 
    "score": 1600, 
    "status": "B",
    "clear": "No"
}

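By default, clear is "No" for all documents.

Requirement: I have to add up the scores of all documents with the same cust_id, provided they have status "A" or status "B". If the total score exceeds 2000, I have to set clear to "Yes" for all documents with that cust_id.

Expected output: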

{ 
    "_id": ObjectId("525c22348771ebd7b179add8"), 
    "cust_id": "A1234", 
    "score": 500, 
    "status": "A",
    "clear": "Yes"
}

{
    "_id": ObjectId("525c22348771ebd7b179add9"), 
    "cust_id": "A1234", 
    "score": 1600, 
    "status": "B",
    "clear": "Yes"
}

clear is "Yes" because 1600 + 500 = 2100, and 2100 > 2000.

My approach: I could only get the sum with the aggregation framework, but I failed at the update.
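
The rule can be checked by hand in plain JavaScript, without a database. This is only a sketch of the intended logic; the helper name customersToClear is made up for illustration.

```javascript
// Hypothetical helper: returns the cust_ids whose summed score over
// status "A"/"B" documents exceeds the threshold.
function customersToClear(docs, threshold) {
  const totals = new Map();
  for (const doc of docs) {
    if (doc.status === "A" || doc.status === "B") {
      totals.set(doc.cust_id, (totals.get(doc.cust_id) || 0) + doc.score);
    }
  }
  return [...totals]
    .filter(([, total]) => total > threshold)
    .map(([custId]) => custId);
}

// The two sample documents from the question (without _id).
const docs = [
  { cust_id: "A1234", score: 500, status: "A", clear: "No" },
  { cust_id: "A1234", score: 1600, status: "B", clear: "No" }
];

const toClear = customersToClear(docs, 2000);
console.log(toClear); // [ 'A1234' ]
```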

db.aggregation.aggregate([
    {$match: {
        $or: [
            {status: 'A'},
            {status: 'B'}
        ]
    }},
    {$group: {
        _id: '$cust_id',
        total: {$sum: '$score'}
    }},
    {$match: {
        total: {$gt: 2000}
    }}
])

Please suggest how I should proceed.

score 16 · Accepted Answer

After a lot of trial and error in the mongo shell, I finally found a solution to my problem.

Pseudocode:

// Get the list of customers whose total score is greater than 2000
var cust_to_clear = db.col.aggregate([
    {$match: {$or: [{status: 'A'}, {status: 'B'}]}},
    {$group: {_id: '$cust_id', total: {$sum: '$score'}}},
    {$match: {total: {$gt: 2000}}}
]);

// Loop through the result fetched above and update clear.
// aggregate() returns a cursor in MongoDB 2.6+; older shells returned a
// document, in which case iterate cust_to_clear.result instead.
cust_to_clear.forEach(function (x) {
    db.col.update({cust_id: x._id}, {$set: {clear: 'Yes'}}, {multi: true});
});

If you have a different solution to the same problem, please leave a comment.
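
The loop issues one update per customer. A possible variation, sketched below on the assumption that aggregate() returns a cursor (MongoDB 2.6 or later), collects the ids first and issues a single multi-document update with $in. The helper buildClearUpdate is hypothetical and only builds the filter and update documents; the commented lines show how it might be used in the shell.

```javascript
// Hypothetical helper: builds the arguments for one multi-document update
// from the grouped aggregation results ({_id: <cust_id>, total: <sum>}).
function buildClearUpdate(groups) {
  const ids = groups.map(function (g) { return g._id; });
  return {
    filter: { cust_id: { $in: ids } },
    update: { $set: { clear: "Yes" } }
  };
}

// Example with the group the pipeline would produce for customer A1234.
const op = buildClearUpdate([{ _id: "A1234", total: 2100 }]);
console.log(JSON.stringify(op.filter)); // {"cust_id":{"$in":["A1234"]}}

// In the shell (assumed usage, not run here):
//   var groups = db.col.aggregate(pipeline).toArray();
//   var op = buildClearUpdate(groups);
//   db.col.update(op.filter, op.update, {multi: true});
```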

score 8 · Accepted Answer

In Mongo 4.2 it is now possible to do this using update with aggregation pipeline. Example 2 in the documentation shows how to do a conditional update:

db.runCommand(
   {
      update: "students",
      updates: [
         {
           q: { },
           u: [
                 { $set: { average : { $avg: "$tests" } } },
                 { $set: { grade: { $switch: {
                                       branches: [
                                           { case: { $gte: [ "$average", 90 ] }, then: "A" },
                                           { case: { $gte: [ "$average", 80 ] }, then: "B" },
                                           { case: { $gte: [ "$average", 70 ] }, then: "C" },
                                           { case: { $gte: [ "$average", 60 ] }, then: "D" }
                                       ],
                                       default: "F"
                 } } } }
           ],
           multi: true
         }
      ],
      ordered: false,
      writeConcern: { w: "majority", wtimeout: 5000 }
   }
)

Another example:

db.c.update({}, [
  {$set:{a:{$cond:{
    if: {},    // some condition
    then: {},  // val1
    else: {}   // val2 or "$$REMOVE" to not set the field or "$a" to leave existing value
  }}}}
]);

score 6 · Accepted Answer

You need to do this in two steps:

1. Identify the customers (cust_id) with a total score greater than 2000
2. For each of these customers, set clear to Yes

You already have a good solution for the first part. The second part should be implemented as a separate update() call to the database.

Pseudocode:

// Get the list of customers using the aggregation framework
var cust_to_clear = db.col.aggregate([
    {$match: {$or: [{status: 'A'}, {status: 'B'}]}},
    {$group: {_id: '$cust_id', total: {$sum: '$score'}}},
    {$match: {total: {$gt: 2000}}}
]);

// Loop over the customers and set "clear" to "Yes".
// $group put the cust_id into _id, so filter on cust_id, not on _id.
cust_to_clear.forEach(function (customer) {
    db.col.update(
        {"cust_id": customer._id},
        {"$set": {"clear": "Yes"}},
        {"multi": true}
    );
});

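This is not ideal, because you have to make a database call for every customer. If you need to do this kind of operation often, you might revise your schema to include the total score in each document. (This would have to be maintained by your application.) In that case, you could do the update with a single command: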

db.col.update(
    {"total_score": {"$gt": 2000}},
    {"$set": {"clear": "Yes"}},
    {"multi": true}
    )

score 3 · Accepted Answer

In MongoDB 2.6, it is possible to write the output of an aggregation query with the same command.

More information here: http://docs.mongodb.org/master/reference/operator/aggregation/out/

score 3 · Accepted Answer

Short answer: to avoid looping over database queries, just add $merge to the end and specify your collection, like so:

db.aggregation.aggregate([
    {$match: {
        $or: [
            {status: 'A'},
            {status: 'B'}
        ]
    }},
    {$group: {
        _id: '$cust_id',
        total: {$sum: '$score'}
    }},
    {$match: {
        total: {$gt: 2000}
    }},
    { $merge: "<collection name here>"}
])

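Detailed explanation: the current solution loops over database queries, which is not time efficient and also takes more code. Mitar's answer does not update through aggregation but does the opposite => it uses aggregation inside Mongo's update. If you are wondering what the advantage of doing it this way is: you can use the whole aggregation pipeline, not just the few stages specified in the documentation.

Here is an example of an aggregation that won't work with Mongo's update: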

db.getCollection('foo').aggregate([
  { $addFields: {
      testField: {
        $in: [ "someValueInArray", '$arrayFieldInFoo']
      } 
  }},
  { $merge : "foo" }]
)

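This outputs the updated collection with a new test field that is true if "someValueInArray" is in "arrayFieldInFoo" and false otherwise. This is currently not possible with Mongo.update, because $in cannot be used inside an update aggregation.

Update: changed from $out to $merge, since $out only works when updating the entire collection, because $out replaces the entire collection with the result of the aggregation. $merge only overwrites when the aggregation matches a document (much safer).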
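
One thing to keep in mind with the short answer: after $group, each result document has _id set to the cust_id, so a $merge on the default _id would not match the original documents. A possible way around this is to keep the original documents and attach the per-customer total to each one before merging. The sketch below assumes MongoDB 5.0 or later (for $setWindowFields), a server that allows $merge into the collection being aggregated, and a collection named col. The function name buildClearPipeline is made up, and the stage options should be verified against the documentation for your server version.

```javascript
// Hypothetical helper: builds a pipeline that sets clear to "Yes" on every
// document of a customer whose status A/B scores sum to more than threshold.
function buildClearPipeline(collectionName, threshold) {
  return [
    // Attach the per-customer total to every document in the partition;
    // documents with another status contribute 0 to the sum.
    { $setWindowFields: {
        partitionBy: "$cust_id",
        output: {
          total: {
            $sum: { $cond: [{ $in: ["$status", ["A", "B"]] }, "$score", 0] },
            window: { documents: ["unbounded", "unbounded"] }
          }
        }
    } },
    { $match: { total: { $gt: threshold } } },
    { $set: { clear: "Yes" } },
    { $unset: "total" },
    // Match on the original _id and never insert new documents.
    { $merge: {
        into: collectionName,
        on: "_id",
        whenMatched: "merge",
        whenNotMatched: "discard"
    } }
  ];
}

const pipeline = buildClearPipeline("col", 2000);
console.log(pipeline.length); // 5

// In the shell (assumed usage, not run here):
//   db.col.aggregate(pipeline);
```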
