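I'm running into performance problems when using the MongoDB aggregation framework from C#. Aggregations that run quickly in the Mongo shell take forever when executed from C#.
Before trying to call the framework from C#, I ran the following aggregation in the Mongo shell to check that everything worked: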
db.runCommand(
    {
        aggregate: "actions", 
        pipeline : 
        [
            { $match : { CustomerAppId : "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName : "install"}}, 
            { $group : { _id : { CustomerAppId:"$CustomerAppId",ActionDate:"$ActionDate" }, count : { $sum : 1 } }}
        ]
    });
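The script executes in < 500 ms and returns the expected result set of roughly 200 documents. (CustomerAppId is stored as a string in the database, because GUIDs can't be used with the aggregation framework.)
I then ported the same script to C#: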
var pipeline = new BsonArray
        {
            new BsonDocument
                {
                    {
                        "$match", 
                        new BsonDocument
                            {
                                {"CustomerAppId", "f5357224-b1a8-4f1a-8ea2-a06a00ca597a"},
                                {"ActionName", "install"}
                            }
                    },
                    { "$group", 
                        new BsonDocument
                            {
                                { "_id", new BsonDocument
                                             {
                                                 {
                                                     "CustomerAppId","$CustomerAppId"
                                                 },
                                                 {
                                                     "ActionName","$ActionName"
                                                 }
                                             } 
                                },
                                {
                                    "Count", new BsonDocument
                                                 {
                                                     {
                                                         "$sum", 1
                                                     }
                                                 }
                                }
                            } 
                  }
            }
        };
var command = new CommandDocument
{
    { "aggregate", "actions" },
    { "pipeline", pipeline }
};
(If there's a simpler way to write aggregations in C#, please let me know :))
I execute it like this:
var result = db.RunCommand(command);
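The problem is that this kills the server: CPU and memory usage climb. When I check db.currentOp() I can see the aggregation operation, but I eventually have to terminate it with db.killOp(1281546):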
"opid" : 1281546,
"active" : true,
"secs_running" : 294,
"op" : "query",
"ns" : "database.actions",
"query" : {
        "aggregate" : "actions",
        "pipeline" : [
                {
                        "$match" : {
                                "CustomerAppId" : "f5357224-b1a8-4f1a-8ea2-a06a00ca597a",
                                "ActionName" : "install"
                        },
                        "$group" : {
                                "_id" : {
                                        "CustomerAppId" : "$CustomerAppId",
                                        "ActionName" : "$ActionName"
                                },
                                "Count" : {
                                        "$sum" : 1
                                }
                        }
                }
        ]
},
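To me, the operation looks fine and resembles the script I ran directly in the mongo shell. It feels as though running the aggregation from C# makes MongoDB miss the index, so that it does a table scan over all ~6 million documents in the collection.
Any ideas?
Update: logs
Following cirrus's suggestion, I enabled verbose logging and used tail to capture the queries. They are different! So I think something is wrong with my C# port. Any ideas on how to format the query correctly?
The query when executed from the shell: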
Mon Oct  8 15:00:13 [conn1] run command database.$cmd { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" } }, { $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, count: { $sum: 1.0 } } } ] }
Mon Oct  8 15:00:13 [conn1] command database.$cmd command: { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" } }, { $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, count: { $sum: 1.0 } } } ] } ntoreturn:1 keyUpdates:0 locks(micros) r:27944 reslen:12705 29ms
And the query when executed from C#:
Mon Oct  8 15:00:16 [conn8] run command database.$cmd { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" }, $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, Count: { $sum: 1 } } } ] }
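The second log line is missing, I assume because the query never completed.
For easier comparison, here are the "run command" lines again, shell first, then C#: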
Mon Oct  8 15:00:13 [conn1] run command database.$cmd { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" } }, { $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, count: { $sum: 1.0 } } } ] }
Mon Oct  8 15:00:16 [conn8] run command database.$cmd { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" }, $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, Count: { $sum: 1 } } } ] }
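To make the difference between the two logged commands easier to see, here is a minimal sketch in plain JavaScript (no MongoDB connection needed). It copies the two pipelines exactly as they appear in the log lines above and lists the operator keys in each array element. `describeStages` is a hypothetical helper written only for this comparison, not part of any driver or of the shell. Under those assumptions, the shell pipeline has two elements with one operator each, while the C# pipeline has a single element carrying both `$match` and `$group`, which may be why the server handles the two commands differently.

```javascript
// Pipeline as logged for the shell run: two separate stage documents.
const shellPipeline = [
  { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" } },
  { $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, count: { $sum: 1 } } }
];

// Pipeline as logged for the C# run: one document holding both operators.
const csharpPipeline = [
  {
    $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" },
    $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, Count: { $sum: 1 } }
  }
];

// Hypothetical helper: return the operator keys found in each array element.
function describeStages(pipeline) {
  return pipeline.map((stage) => Object.keys(stage));
}

console.log("shell:", JSON.stringify(describeStages(shellPipeline)));
console.log("C#:   ", JSON.stringify(describeStages(csharpPipeline)));
```

If this reading is right, the C# port would need each operator in its own BsonDocument inside the BsonArray, mirroring the two-element array the shell sends.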