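I have a replica set of 3 MongoDB instances. Each instance has 8GB of RAM and a dual-core 2.27 GHz CPU. All instances run version 2.2.2 (I saw the same behavior on 2.0.1).
Here is my problem: our master (the primary of the replica set) has recently gotten into the habit of climbing to 100% CPU about every 2 days. To track down the cause, I ran the MongoDB profiler and found hundreds of very slow queries. Here is one example: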
> db.system.profile.find()
{
"ts" : ISODate("2012-12-16T20:31:39.078Z"),
"op" : "command",
"ns" : "stylesaint.$cmd",
"command" : {
"count" : "tears",
"query" : {
"_id" : { "$gt" : ObjectId("50cdeadeaf58d3de96000294") },
"active" : true,
"is_image_processed" : true,
"hidden_from_feed" : false,
"hidden_from_public_feeds" : false
},
"fields" : null
},
"ntoreturn" : 1,
"responseLength" : 48,
"millis" : 13930,
"client" : "#########"
}
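To see which operations dominate, the profiler entries can be grouped by operation and target collection. Below is a minimal sketch of that idea, not something I have run against production. The function name `summarizeProfile` is my own, and it assumes entries shaped like the example above (`op`, `ns`, `millis`, and for count commands the collection name under `command.count`).

```javascript
// Group profiler documents by operation and target, totalling their time.
// `entries` is assumed to be an array of documents shaped like system.profile rows.
function summarizeProfile(entries) {
  var groups = {};
  entries.forEach(function (e) {
    var target = e.ns;
    // For a count command the namespace is "<db>.$cmd"; the collection
    // name is stored in e.command.count, so substitute it in.
    if (e.op === "command" && e.command && typeof e.command.count === "string") {
      target = e.ns.replace("$cmd", e.command.count) + " (count)";
    }
    var key = e.op + " " + target;
    if (!groups[key]) {
      groups[key] = { key: key, calls: 0, totalMillis: 0, maxMillis: 0 };
    }
    var g = groups[key];
    g.calls += 1;
    g.totalMillis += e.millis;
    if (e.millis > g.maxMillis) {
      g.maxMillis = e.millis;
    }
  });

  // Flatten to an array and sort so the most expensive group comes first.
  var rows = [];
  for (var k in groups) {
    if (Object.prototype.hasOwnProperty.call(groups, k)) {
      rows.push(groups[k]);
    }
  }
  rows.sort(function (a, b) { return b.totalMillis - a.totalMillis; });
  return rows;
}

// In the mongo shell (needs a live server; the 100 ms threshold is arbitrary):
//   summarizeProfile(db.system.profile.find({ millis: { $gt: 100 } }).toArray())
//     .forEach(printjson);
```

If one group accounts for most of the total time, that would suggest the count command itself is expensive rather than the server being starved by unrelated work.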
From what I know about MongoDB, the natural next step in these cases is to run explain() on the query. However, explain() does not account for the slowness:
> db.tears.find({ "_id" : { "$gt" : ObjectId("50cdeadeaf58d3de96000294") }, "active" : true, "is_image_processed" : true, "hidden_from_feed" : false, "hidden_from_public_feeds" : false }).explain()
{
"cursor" : "BtreeCursor _id_",
"isMultiKey" : false,
"n" : 4,
"nscannedObjects" : 5,
"nscanned" : 5,
"nscannedObjectsAllPlans" : 23,
"nscannedAllPlans" : 25,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"_id" : [
[
ObjectId("50cdeadeaf58d3de96000294"),
ObjectId("ffffffffffffffffffffffff")
]
]
},
"server" : "#########"
}
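Scanning 5 documents should not take 13 seconds. Something else must be slowing the query down. Perhaps some other queries are exhausting the server's resources? However, I don't know where to look. Any advice is appreciated.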
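One thing that may be worth capturing the next time the CPU spikes is the list of in-progress operations. The following is only a sketch under assumptions: `longRunningOps` is a name I made up, and it expects an array shaped like the `inprog` field returned by `db.currentOp()`, with `active`, `secs_running`, `opid`, `op`, `ns` and `query` on each entry.

```javascript
// Return in-progress operations that have been running for at least minSecs,
// longest first. `inprog` is assumed to look like db.currentOp().inprog.
function longRunningOps(inprog, minSecs) {
  return inprog
    .filter(function (op) {
      // Skip idle entries and entries that do not report a running time.
      return op.active &&
        typeof op.secs_running === "number" &&
        op.secs_running >= minSecs;
    })
    .sort(function (a, b) { return b.secs_running - a.secs_running; })
    .map(function (op) {
      // Keep only the fields needed to identify the operation.
      return {
        opid: op.opid,
        secs: op.secs_running,
        op: op.op,
        ns: op.ns,
        query: op.query
      };
    });
}

// In the mongo shell (needs a live server; the 5 second threshold is arbitrary):
//   longRunningOps(db.currentOp().inprog, 5).forEach(printjson);
```

Comparing this list with the profiler output should show whether the slow counts coincide with other long-running work.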
MongoDB logs
I can't find any warnings during startup:
***** SERVER RESTARTED *****
Sun Dec 16 21:02:56 [initandlisten] MongoDB starting : pid=...
Sun Dec 16 21:02:56 [initandlisten] db version v2.2.2, pdfile version 4.5
Sun Dec 16 21:02:56 [initandlisten] git version: ...
Sun Dec 16 21:02:56 [initandlisten] build info: Linux 2.6.21.7-2 ...
Sun Dec 16 21:02:56 [initandlisten] options: { config: "/etc/mongodb.conf", dbpath: "/data/mongodb", logappend: "true", logpath: "/var/log/mongodb/mongodb.log", replSet: "...", rest: "true" }
Sun Dec 16 21:02:56 [initandlisten] journal dir=/data/mongodb/journal
Sun Dec 16 21:02:56 [initandlisten] recover : no journal files present, no recovery needed
Sun Dec 16 21:02:56 [initandlisten] waiting for connections on port ...
Sun Dec 16 21:02:56 [websvr] admin web console waiting for connections on port ...
Sun Dec 16 21:02:56 [initandlisten] connection accepted from ...
Sun Dec 16 21:02:56 [conn1] end connection ... (0 connections now open)
Sun Dec 16 21:02:56 [initandlisten] connection accepted from ... #2 (1 connection now open)
Sun Dec 16 21:02:56 [rsStart] replSet I am ...
Sun Dec 16 21:02:56 [rsStart] replSet STARTUP2
Sun Dec 16 21:02:56 [rsHealthPoll] replSet member ... is up
Sun Dec 16 21:02:56 [rsHealthPoll] replSet member ... is now in state SECONDARY
Sun Dec 16 21:02:57 [initandlisten] connection accepted from ... #3 (2 connections now open)
Sun Dec 16 21:02:57 [rsSync] replSet SECONDARY
Sun Dec 16 21:02:58 [initandlisten] connection accepted from ... #4 (3 connections now open)
Sun Dec 16 21:02:58 [initandlisten] connection accepted from ... #5 (4 connections now open)
Sun Dec 16 21:02:58 [conn5] end connection ... (3 connections now open)
Sun Dec 16 21:02:58 [rsHealthPoll] replSet member ... is up
Sun Dec 16 21:02:58 [rsHealthPoll] replSet member ... is now in state PRIMARY
Sun Dec 16 21:02:59 [initandlisten] connection accepted from ... #6 (4 connections now open)
Sun Dec 16 21:03:00 [initandlisten] connection accepted from ... #7 (5 connections now open)
Sun Dec 16 21:03:02 [conn7] end connection ... (4 connections now open)
Sun Dec 16 21:03:03 [rsBackgroundSync] replSet syncing to: ...
Sun Dec 16 21:03:04 [rsSyncNotifier] replset setting oplog notifier to ...
Sun Dec 16 21:03:06 [conn2] end connection ... (3 connections now open)
Sun Dec 16 21:03:06 [initandlisten] connection accepted from ... #8 (4 connections now open)
Sun Dec 16 21:03:08 [initandlisten] connection accepted from ... #9 (5 connections now open)
Sun Dec 16 21:03:13 [initandlisten] connection accepted from ... #10 (6 connections now open)
Sun Dec 16 21:03:13 [conn10] end connection ... (5 connections now open)
Sun Dec 16 21:03:13 [initandlisten] connection accepted from ... #11 (6 connections now open)
Sun Dec 16 21:03:15 [conn3] end connection ... (5 connections now open)
Sun Dec 16 21:03:16 [rsHealthPoll] replSet member .... is now in state SECONDARY
Sun Dec 16 21:03:16 [rsMgr] replSet info electSelf 1
Sun Dec 16 21:03:16 [rsMgr] replSet PRIMARY
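Re: request for more information
At the moment, MongoDB is running normally; no queries take longer than 100 ms. As soon as the 100% CPU happens again, I will post more information about system resources.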