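I have the following procedure for a table of about 200 million records that is indexed on rev_timestamp. It runs very fast, but during the loop it pauses for 10-50 seconds roughly every 10 seconds. Why is that?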
library(rmongodb)
# Assumes `mongo` is an already open connection, e.g. mongo <- mongo.create()

# Build the query: 20060101000000 <= rev_timestamp <= 20060201000000
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.object(buf, "rev_timestamp")
mongo.bson.buffer.append(buf, "$lte", 20060201000000)
mongo.bson.buffer.append(buf, "$gte", 20060101000000)
mongo.bson.buffer.finish.object(buf)
query <- mongo.bson.from.buffer(buf)
ns <- "enwiki.revision"

# Count the matches first so the result vectors can be preallocated
no <- mongo.count(mongo, ns, query)
# Fetch only the two fields that are needed
cursor <- mongo.find(mongo, ns, query, fields = list(rev_user = 1L, rev_user_text = 1L))

# Convert to table
rev_user <- vector("integer", no)
rev_user_text <- vector("character", no)
i <- 1
while (mongo.cursor.next(cursor)) {
  b <- mongo.cursor.value(cursor)
  rev_user[i] <- mongo.bson.value(b, "rev_user")
  rev_user_text[i] <- mongo.bson.value(b, "rev_user_text")
  i <- i + 1
  cat(i, "\n")  # progress output
}
totalusers <- as.data.frame(list(user = rev_user, user_text = rev_user_text))
The log shows that the find completed:
Sat Jun 8 23:26:32.471 [initandlisten] connection accepted from 127.0.0.1:37097 #65 (3 connections now open)
Sat Jun 8 23:26:32.821 [conn65] command enwiki.$cmd command: { count: "revision", query: { rev_timestamp: { $lte: 20060201000000.0, $gte: 20060101000000.0 } } } ntoreturn:1 keyUpdates:0 numYields: 6 locks(micros) r:690301 reslen:48 348ms
Sat Jun 8 23:34:59.163 [conn69] getmore enwiki.revision query: { rev_timestamp: { $lte: 20060201000000.0, $gte: 20060101000000.0 } } cursorid:166545066544259763 ntoreturn:0 keyUpdates:0 numYields: 5 locks(micros) r:565614 nreturned:63397 reslen:4194315 325ms
Sat Jun 8 23:35:25.209 [conn65] getmore enwiki.revision query: { rev_timestamp: { $lte: 20060201000000.0, $gte: 20060101000000.0 } } cursorid:164394032580746079 ntoreturn:0 keyUpdates:0 numYields: 17411 locks(micros) r:88392121 nreturned:63496 reslen:4194290 261829ms
Sat Jun 8 23:35:25.209 [conn69] getmore enwiki.revision query: { rev_timestamp: { $lte: 20060201000000.0, $gte: 20060101000000.0 } } cursorid:166545066544259763 ntoreturn:0 keyUpdates:0 numYields: 824 locks(micros) r:23376882 nreturned:63496 reslen:4194290 20979ms
Sat Jun 8 23:35:25.210 [conn65] SocketException handling request, closing client connection: 9001 socket exception [2] server [127.0.0.1:37097]
Sat Jun 8 23:36:01.980 [initandlisten] connection accepted from 127.0.0.1:39182 #70 (2 connections now open)
Sat Jun 8 23:36:50.724 [conn70] end connection 127.0.0.1:39182 (1 connection now open)
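To make the three getmore lines above easier to compare, here is a small sketch in plain R that takes nreturned, reslen and the duration exactly as they appear in the log and derives bytes per document and documents per second. The data frame and column names are made up for illustration. Reading the results as "equally sized batches of about 4 MB, served at very different speeds" is my interpretation of the numbers, not something the log states directly.

```r
# Values copied from the three getmore log lines (conn69, conn65, conn69).
getmore <- data.frame(
  conn      = c("conn69", "conn65", "conn69"),
  nreturned = c(63397, 63496, 63496),        # documents per batch
  reslen    = c(4194315, 4194290, 4194290),  # response size in bytes
  millis    = c(325, 261829, 20979)          # duration reported by the server
)

# Average size of one returned document in each batch
getmore$bytes_per_doc <- getmore$reslen / getmore$nreturned

# Server-side throughput of each getmore
getmore$docs_per_sec <- getmore$nreturned / (getmore$millis / 1000)

print(getmore)
```

If that reading is right, the batches are practically identical in size, so the difference between 325 ms and 261829 ms would come from how long the server needs to produce a batch rather than from how much data is returned.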
But the script is stuck at item 102, and the server shows the operation as still in progress:
> db.currentOp()
{
"inprog" : [
{
"opid" : 221329005,
"active" : true,
"secs_running" : 156,
"op" : "getmore",
"ns" : "enwiki.revision",
"query" : {
"rev_timestamp" : {
"$lte" : 20060201000000,
"$gte" : 20060101000000
}
},
"client" : "127.0.0.1:37097",
"desc" : "conn65",
"threadId" : "0x7f63e219d700",
"connectionId" : 65,
"waitingForLock" : false,
"numYields" : 7268,
"lockStats" : {
"timeLockedMicros" : {
"r" : NumberLong(105797470),
"w" : NumberLong(0)
},
"timeAcquiringMicros" : {
"r" : NumberLong(156781253),
"w" : NumberLong(0)
}
}
}
]
}
>
Indexes:
> db.revision.getIndexes()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "enwiki.revision",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"rev_timestamp" : 1
},
"ns" : "enwiki.revision",
"name" : "rev_timestamp_1"
},
{
"v" : 1,
"key" : {
"rev_timestamp" : 1,
"rev_user" : 1
},
"ns" : "enwiki.revision",
"name" : "rev_timestamp_1_rev_user_1"
}
]
>
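Is this normal behavior? As far as I understand, retrieving the values through a cursor should be fast; only the initial find should take a long time.
mongostat output: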
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn time
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 92 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:06
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 58 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:07
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 65 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:08
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 63 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:10
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 84 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:11
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 93 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:12
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 84 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:13
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 64 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:14
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 85 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:15
*0 *0 *0 *0 0 1|0 0 86g 172g 3.76g 90 enwiki:0.0% 0 0|0 1|0 62b 2k 2 00:20:16