mongodb - MongoDB query takes a long time

Question

I have a collection of vectors. The size of the collection

print vectors.count()

is

102020

When I iterate over the field

import time

start = time.time()
for v in vectors.find({}, {'vector': 1, '_id': 0}):
    pass
end = time.time()  # stop the clock once the cursor is exhausted
print "total time:", end - start

total time: 5.05100011826

But when I run the same query with explain(), I see that it takes substantially less time.

print vectors.find({},{'vector' : 1, '_id' : 0}).explain()

{u'nYields': 0, u'allPlans': [{u'cursor': u'BasicCursor', u'indexBounds': {}}], u'nChunkSkips': 0, u'millis': 23, u'n': 102020, u'cursor': u'BasicCursor', u'indexBounds': {}, u'nscannedObjects': 102020, u'isMultiKey': False, u'indexOnly': False, u'nscanned': 102020}

Why is there such a huge time difference? Is there any way to speed this up? I loaded all of the vectors into a text field in a SQL database and the same query took less than one second. Thanks.

score 1 · Accepted Answer

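My guess is that the second one only shows you how fast MongoDB actually executes the "find", whereas the first also involves fetching every record to the client and processing it.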
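One way to check this guess is to time the client-side iteration on its own and compare it with the `millis` value that explain() reports. The helper below is only a sketch: `time_iteration` is a hypothetical name, it accepts any iterable (so it can be tried without a database), and the usage in the comments assumes the `vectors` collection and the explain() output format shown in the question.

```python
import time


def time_iteration(iterable):
    """Consume `iterable` and return (item_count, elapsed_seconds)."""
    start = time.time()
    count = 0
    for _ in iterable:
        count += 1
    return count, time.time() - start


# Hypothetical usage with the collection from the question:
#   n, seconds = time_iteration(vectors.find({}, {'vector': 1, '_id': 0}))
#   millis = vectors.find({}, {'vector': 1, '_id': 0}).explain()['millis']
# The gap between seconds * 1000 and millis is roughly the time spent on
# network transfer and on decoding documents in the client.
```

If the gap is large, the time is probably going into moving and decoding the documents rather than into the scan itself.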

score 0 · Accepted Answer

You might want to experiment with batch_size to improve the speed and reduce the number of network round trips when iterating through the results.

import time

start = time.time()
# Ask the server for up to 1000 documents per batch
for v in vectors.find({}, {'vector': 1, '_id': 0}).batch_size(1000):
    pass
end = time.time()
print "total time:", end - start
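To get a feel for what batch_size changes, the number of batches can be estimated with simple ceiling division. This is a rough sketch, not a description of the driver's internals: `estimated_batches` is a hypothetical helper, and it assumes every batch is filled to `batch_size` documents. The server may also cap a batch by size in bytes, so the real number of round trips can be higher.

```python
def estimated_batches(n_docs, batch_size):
    """Batches needed to return n_docs if every batch holds batch_size documents."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Ceiling division using integers only
    return (n_docs + batch_size - 1) // batch_size


# For the 102020 documents in the question:
#   estimated_batches(102020, 1000) -> 103
#   estimated_batches(102020, 100)  -> 1021
```

Under that assumption, a larger batch size means fewer round trips, at the cost of more memory held per batch on the client.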

score 0 · Accepted Answer

You can create an index on the field you are querying; in your case, that is "vector":

vectors.create_index([("vector", 1)], sparse=True)

Then you can check the query time.
