mongodb - 为什么显式提示提供更好的性能？

Question

我对索引的工作方式感到有些困惑。如果用键a、b和c的文档填充数据库，每个文档都有随机值（除了c，它有递增值）

这是我使用的python代码：

from pymongo import MongoClient
from random import Random
r = Random()

client = MongoClient("server")
test_db = client.test
fubar_col = test_db.fubar

for i in range(100000):
    doc = {'a': r.randint(10000, 99999), 'b': r.randint(100000, 999999), 'c': i}
    fubar_col.insert(doc)

然后我创建一个索引{c: 1}

现在，如果我表演

>db.fubar.find({'a': {$lt: 50000}, 'b': {$gt: 500000}}, {a: 1, c: 1}).sort({c: -1}).explain()

我有

{
    "cursor" : "BtreeCursor c_1 reverse",
    "isMultiKey" : false,
    "n" : 24668,
    "nscannedObjects" : 100000,
    "nscanned" : 100000,
    "nscannedObjectsAllPlans" : 100869,
    "nscannedAllPlans" : 100869,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 1,
    "nChunkSkips" : 0,
    "millis" : 478,
    "indexBounds" : {
        "c" : [
            [
                {
                    "$maxElement" : 1
                },
                {
                    "$minElement" : 1
                }
            ]
        ]
    },
    "server" : "nuclight.org:27017"
}

看，mongodb 使用c_1索引，执行大约需要 478 毫秒。如果我指定要使用的索引（通过提示（{c：1}））：

> db.fubar.find({'a': {$lt: 50000}, 'b': {$gt: 500000}}, {a: 1, c: 1}).sort({c: -1}).hint({c:1}).explain()

它只需要大约 167 毫秒。为什么会发生？

这是 fubar 集合fubar.tgz的 json 转储的链接

ps 我多次执行这些查询，结果是一样的

score 2 · Accepted Answer

explain强制 MongoDB 重新评估所有查询计划。在“正常”查询中，将使用缓存的最快查询计划。从文档（强调我的）：

该explain()操作评估查询计划集并报告查询的获胜计划。在正常操作中，查询优化器缓存获胜的查询计划，并在未来将它们用于类似的相关查询。因此，MongoDB 有时可能会从缓存中选择与使用 explain().

除非您确实需要为典型查询迭代整个结果集，否则您可能希望将其包含limit()在查询中。在您的特定示例中， usinglimit(100)将返回一个BasicCursorWhen using explain，而不是索引：

> db.fubar.find({'a': {$lt: 50000}, 'b': {$gt: 500000}}).sort({c: -1}).hint({c:1}).limit(100).explain();
{
        "cursor" : "BtreeCursor c_1 reverse",
        "n" : 100,
        "nscanned" : 432,
        "nscannedAllPlans" : 432,
        "scanAndOrder" : false,
        "millis" : 3,
        "indexBounds" : {
                "c" : [[{"$maxElement" : 1}, {"$minElement" : 1}]]
        },
}
>
> db.fubar.find({'a': {$lt: 50000}, 'b': {$gt: 500000}}).sort({c: -1}).limit(100).explain();
{
        "cursor" : "BasicCursor",
        "n" : 100,
        "nscanned" : 431,
        "nscannedAllPlans" : 863,
        "scanAndOrder" : true,
        "millis" : 12,
        "indexBounds" : { },
}

请注意，这是一个有点病态的情况，因为使用索引并没有太大帮助（比较nscanned）。

mongodb - 为什么显式提示提供更好的性能？

1 回答 1

Related

Reference