mongodb - MongoDB $lookup not using index

Question

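I'm writing a query that needs a $lookup between two collections, and as I understand it, the foreignField must be indexed for the join to run in a reasonable time. However, even after adding an index on that field, the query still falls back to a COLLSCAN.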

db.users.aggregate([
  {$lookup:{ from: "transactions", localField: '_id', foreignField: 'uid', as: 'transaction' }},
  { $match: { transaction: { "$size" : 0} } },
  { $count: "total"},
], { explain: true })

This returns:

"queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "test.users",
    "indexFilterSet" : false,
    "parsedQuery" : {

    },
    "winningPlan" : {
        "stage" : "COLLSCAN",
        "direction" : "forward"
    },
    "rejectedPlans" : [ ]
}

As mentioned, I did index the uid field in the transactions collection:

> db.transactions.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test.transactions"
    },
    {
        "v" : 1,
        "key" : {
            "uid" : 1
        },
        "name" : "uid_1",
        "ns" : "test.transactions"
    }
]

The query takes several minutes to run on a database of roughly 7M documents. I'm using MongoDB v3.4.7. Any ideas about what I might be doing wrong? Thanks in advance!

score 5 · Accepted Answer

The "stage" : "COLLSCAN" in that output does not refer to the $lookup at all.

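The first step of this aggregation pipeline fetches every document from the users collection. Since no filter is supplied for that step, a collection scan is the most efficient way to do it.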

The $lookup stage should be planned like any other query against transactions, and it will probably use the index.
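One way to check this is to explain the equality query that $lookup would issue against transactions for a single user. This is a minimal sketch: the helper name `explainLookupProbe` and its `sampleUid` parameter are hypothetical, and `db` is assumed to be the shell's database handle.

```javascript
// Hypothetical helper (not part of MongoDB): explains the equality match on
// transactions.uid that a $lookup with foreignField 'uid' performs per user.
// `db` is assumed to be the mongo shell database handle for "test".
function explainLookupProbe(db, sampleUid) {
  // cursor.explain("queryPlanner") returns the plan without running the query
  return db.transactions.find({ uid: sampleUid }).explain("queryPlanner");
}
```

In the shell, calling `explainLookupProbe(db, db.users.findOne()._id)` and reading `queryPlanner.winningPlan` should show an IXSCAN on uid_1 if the index is usable for the join.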

score 1 · Accepted Answer

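This happens because the first stage of your aggregation pipeline is not a $match, $sort, or $geoNear on an indexed key, and your $match stage does not query any indexed key either.

Case 1: If the first stage is a $match on an indexed key, the winningPlan stage will be "FETCH" and the stage of its inputStage will be "IXSCAN":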

"winningPlan" : {
    "stage" : "FETCH",
    "inputStage" : {
            "stage" : "IXSCAN",
        ...
    }
}

Case 2: If the first stage is a $match on a non-indexed key, the winningPlan stage will be "COLLSCAN":

"winningPlan" : {
    "stage" : "COLLSCAN"
}

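Case 3: If you perform a $match on an indexed key after the $lookup (the stage order used in your query), the winningPlan stage will be "FETCH" and its inputStage will be "IXSCAN".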

Case 4: If you perform a $match on a non-indexed key after the $lookup (which is what you did), the winningPlan stage will be "COLLSCAN".
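To tell these cases apart without reading the explain output by eye, the winningPlan document can be walked for its stage names. The sketch below is illustrative only: `planStages` and `usesIndexScan` are hypothetical helper names, and the code assumes the plan nests child stages under `inputStage` (or an `inputStages` array), as in the outputs shown above.

```javascript
// Hypothetical helper: returns the stage names in a winningPlan document,
// outermost first. Follows `inputStage` and, if present, `inputStages`.
function planStages(plan) {
  if (!plan || typeof plan !== "object") return [];
  var children = plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  var names = plan.stage ? [plan.stage] : [];
  children.forEach(function (child) {
    names = names.concat(planStages(child));
  });
  return names;
}

// True if any stage in the plan is an index scan.
function usesIndexScan(plan) {
  return planStages(plan).indexOf("IXSCAN") !== -1;
}
```

For the Case 1 plan, `planStages` would yield FETCH followed by IXSCAN, while the Case 2 plan yields only COLLSCAN. For an aggregation on 3.4, the plan to pass in is typically the `queryPlanner.winningPlan` found under the `$cursor` stage of the explain output, and it describes only the initial read of users.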

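With 7M records, your query needs to use an index. Do not create too many indexes, though, because they are held in RAM, and note that $ne and $nin cannot make proper use of an indexed key.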

MongoDB Docs: Aggregation Pipeline Optimization

MongoDB Docs: Indexing Strategies
