mongodb - MongoDB not using index with $exists and $elemMatch

Question

My documents are structured like this:

{
    "_id" : "311acd33a0ae8dcc3101246f90af9dc5",
    "created_datetime" : ISODate("2013-04-05T10:35:31.143Z"),
    "installs" : [
        {
            "status" : 1,
            "app" : "xyz",
            "reg_id" : "AVJyaIFI2Q8v93YmOHI5kEOVoCLbd4CAUyVK9zLrC1QCiBcl_bw89i5PvhEuTKmxtb4x130vjMyo78zPI7cedErcRv_Jjn0BN3Wq40hhg",
            "last_action_datetime" : ISODate("2013-04-05T10:35:31.143Z"),
            "version" : "2"
        },
        {
            "status" : 1,
            "app" : "abc",                                                
            "reg_id" : "AVJyaIFI2Q8v93YmOHI5kEOVoCLbd4CAUyVK9zLrC1QCiBcl_bw89i5PvhEuTKmxtb4x130vjMyo78zPI7cedErcRv_Jjn0BN3Wq40hhg",
            "last_action_datetime" : ISODate("2013-04-05T10:35:31.143Z"),
            "version" : "5"
        },
        {
            "status" : 1,
            "app" : "pqr",                                                
            "last_action_datetime" : ISODate("2013-04-06T10:35:31.143Z"),
            "version" : "1"
        },
    ],
    "last_update" : ISODate("2013-04-12T06:26:46.333Z"),
    "num_updates" : 9,
    .....
}

I have a compound index on 'installs.reg_id' and 'installs.status', and a single-field index on 'installs.status'.

Now I want to find all documents in which installs contains at least one element that has a reg_id and whose status is 1. So I run this query:

db.users.find({'installs': {'$elemMatch': {'reg_id': {'$exists':  true}, 'status': 1}}}).explain()

I get:

{
        "cursor" : "BtreeCursor installs.status_1",
        "isMultiKey" : true,
        "n" : 1447034,
        "nscannedObjects" : 1720864,
        "nscanned" : 1720864,
        "nscannedObjectsAllPlans" : 1720864,
        "nscannedAllPlans" : 1720864,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 13072,
        "nChunkSkips" : 0,
        "millis" : 11063,
        "indexBounds" : {
                "installs.status" : [
                        [
                                1,
                                1
                        ]
                ]
        },
        "server" : "####:27017"
}

So the compound index should have been used here, but it was not. I thought $elemMatch was the culprit, so I ran this query:

db.users.find({'installs.reg_id': {'$exists':  true}}).explain()

I get:

{
        "cursor" : "BasicCursor",
        "isMultiKey" : false,
        "n" : 2947446,
        "nscannedObjects" : 3184871,
        "nscanned" : 3184871,
        "nscannedObjectsAllPlans" : 3184871,
        "nscannedAllPlans" : 3184871,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 23865,
        "nChunkSkips" : 0,
        "millis" : 16172,
        "indexBounds" : {

        },
        "server" : "####:27017"
}

This shows that the query does not use any index.

Any idea what is going wrong here?

Update: adding a hint does make the query use the index:

db.users.find({'installs': {'$elemMatch': {'reg_id': {'$exists':  true}, 'status': 1}}}).hint({"installs.reg_id":1,"installs.status":1}).explain()

which returns:

{
        "cursor" : "BtreeCursor installs.reg_id_1_installs.status_1",
        "isMultiKey" : true,
        "n" : 1451589,
        "nscannedObjects" : 2464985,
        "nscanned" : 4373261,
        "nscannedObjectsAllPlans" : 2464985,
        "nscannedAllPlans" : 4373261,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 20170,
        "nChunkSkips" : 0,
        "millis" : 106353,
        "indexBounds" : {
                "installs.reg_id" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ],
                "installs.status" : [
                        [
                                1,
                                1
                        ]
                ]
        },
        "server" : "####:27017"
}

Here the compound index is used.

score 4 · Accepted Answer

There is nothing wrong. The query optimizer is picking the index that provides better performance/selectivity.

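You can confirm this by "hinting" the query to use the index you want it to use, and then comparing how many index entries and documents it has to scan to find what it needs to return.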
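As a rough illustration of that comparison, the sketch below puts the figures from the two explain() outputs in the question side by side. The numbers are copied from the question; the helper name scanRatio is hypothetical and is just one simple way to summarise "work done per document returned".

```javascript
// Figures copied from the explain() outputs in the question.
const chosenPlan = {
  cursor: "BtreeCursor installs.status_1", // picked by the optimizer
  n: 1447034,
  nscanned: 1720864,
  nscannedObjects: 1720864,
  millis: 11063,
};
const hintedPlan = {
  cursor: "BtreeCursor installs.reg_id_1_installs.status_1", // forced with hint()
  n: 1451589,
  nscanned: 4373261,
  nscannedObjects: 2464985,
  millis: 106353,
};

// Hypothetical helper: index entries scanned per document returned.
function scanRatio(plan) {
  return plan.nscanned / plan.n;
}

console.log(scanRatio(chosenPlan).toFixed(2)); // 1.19
console.log(scanRatio(hintedPlan).toFixed(2)); // 3.01
console.log(hintedPlan.millis > chosenPlan.millis); // true
```

On these figures the hinted compound index scans roughly three index entries per returned document and takes far longer, which is consistent with the optimizer preferring installs.status_1.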

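Looking at your explain output, I can see that reg_id exists in over 92.5% of the index entries in the index you want the query to use. That is not very selective. Using that index would only narrow 3.1M documents/entries down to 2.9M, which is not much of a reduction.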

Using the status_1 index, it immediately narrows the "candidates" down to 1.7M, and by going through all of those it finds that 1.4M have a reg_id.
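The percentages behind this reasoning can be reproduced from the explain() figures in the question. This is a minimal sketch; it assumes that the nscannedObjects value of the BasicCursor scan (3184871) is the total number of documents in the collection, which the question does not state explicitly.

```javascript
// Assumed total: nscannedObjects of the unindexed (BasicCursor) scan.
const totalDocs = 3184871;
// n for { 'installs.reg_id': { $exists: true } }.
const withRegId = 2947446;
// nscanned when the optimizer used installs.status_1.
const statusCandidates = 1720864;
// n for the $elemMatch query (reg_id exists and status is 1).
const matched = 1447034;

// Format part/whole as a percentage with one decimal place.
const pct = (part, whole) => (100 * part / whole).toFixed(1) + "%";

console.log(pct(withRegId, totalDocs));        // 92.5% of documents have a reg_id
console.log(pct(statusCandidates, totalDocs)); // 54.0% remain after the status index
console.log(pct(matched, statusCandidates));   // 84.1% of those candidates match
```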

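Having more selective indexes is key, but do not forget that in this case you are asking it to return 1.4 million documents, so it is hard to be selective when that many documents need to be scanned.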

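Another point is that equality, and even inequality, operations are much more efficient for indexes than {$exists}. Even {$ne:null} would do better than $exists. In general, it is not a good idea to rely on queries using $exists, or even inequalities, for performance in the way you can rely on equality or narrower range queries (when using indexes).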
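For illustration, here is how the {$ne:null} variant of the question's query might look, together with a small in-memory model of how the two conditions differ. The predicates hasRegId and regIdNotNull and the sample elements (including the shortened reg_id value) are hypothetical and are not MongoDB's matcher. The two queries are not strictly equivalent: an element whose reg_id is explicitly null satisfies $exists but not {$ne:null}. Whether the variant is actually faster would need to be checked with explain() on your own data.

```javascript
// Query from the question: elements whose reg_id field is present.
const existsQuery = {
  installs: { $elemMatch: { reg_id: { $exists: true }, status: 1 } },
};
// Variant suggested above: elements whose reg_id is present and not null.
const neNullQuery = {
  installs: { $elemMatch: { reg_id: { $ne: null }, status: 1 } },
};
// In the mongo shell either document would be passed to find(), e.g.:
//   db.users.find(neNullQuery).explain()

// Simplified in-memory model of the two element conditions (assumption).
const hasRegId = (el) => Object.prototype.hasOwnProperty.call(el, "reg_id");
const regIdNotNull = (el) => hasRegId(el) && el.reg_id !== null;

// Hypothetical array elements, loosely based on the sample document.
const installs = [
  { status: 1, app: "xyz", reg_id: "AVJy" },
  { status: 1, app: "pqr" },               // no reg_id field
  { status: 1, app: "abc", reg_id: null }, // reg_id present but null
];

const byExists = installs
  .filter((el) => el.status === 1 && hasRegId(el))
  .map((el) => el.app);
const byNeNull = installs
  .filter((el) => el.status === 1 && regIdNotNull(el))
  .map((el) => el.app);

console.log(byExists); // [ 'xyz', 'abc' ]
console.log(byNeNull); // [ 'xyz' ]
```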

More information can be found here: http://docs.mongodb.org/manual/applications/indexes/ and in particular here: http://docs.mongodb.org/manual/tutorial/create-queries-that-ensure-selectivity/

score 1 · Accepted Answer

I have the same problem. It appears to be a documented bug targeted at the 2.7 release (Due: 01/Aug/14):

https://jira.mongodb.org/browse/SERVER-2348
