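We have a relatively simple MongoDB sharded setup: 4 shards, each of which is a replica set with at least 3 members. Each collection holds data loaded from a large number of files; each file is assigned a monotonically increasing ID, and sharding is done on a hash of that ID.
Most of our collections work as expected. However, I have one collection whose chunks do not seem to be distributed across the shards properly. About 30GB of data had already been loaded into the collection before the index was created and the collection was sharded, but as far as I know that shouldn't matter. Here are the stats for the collection: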
mongos> db.mycollection.stats()
{
        "sharded" : true,
        "ns" : "prod.mycollection",
        "count" : 53304954,
        "numExtents" : 37,
        "size" : 35871987376,
        "storageSize" : 38563958544,
        "totalIndexSize" : 8955712416,
        "indexSizes" : {
                "_id_" : 1581720784,
                "customer_code_1" : 1293148864,
                "job_id_1_customer_code_1" : 1800853936,
                "job_id_hashed" : 3365576816,
                "network_code_1" : 914412016
        },
        "avgObjSize" : 672.9578525853339,
        "nindexes" : 5,
        "nchunks" : 105,
        "shards" : {
                "rs0" : {
                        "ns" : "prod.mycollection",
                        "count" : 53304954,
                        "size" : 35871987376,
                        "avgObjSize" : 672.9578525853339,
                        "storageSize" : 38563958544,
                        "numExtents" : 37,
                        "nindexes" : 5,
                        "lastExtentSize" : 2146426864,
                        "paddingFactor" : 1.0000000000050822,
                        "systemFlags" : 0,
                        "userFlags" : 0,
                        "totalIndexSize" : 8955712416,
                        "indexSizes" : {
                                "_id_" : 1581720784,
                                "job_id_1_customer_code_1" : 1800853936,
                                "customer_code_1" : 1293148864,
                                "network_code_1" : 914412016,
                                "job_id_hashed" : 3365576816
                        },
                        "ok" : 1
                }
        },
        "ok" : 1
}
The sh.status() output for this collection:
            prod.mycollection
                    shard key: { "job_id" : "hashed" }
                    chunks:
                            rs0     105
                    too many chunks to print, use verbose if you want to force print
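Is there something I'm missing that explains why this collection is only distributed to rs0? Is there a way to force a rebalance? I followed the same steps to shard the other collections, and they distributed themselves correctly. Here are the stats for a collection that sharded successfully: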
mongos> db.myshardedcollection.stats()
{
        "sharded" : true,
        "ns" : "prod.myshardedcollection",
        "count" : 5112395,
        "numExtents" : 71,
        "size" : 4004895600,
        "storageSize" : 8009994240,
        "totalIndexSize" : 881577200,
        "indexSizes" : {
                "_id_" : 250700688,
                "customer_code_1" : 126278320,
                "job_id_1_customer_code_1" : 257445888,
                "job_id_hashed" : 247152304
        },
        "avgObjSize" : 783.3697513591966,
        "nindexes" : 4,
        "nchunks" : 102,
        "shards" : {
                "rs0" : {
                        "ns" : "prod.myshardedcollection",
                        "count" : 1284540,
                        "size" : 969459424,
                        "avgObjSize" : 754.7133012595949,
                        "storageSize" : 4707762176,
                        "numExtents" : 21,
                        "nindexes" : 4,
                        "lastExtentSize" : 1229475840,
                        "paddingFactor" : 1.0000000000000746,
                        "systemFlags" : 0,
                        "userFlags" : 0,
                        "totalIndexSize" : 190549856,
                        "indexSizes" : {
                                "_id_" : 37928464,
                                "job_id_1_customer_code_1" : 39825296,
                                "customer_code_1" : 33734176,
                                "job_id_hashed" : 79061920
                        },
                        "ok" : 1
                },
                "rs1" : {
                        "ns" : "prod.myshardedcollection",
                        "count" : 1287243,
                        "size" : 1035438960,
                        "avgObjSize" : 804.384999568846,
                        "storageSize" : 1178923008,
                        "numExtents" : 17,
                        "nindexes" : 4,
                        "lastExtentSize" : 313208832,
                        "paddingFactor" : 1,
                        "systemFlags" : 0,
                        "userFlags" : 0,
                        "totalIndexSize" : 222681536,
                        "indexSizes" : {
                                "_id_" : 67787216,
                                "job_id_1_customer_code_1" : 67345712,
                                "customer_code_1" : 30169440,
                                "job_id_hashed" : 57379168
                        },
                        "ok" : 1
                },
                "rs2" : {
                        "ns" : "prod.myshardedcollection",
                        "count" : 1131411,
                        "size" : 912549232,
                        "avgObjSize" : 806.5585644827565,
                        "storageSize" : 944386048,
                        "numExtents" : 16,
                        "nindexes" : 4,
                        "lastExtentSize" : 253087744,
                        "paddingFactor" : 1,
                        "systemFlags" : 0,
                        "userFlags" : 0,
                        "totalIndexSize" : 213009328,
                        "indexSizes" : {
                                "_id_" : 64999200,
                                "job_id_1_customer_code_1" : 67836272,
                                "customer_code_1" : 26522944,
                                "job_id_hashed" : 53650912
                        },
                        "ok" : 1
                },
                "rs3" : {
                        "ns" : "prod.myshardedcollection",
                        "count" : 1409201,
                        "size" : 1087447984,
                        "avgObjSize" : 771.6769885914075,
                        "storageSize" : 1178923008,
                        "numExtents" : 17,
                        "nindexes" : 4,
                        "lastExtentSize" : 313208832,
                        "paddingFactor" : 1,
                        "systemFlags" : 0,
                        "userFlags" : 0,
                        "totalIndexSize" : 255336480,
                        "indexSizes" : {
                                "_id_" : 79985808,
                                "job_id_1_customer_code_1" : 82438608,
                                "customer_code_1" : 35851760,
                                "job_id_hashed" : 57060304
                        },
                        "ok" : 1
                }
        },
        "ok" : 1
}
The sh.status() output for the correctly sharded collection:
            prod.myshardedcollection
                    shard key: { "job_id" : "hashed" }
                    chunks:
                            rs2     25
                            rs1     26
                            rs3     25
                            rs0     26
                    too many chunks to print, use verbose if you want to force print
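In case it helps anyone reproduce the comparison, here is a minimal sketch of how the per-shard chunk counts above could be tallied from the config database instead of read off sh.status(). The helper names chunksPerShard and chunkSpread are hypothetical (they are not part of the shell), and the sketch assumes each config.chunks document carries "ns" and "shard" fields, which may differ between MongoDB versions. The commented lines at the bottom show how it might be fed from mongos, along with the standard sh.getBalancerState() and sh.isBalancerRunning() checks.

```javascript
// Count how many chunks each shard owns, given config.chunks-style documents.
// Assumption: every document has a "shard" field naming the owning shard.
function chunksPerShard(chunkDocs) {
  var counts = {};
  for (var i = 0; i < chunkDocs.length; i++) {
    var shard = chunkDocs[i].shard;
    counts[shard] = (counts[shard] || 0) + 1;
  }
  return counts;
}

// Difference between the most and least loaded shard.
// Shards that own no chunks at all are counted as 0.
function chunkSpread(counts, shardNames) {
  if (shardNames.length === 0) {
    return 0;
  }
  var max = 0;
  var min = Infinity;
  for (var i = 0; i < shardNames.length; i++) {
    var n = counts[shardNames[i]] || 0;
    if (n > max) max = n;
    if (n < min) min = n;
  }
  return max - min;
}

// In the mongos shell the input would come from the config database:
//   var docs = db.getSiblingDB("config").chunks
//                .find({ ns: "prod.mycollection" }).toArray();
//   var counts = chunksPerShard(docs);
//   printjson(counts);
//   print(chunkSpread(counts, ["rs0", "rs1", "rs2", "rs3"]));
//   print(sh.getBalancerState());   // is the balancer enabled?
//   print(sh.isBalancerRunning());  // is a balancing round in progress?
```

With the numbers shown above, the spread for prod.myshardedcollection is 1 (26 versus 25 chunks), while for prod.mycollection it is the full 105, since rs1 through rs3 own nothing.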