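We have a MongoDB cluster with 3 shards; each shard is a replica set of 3 nodes, and the MongoDB version is 3.2.6. We have one large database of about 230 GB containing roughly 5,500 collections. We found that about 2,300 collections are unbalanced, while the other 3,200 are evenly distributed across the 3 shards.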

Below is the output of sh.status() (the full output is too large, so I am posting only part of it):

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("57557345fa5a196a00b7c77a")
  }
  shards:
    {  "_id" : "shard1",  "host" : "shard1/," }
    {  "_id" : "shard2",  "host" : "shard2/," }
    {  "_id" : "shard3",  "host" : "shard3/," }
  active mongoses:
    "3.2.6" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  yes
        Balancer lock taken at Sat Sep 03 2016 09:58:58 GMT+0800 (CST) by iZ23vbzyrjiZ:27017:1467949335:-2109714153:Balancer
    Collections with active migrations: 
        bdtt.normal_20131017 started at Sun Sep 18 2016 17:03:11 GMT+0800 (CST)
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        1490 : Failed with error 'aborted', from shard2 to shard3
        1490 : Failed with error 'aborted', from shard2 to shard1
        14 : Failed with error 'data transfer error', from shard2 to shard1
  databases:
    {  "_id" : "bdtt",  "primary" : "shard2",  "partitioned" : true }
            shard key: { "_id" : "hashed" }
            unique: false
            balancing: true
                shard2  142
            too many chunks to print, use verbose if you want to force print
            shard key: { "_id" : "hashed" }
            unique: false
            balancing: true
                shard1  36
                shard2  42
                shard3  46
            too many chunks to print, use verbose if you want to force print
            shard key: { "_id" : "hashed" }
            unique: false
            balancing: true
                shard1  34
                shard2  32
                shard3  32
            too many chunks to print, use verbose if you want to force print
            shard key: { "_id" : "hashed" }
            unique: false
            balancing: true
                shard1  30
                shard2  32
                shard3  32
            too many chunks to print, use verbose if you want to force print
            shard key: { "_id" : "hashed" }
            unique: false
            balancing: true
                shard2  126
            too many chunks to print, use verbose if you want to force print
            shard key: { "_id" : "hashed" }
            unique: false
            balancing: true
                shard2  118
            too many chunks to print, use verbose if you want to force print

The collection "normal_20160913" is unbalanced; here is its getShardDistribution() output:

mongos> db.normal_20160913.getShardDistribution()

Shard shard2 at shard2/,
 data : 4.77GiB docs : 203776 chunks : 118
 estimated data per chunk : 41.43MiB
 estimated docs per chunk : 1726

Totals
 data : 4.77GiB docs : 203776 chunks : 118
 Shard shard2 contains 100% data, 100% docs in cluster, avg obj size on shard : 24KiB

The balancer process is running, and the chunk size is the default (64 MB):

mongos> sh.isBalancerRunning()
mongos> use config
switched to db config
mongos> db.settings.find()
{ "_id" : "chunksize", "value" : NumberLong(64) }
{ "_id" : "balancer", "stopped" : false }


2016-09-19T14:25:25.427+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }
2016-09-19T14:25:59.620+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }
2016-09-19T14:25:59.644+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }
2016-09-19T14:35:02.701+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }
2016-09-19T14:35:02.728+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }
2016-09-19T14:42:18.232+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }
2016-09-19T14:42:18.256+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }
2016-09-19T14:42:27.101+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }
2016-09-19T14:42:27.112+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }
2016-09-19T14:43:41.889+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 }

I tried running moveChunk manually, and it returned the same error:

mongos> sh.moveChunk("bdtt.normal_20160913", {_id:ObjectId("57d6d107edac9244b6048e65")}, "shard3")
{
    "cause" : {
        "ok" : 0,
        "errmsg" : "Not starting chunk migration because another migration is already in progress",
        "code" : 117
    },
    "code" : 117,
    "ok" : 0,
    "errmsg" : "move failed"
}

I am not sure whether creating too many collections has overwhelmed the migration process. About 60-80 new collections are created every day.


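  1. Why are some collections unbalanced? Is it related to the large number of newly created collections?
  2. Is there a command to check the details of the migration jobs being processed? I get many error log entries saying that a migration job is already running, but I cannot find the one that is running.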

2 Answers


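Answering my own question: we finally found the root cause. It is exactly the same issue as "MongoDB balancer timeout with delay replica", caused by an unusual replica set configuration. When the problem occurred, our replica set configuration was as follows: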

shard1:PRIMARY> rs.conf()
{
    "_id" : "shard1",
    "version" : 3,
    "protocolVersion" : NumberLong(1),
    "members" : [
        {
            "_id" : 0,
            "host" : "",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "",
            "arbiterOnly" : true,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 3,
            "host" : "",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : true,
            "priority" : 0,
            "tags" : {

            },
            "slaveDelay" : NumberLong(86400),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "getLastErrorModes" : {

        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5755464f789c6cd79746ad62")
    }
}

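The replica set has 4 members: 1 primary, 1 secondary, 1 arbiter, and 1 secondary delayed by 24 hours. With 4 voting members, the majority is 3. Because the arbiter holds no data, only the primary and the normal secondary can acknowledge a write promptly, so the balancer has to wait for the delayed secondary to satisfy the majority write concern (which confirms that the receiving shard has the chunk).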
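To make the arithmetic concrete, here is a minimal sketch in plain JavaScript (it runs in Node.js, and the function can also be pasted into the mongo shell). The helper name `majorityCheck` and the trimmed-down config object are illustrative assumptions, not part of MongoDB; only the field names `arbiterOnly`, `slaveDelay`, and `votes` come from the rs.conf() output above, and `slaveDelay` is written as a plain number here rather than a NumberLong.

```javascript
// Hypothetical helper (not a MongoDB API): given an rs.conf()-style object,
// compare the voting majority with the number of voting members that hold
// data and are not delayed, i.e. those that can acknowledge a write promptly.
function majorityCheck(conf) {
  var voters = conf.members.filter(function (m) { return m.votes > 0; });
  var majority = Math.floor(voters.length / 2) + 1;
  var promptMembers = voters.filter(function (m) {
    return !m.arbiterOnly && m.slaveDelay === 0;
  });
  return {
    votingMembers: voters.length,
    majority: majority,
    promptDataBearing: promptMembers.length,
    // true when a majority write cannot complete without a delayed member
    needsDelayedMember: promptMembers.length < majority
  };
}

// Trimmed copy of the shard1 configuration (assumption: plain numbers
// instead of NumberLong, and only the fields the helper reads).
var shard1Conf = { members: [
  { _id: 0, arbiterOnly: false, slaveDelay: 0,     votes: 1 },
  { _id: 1, arbiterOnly: false, slaveDelay: 0,     votes: 1 },
  { _id: 2, arbiterOnly: true,  slaveDelay: 0,     votes: 1 },
  { _id: 3, arbiterOnly: false, slaveDelay: 86400, votes: 1 }
]};

var result = majorityCheck(shard1Conf);
```

For this configuration `result.majority` is 3 while `result.promptDataBearing` is only 2, so `result.needsDelayedMember` is true, which matches what we saw.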


answered 2016-10-07T03:19:51.730



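  • One chunk at a time: MongoDB chunk migrations are queued, and only one chunk is migrated at a time.
  • Balancer lock: the balancer lock information may tell you more about what is being migrated. You should also be able to see chunk migration entries in the mongos log file.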
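As a rough sketch of how you might look at this, the snippet below tallies migration events per collection. The helper `tallyChangelog` and the sample entries are made up for illustration; the commented shell lines assume the `config.locks` and `config.changelog` collections as they exist in 3.2, where, as far as I know, a lock document with `state: 2` is one that is currently held.

```javascript
// Hypothetical helper (not a MongoDB API): count changelog entries
// per namespace and event type.
function tallyChangelog(entries) {
  var counts = {};
  entries.forEach(function (e) {
    var key = e.ns + " " + e.what;
    counts[key] = (counts[key] || 0) + 1;
  });
  return counts;
}

// Made-up sample entries, shaped like config.changelog documents
// (only the "what" and "ns" fields are used).
var sample = [
  { what: "moveChunk.start", ns: "bdtt.normal_20160913" },
  { what: "moveChunk.from",  ns: "bdtt.normal_20160913" },
  { what: "moveChunk.start", ns: "bdtt.normal_20160913" }
];
var tally = tallyChangelog(sample);

// In the mongo shell, connected to a mongos, the real data would come from:
//   var cfg = db.getSiblingDB("config");
//   cfg.locks.find({ state: 2 }).pretty();  // locks currently held
//   tallyChangelog(cfg.changelog.find({ what: /^moveChunk/ })
//                     .sort({ time: -1 }).limit(100).toArray());
```

A collection with many `moveChunk.start` entries and no matching `moveChunk.commit` entries is a hint that its migrations keep starting and then aborting.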




answered 2016-09-20T01:25:48.260