I'm having trouble with slow insert performance on a sharded cluster. My setup consists of 5 shards, and each shard has at least 3 replica set members. As far as network topology goes, one group of replica set members lives in Rackspace Cloud and the rest are on AWS. All members are running MongoDB 2.4.6.

I'm processing a file in Java and writing it to MongoDB. Each file is ~60MB and the resulting data for a file ends up as ~160MB in the DB. I'm connecting to a mongos from my Java application. I'm sharding on the hash of the _id (auto-generated ObjectID) and I have write concern set to UNACKNOWLEDGED.

If I write to an unsharded collection I can write the whole file in ~90 seconds. If I write to a sharded collection it's taking me ~20 minutes!
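
To put rough numbers on the gap, here is a small sketch of the arithmetic in mongo shell JavaScript. It uses only the approximate figures quoted above (~160MB stored, ~90 seconds vs. ~20 minutes); the helper name `throughputMBps` is just illustrative.

```javascript
// Approximate figures from the description above.
var storedMB = 160;          // ~160MB ends up in the DB per file
var unshardedSecs = 90;      // ~90 seconds into an unsharded collection
var shardedSecs = 20 * 60;   // ~20 minutes into the sharded collection

// Hypothetical helper: average write throughput in MB per second.
function throughputMBps(megabytes, seconds) {
    return megabytes / seconds;
}

var unshardedRate = throughputMBps(storedMB, unshardedSecs); // roughly 1.8 MB/s
var shardedRate = throughputMBps(storedMB, shardedSecs);     // roughly 0.13 MB/s
var slowdown = shardedSecs / unshardedSecs;                  // roughly 13x slower
```

So the sharded write path is more than an order of magnitude slower for the same data, which seems far too large to be explained by routing overhead alone.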

I've done some initial debugging so far:

  • I've tried creating a new collection and writing to it
  • I've tried disabling the balancer to ensure that there were no migrations slowing things down (I've confirmed that the balancer is disabled)
  • I don't see anything strange going on in the mongos or mongod logs

Things I've noticed:

  • The primary node on the primary shard is sitting at almost a constant 80% write lock. The other primaries are hovering around 5% with occasional spikes to 30%. The secondaries are all sitting around 5% with occasional spikes to 15%

  • sh.status() shows even chunk distribution but db.collection.stats() shows that the primary shard has a count & size that's twice as big as the other four shards

  • No other noticeable errors in the logs or MMS

Any ideas on how I can further debug this issue?

Update with output from sh.status()

                    shard key: { "_id" : "hashed" }
                            rs1     8
                            rs2     8
                            rs3     8
                            rs4     8
                            rs0     8
                    too many chunks to print, use verbose if you want to force print

And the output from collection.stats()

mongos> db.collection.stats()
{
    "sharded" : true,
    "ns" : "prod.collection",
    "count" : 879837,
    "numExtents" : 76,
    "size" : 2210698416,
    "storageSize" : 2653114368,
    "totalIndexSize" : 73526768,
    "indexSizes" : {
            "_id_" : 31526656,
            "_id_hashed" : 42000112
    },
    "avgObjSize" : 2512.6226971586784,
    "nindexes" : 2,
    "nchunks" : 20,
    "shards" : {
            "rs0" : {
                    "ns" : "prod.collection",
                    "count" : 300130,
                    "size" : 754047552,
                    "avgObjSize" : 2512.403131976144,
                    "storageSize" : 873058304,
                    "numExtents" : 17,
                    "nindexes" : 2,
                    "lastExtentSize" : 232005632,
                    "paddingFactor" : 1.0000000000001465,
                    "systemFlags" : 1,
                    "userFlags" : 0,
                    "totalIndexSize" : 24037440,
                    "indexSizes" : {
                            "_id_" : 9753968,
                            "_id_hashed" : 14283472
                    },
                    "ok" : 1
            },
            "rs1" : {
                    "ns" : "prod.collection",
                    "count" : 139598,
                    "size" : 350820064,
                    "avgObjSize" : 2513.07371165776,
                    "storageSize" : 470589440,
                    "numExtents" : 15,
                    "nindexes" : 2,
                    "lastExtentSize" : 127299584,
                    "paddingFactor" : 1.000000000000052,
                    "systemFlags" : 1,
                    "userFlags" : 0,
                    "totalIndexSize" : 11626272,
                    "indexSizes" : {
                            "_id_" : 5060944,
                            "_id_hashed" : 6565328
                    },
                    "ok" : 1
            },
            "rs2" : {
                    "ns" : "prod.collection",
                    "count" : 149987,
                    "size" : 376944272,
                    "avgObjSize" : 2513.179622233927,
                    "storageSize" : 470593536,
                    "numExtents" : 15,
                    "nindexes" : 2,
                    "lastExtentSize" : 127299584,
                    "paddingFactor" : 1.0000000000000484,
                    "systemFlags" : 1,
                    "userFlags" : 0,
                    "totalIndexSize" : 12713680,
                    "indexSizes" : {
                            "_id_" : 5674144,
                            "_id_hashed" : 7039536
                    },
                    "ok" : 1
            },
            "rs3" : {
                    "ns" : "prod.collection",
                    "count" : 140235,
                    "size" : 352293776,
                    "avgObjSize" : 2512.167262095768,
                    "storageSize" : 377905152,
                    "numExtents" : 14,
                    "nindexes" : 2,
                    "lastExtentSize" : 104161280,
                    "paddingFactor" : 1.0000000000000422,
                    "systemFlags" : 1,
                    "userFlags" : 0,
                    "totalIndexSize" : 11863376,
                    "indexSizes" : {
                            "_id_" : 5110000,
                            "_id_hashed" : 6753376
                    },
                    "ok" : 1
            },
            "rs4" : {
                    "ns" : "prod.collection",
                    "count" : 149887,
                    "size" : 376592752,
                    "avgObjSize" : 2512.5111050324576,
                    "storageSize" : 460967936,
                    "numExtents" : 15,
                    "nindexes" : 2,
                    "lastExtentSize" : 124985344,
                    "paddingFactor" : 1.000000000000043,
                    "systemFlags" : 1,
                    "userFlags" : 0,
                    "totalIndexSize" : 13286000,
                    "indexSizes" : {
                            "_id_" : 5927600,
                            "_id_hashed" : 7358400
                    },
                    "ok" : 1
            }
    },
    "ok" : 1
}
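
To make the skew easier to see, here is a minimal sketch that turns the per-shard counts into percentages. The function name `shardShares` is my own, and it only assumes the `count` and `shards.<name>.count` fields visible in the output above; the sample object just copies those counts.

```javascript
// Summarize how documents are spread across shards.
// `stats` is expected to look like db.collection.stats() on a sharded
// collection: { count: N, shards: { <name>: { count: n }, ... } }.
function shardShares(stats) {
    var result = {};
    for (var name in stats.shards) {
        if (stats.shards.hasOwnProperty(name)) {
            var count = stats.shards[name].count;
            result[name] = {
                count: count,
                // Percentage of all documents, rounded to one decimal place.
                percent: Math.round((count / stats.count) * 1000) / 10
            };
        }
    }
    return result;
}

// Counts copied from the stats output above.
var sample = {
    count: 879837,
    shards: {
        rs0: { count: 300130 },
        rs1: { count: 139598 },
        rs2: { count: 149987 },
        rs3: { count: 140235 },
        rs4: { count: 149887 }
    }
};
var shares = shardShares(sample);
```

In the shell the same function can be called as `shardShares(db.collection.stats())`. With the counts above, rs0 holds roughly 34% of the documents, about double the share of each of the other four shards.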

Balancer status:

mongos> !sh.getBalancerState() && !sh.isBalancerRunning()
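
For anyone who wants to repeat this check, the same expression can be wrapped in a small function. This is only a sketch: `balancerIsIdle` is a name I made up, and it assumes both shell helpers return plain booleans, as they do in the 2.4 shell I'm using.

```javascript
// Returns true only when the balancer is disabled AND no balancing round
// is currently in progress. `shHelper` is the shell's `sh` object, passed
// in as a parameter so the function can also be exercised with a stub.
function balancerIsIdle(shHelper) {
    var enabled = shHelper.getBalancerState();   // true if the balancer is enabled
    var running = shHelper.isBalancerRunning();  // true if a round is in progress
    return !enabled && !running;
}
```

In a mongos shell this would be called as `balancerIsIdle(sh)`.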
