We know the initial data size (120GB) and we know the default maximum chunk size in MongoDB is 64MB. If we divide 120GB by 64MB we get 1920 - so that is the minimum number of chunks we should start with. As it happens, 2048 is a power of 16 divided by 2, and given that the GUID (our shard key) is hex-based, that is a much easier number to deal with than 1920 (see below).
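To sanity-check that arithmetic (an illustrative aside, runnable in any mongo shell, not part of the procedure itself):

// 120GB expressed in MB, divided by the 64MB default chunk size
var totalMB = 120 * 1024;       // 122880
print(totalMB / 64);            // 1920 - the minimum chunk count
print(Math.pow(16, 3) / 2);     // 2048 - maps neatly onto 3 hex digits stepped by 2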
NOTE: This pre-splitting must be done before any data is added to the collection. If you use the enableSharding() command on a collection that already contains data, MongoDB will split the data itself, and you will then be running these commands while chunks already exist - that can lead to quite odd chunk distribution, so beware.
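Hence it is worth verifying that the target collection is empty before you begin; a quick check (illustrative only, using the database and collection names assumed below):

// should return 0 - the pre-splitting below assumes an empty collection
db.getSiblingDB("users").userInfo.count();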
For the purposes of this answer, let's assume that the database will be called users and the collection userInfo. Let's also assume that the GUID will be written into the _id field. With those parameters we will connect to a mongos and run the following commands:
// first switch to the users DB
use users;
// now enable sharding for the users DB
sh.enableSharding("users");
// enable sharding on the relevant collection
sh.shardCollection("users.userInfo", {"_id" : 1});
// finally, disable the balancer (see below for options on a per-collection basis)
// this prevents migrations from kicking off and interfering with the splits by competing for metadata locks
sh.stopBalancer();
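At this point the balancer should report as off; a quick confirmation (an optional check, not part of the original sequence):

// reflects the balancer on/off flag, independent of any in-flight migration round
sh.getBalancerState();   // false after sh.stopBalancer()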
Now, per the calculations above, we need to split the GUID range into 2048 chunks. To do that we need at least 3 hex digits (16 ^ 3 = 4096), and we will be putting them in the most significant digits of the range (i.e. the 3 leftmost). Again, this should be run from a mongos shell:
// Simply use a for loop for each digit
for ( var x=0; x < 16; x++ ){
    for( var y=0; y<16; y++ ) {
        // for the innermost loop we will increment by 2 to get 2048 total iterations
        // make this z++ for 4096 - that would give ~30MB chunks based on the original figures
        for ( var z=0; z<16; z+=2 ) {
            // now construct the GUID with zeroes for padding - handily the toString method takes an argument to specify the base
            var prefix = "" + x.toString(16) + y.toString(16) + z.toString(16) + "00000000000000000000000000000";
            // finally, use the split command to create the appropriate chunk
            db.adminCommand( { split : "users.userInfo" , middle : { _id : prefix } } );
        }
    }
}
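As an illustrative aside (not part of the loop above), each constructed split point is a full 32-character hex string - the same length as a hex GUID - with the 3 significant digits up front:

var prefix = "" + (0).toString(16) + (0).toString(16) + (2).toString(16) + "00000000000000000000000000000";
print(prefix);          // 00200000000000000000000000000000
print(prefix.length);   // 32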
Once that is done, let's check on the state of play using the sh.status() helper:
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "version" : 3,
      "minCompatibleVersion" : 3,
      "currentVersion" : 4,
      "clusterId" : ObjectId("527056b8f6985e1bcce4c4cb")
  }
  shards:
      { "_id" : "shard0000", "host" : "localhost:30000" }
      { "_id" : "shard0001", "host" : "localhost:30001" }
      { "_id" : "shard0002", "host" : "localhost:30002" }
      { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
      { "_id" : "admin", "partitioned" : false, "primary" : "config" }
      { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
          users.userInfo
              shard key: { "_id" : 1 }
              chunks:
                  shard0001    2049
              too many chunks to print, use verbose if you want to force print
We have our 2048 chunks (plus one extra thanks to the min/max chunks), but they are all still on the original shard because the balancer is off. So, let's re-enable the balancer:
sh.startBalancer();
This will begin to balance out immediately, and it will be relatively quick because all the chunks are empty, but it will still take a little while (and much longer if it is competing with migrations from other collections).
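If you want to watch progress without re-running sh.status() repeatedly, the standard balancer helpers can be polled (an optional check):

sh.getBalancerState();     // true once re-enabled
sh.isBalancerRunning();    // false again once the empty chunks have all been moved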
Once some time has elapsed, run sh.status() again and there you (should) have it - 2048 chunks all nicely split out across the 4 shards and ready for an initial data load:
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "version" : 3,
      "minCompatibleVersion" : 3,
      "currentVersion" : 4,
      "clusterId" : ObjectId("527056b8f6985e1bcce4c4cb")
  }
  shards:
      { "_id" : "shard0000", "host" : "localhost:30000" }
      { "_id" : "shard0001", "host" : "localhost:30001" }
      { "_id" : "shard0002", "host" : "localhost:30002" }
      { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
      { "_id" : "admin", "partitioned" : false, "primary" : "config" }
      { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
          users.userInfo
              shard key: { "_id" : 1 }
              chunks:
                  shard0000    512
                  shard0002    512
                  shard0003    512
                  shard0001    513
              too many chunks to print, use verbose if you want to force print
      { "_id" : "test", "partitioned" : false, "primary" : "shard0002" }
You are now ready to start loading data, but to absolutely guarantee that no splits or migrations happen until your data load is complete, you need to do one more thing - turn off the balancer and auto-splitting for the duration of the import:
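Concretely (inferring the steps from the reversal described just below): stop the balancer, or disable balancing for just this collection, and restart each mongos you will load data through with the --noAutoSplit option, since auto-splitting is controlled at mongos startup in these versions. The shell side looks like this:

// stop all balancing again now that the empty chunks are distributed
sh.stopBalancer();
// or, to leave balancing running for other collections, disable it for this one only
sh.disableBalancing("users.userInfo");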
Once the import is complete, reverse the steps as needed (sh.startBalancer(), sh.enableBalancing("users.userInfo"), and restarting the mongos without --noAutoSplit) to return everything to the default settings.
**Update: Optimizing for Speed**
The approach above is fine if you are not in a hurry, but as things stand - and as you will discover if you test this out - the balancer is not very fast, even with empty chunks. Hence, the more chunks you create, the longer balancing will take. I have seen it take more than 30 minutes to finish balancing 2048 chunks, though this will vary by deployment.
That might be OK for testing, or for a relatively quiet cluster, but on a busy cluster it will be much harder to ensure the balancer stays off and no other updates interfere. So, how do we speed things up?
The answer is to make some manual moves early on, then split the chunks once they are on their respective shards. Note that this is only desirable with certain shard keys (like a randomly distributed UUID) or certain data access patterns, so be careful that you don't end up with poor data distribution as a result.
Using the example above, we have 4 shards, so rather than doing all the splits and then balancing, we split into 4 chunks instead. Then we put one chunk on each shard by moving them manually, and finally we split those chunks into the required number.
The ranges in the example above would look like this:
$min --> "40000000000000000000000000000000"
"40000000000000000000000000000000" --> "80000000000000000000000000000000"
"80000000000000000000000000000000" --> "c0000000000000000000000000000000"
"c0000000000000000000000000000000" --> $max
It's only 4 commands to create these, but since we have it, why not re-use the loop above in a simplified/modified form:
for ( var x=4; x < 16; x+=4 ){
    var prefix = "" + x.toString(16) + "0000000000000000000000000000000";
    db.adminCommand( { split : "users.userInfo" , middle : { _id : prefix } } );
}
Here is what things look like now - we have our 4 chunks, all on shard0001:
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "version" : 4,
      "minCompatibleVersion" : 4,
      "currentVersion" : 5,
      "clusterId" : ObjectId("53467e59aea36af7b82a75c1")
  }
  shards:
      { "_id" : "shard0000", "host" : "localhost:30000" }
      { "_id" : "shard0001", "host" : "localhost:30001" }
      { "_id" : "shard0002", "host" : "localhost:30002" }
      { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
      { "_id" : "admin", "partitioned" : false, "primary" : "config" }
      { "_id" : "test", "partitioned" : false, "primary" : "shard0001" }
      { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
          users.userInfo
              shard key: { "_id" : 1 }
              chunks:
                  shard0001    4
              { "_id" : { "$minKey" : 1 } } -->> { "_id" : "40000000000000000000000000000000" } on : shard0001 Timestamp(1, 1)
              { "_id" : "40000000000000000000000000000000" } -->> { "_id" : "80000000000000000000000000000000" } on : shard0001 Timestamp(1, 3)
              { "_id" : "80000000000000000000000000000000" } -->> { "_id" : "c0000000000000000000000000000000" } on : shard0001 Timestamp(1, 5)
              { "_id" : "c0000000000000000000000000000000" } -->> { "_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 6)
We will leave the $min chunk where it is and move the other three. You can do this programmatically, but it does depend on where the chunks initially reside, how you have named your shards, etc., so I will leave it as a manual step for now - it is not too onerous, just 3 moveChunk commands (a programmatic sketch of the same moves follows below):
mongos> sh.moveChunk("users.userInfo", {"_id" : "40000000000000000000000000000000"}, "shard0000")
{ "millis" : 1091, "ok" : 1 }
mongos> sh.moveChunk("users.userInfo", {"_id" : "80000000000000000000000000000000"}, "shard0002")
{ "millis" : 1078, "ok" : 1 }
mongos> sh.moveChunk("users.userInfo", {"_id" : "c0000000000000000000000000000000"}, "shard0003")
{ "millis" : 1083, "ok" : 1 }
Let's double check and make sure the chunks are where we expect them to be:
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "version" : 4,
      "minCompatibleVersion" : 4,
      "currentVersion" : 5,
      "clusterId" : ObjectId("53467e59aea36af7b82a75c1")
  }
  shards:
      { "_id" : "shard0000", "host" : "localhost:30000" }
      { "_id" : "shard0001", "host" : "localhost:30001" }
      { "_id" : "shard0002", "host" : "localhost:30002" }
      { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
      { "_id" : "admin", "partitioned" : false, "primary" : "config" }
      { "_id" : "test", "partitioned" : false, "primary" : "shard0001" }
      { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
          users.userInfo
              shard key: { "_id" : 1 }
              chunks:
                  shard0001    1
                  shard0000    1
                  shard0002    1
                  shard0003    1
              { "_id" : { "$minKey" : 1 } } -->> { "_id" : "40000000000000000000000000000000" } on : shard0001 Timestamp(4, 1)
              { "_id" : "40000000000000000000000000000000" } -->> { "_id" : "80000000000000000000000000000000" } on : shard0000 Timestamp(2, 0)
              { "_id" : "80000000000000000000000000000000" } -->> { "_id" : "c0000000000000000000000000000000" } on : shard0002 Timestamp(3, 0)
              { "_id" : "c0000000000000000000000000000000" } -->> { "_id" : { "$maxKey" : 1 } } on : shard0003 Timestamp(4, 0)
That matches the ranges we proposed above, so all looks good. Now run the original loop above to split the chunks "in place" on each shard, and once the loop finishes we should have a balanced distribution. One more sh.status() should confirm things:
mongos> for ( var x=0; x < 16; x++ ){
... for( var y=0; y<16; y++ ) {
... // for the innermost loop we will increment by 2 to get 2048 total iterations
... // make this z++ for 4096 - that would give ~30MB chunks based on the original figures
... for ( var z=0; z<16; z+=2 ) {
... // now construct the GUID with zeroes for padding - handily the toString method takes an argument to specify the base
... var prefix = "" + x.toString(16) + y.toString(16) + z.toString(16) + "00000000000000000000000000000";
... // finally, use the split command to create the appropriate chunk
... db.adminCommand( { split : "users.userInfo" , middle : { _id : prefix } } );
... }
... }
... }
{ "ok" : 1 }
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "version" : 4,
      "minCompatibleVersion" : 4,
      "currentVersion" : 5,
      "clusterId" : ObjectId("53467e59aea36af7b82a75c1")
  }
  shards:
      { "_id" : "shard0000", "host" : "localhost:30000" }
      { "_id" : "shard0001", "host" : "localhost:30001" }
      { "_id" : "shard0002", "host" : "localhost:30002" }
      { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
      { "_id" : "admin", "partitioned" : false, "primary" : "config" }
      { "_id" : "test", "partitioned" : false, "primary" : "shard0001" }
      { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
          users.userInfo
              shard key: { "_id" : 1 }
              chunks:
                  shard0001    513
                  shard0000    512
                  shard0002    512
                  shard0003    512
              too many chunks to print, use verbose if you want to force print
And there you have it - no waiting for the balancer, the distribution is already even.