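You can use MongoDB map/reduce to "help" migrate the data, but unfortunately you cannot use it for a fully server-side migration. You are on the right track. The basic idea is:
- Map each comment to emit(post_id, {comment_count: 1}) ---> {_id: post_id, value: {comment_count: 1}}
- Reduce to the value {comment_count: N}, where N is the sum of the counts ---> {_id: post_id, value: {comment_count: N}}
- Specify the output option {reduce: 'posts'} to reduce the map/reduce comment_count results back into the posts collection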
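The map and reduce steps in the list above can be tried out in plain Ruby, without a MongoDB connection. This is only a rough simulation of the semantics: the sample comments and the names `map_comment` and `reduce_counts` are made up for illustration, and the real functions run as JavaScript on the server.

```ruby
# Plain-Ruby simulation of the map and reduce steps; no MongoDB is involved.
# The sample data and the lambda names are hypothetical.
comments = [
  { post_id: 'p1', text: 'a' },
  { post_id: 'p1', text: 'b' },
  { post_id: 'p2', text: 'c' }
]

# Map: every comment emits the pair (post_id, {comment_count: 1}).
map_comment = ->(comment) { [comment[:post_id], { comment_count: 1 }] }

# Reduce: sum the partial counts emitted for one key.
reduce_counts = lambda do |_key, values|
  { comment_count: values.sum { |v| v[:comment_count] } }
end

emitted = comments.map(&map_comment)
grouped = emitted.group_by(&:first)

# Build result documents in the {_id: KEY, value: VALUE} shape.
results = grouped.map do |post_id, pairs|
  { _id: post_id, value: reduce_counts.call(post_id, pairs.map(&:last)) }
end

results.each { |doc| p doc }
```

Each result carries only the key and the reduced value, which is the shape the rest of this answer is concerned with.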
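After some extensive investigation, I found that you can get close, but one problem prevents a fully server-side migration. The result of reduce has the shape {_id: KEY, value: MAP_REDUCE_VALUE}. We are stuck with this shape for now, and there appears to be no way around it. So you can neither get the complete original document outside of this shape as input to reduce (in fact, you will lose data outside of this shape), nor update the document outside of this shape as a result of reduce. The "final" update to your posts collection therefore has to be done programmatically through a client. Fixing this looks like a good candidate for a feature request.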
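To make the shape constraint concrete, here is a plain-Ruby illustration that does not talk to MongoDB at all. It simply assumes, as described above, that a document written by map/reduce contains nothing but `_id` and `value`. The `store_reduce_output` helper and the sample post are hypothetical.

```ruby
# A post as it might be stored before the migration (hypothetical sample).
post = { '_id' => 'p1', 'text' => 'post_0', 'comment_count' => nil }

# Hypothetical stand-in for how map/reduce writes a result document:
# the document is built solely from the key and the reduced value.
def store_reduce_output(key, value)
  { '_id' => key, 'value' => value }
end

written = store_reduce_output(post['_id'], { 'comment_count' => 3 })

p written               # the count sits under "value", not at the top level
p written.key?('text')  # the original "text" field is not carried over
```

Under that assumption, the count never lands in a top-level comment_count field and the other post fields are not preserved, which is why the last step has to happen on the client.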
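Below is a working example that demonstrates using MongoDB map/reduce in Ruby to calculate all of the comment_counts. I then use the map_reduce_results collection programmatically to update comment_count in the posts collection. The reduce function has been stripped of leftovers from the attempt at using {reduce: 'posts'}.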
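Before the full test, the client-side update step can be looked at in isolation. The sketch below uses in-memory hashes in place of the posts and map_reduce_results collections, so it runs without Mongoid. The `apply_counts` helper and the sample ids are hypothetical. The counts are written as floats because that is how they appear in the result documents in the test output, hence the `to_i`.

```ruby
# In-memory stand-ins; in the real migration these are MongoDB collections.
posts = {
  'p1' => { 'text' => 'post_0', 'comment_count' => nil },
  'p2' => { 'text' => 'post_1', 'comment_count' => nil }
}
map_reduce_results = [
  { '_id' => 'p1', 'value' => { 'comment_count' => 2.0 } },
  { '_id' => 'p2', 'value' => { 'comment_count' => 3.0 } }
]

# Hypothetical helper: copy each reduced count onto the matching post,
# converting the floating-point count to an Integer.
def apply_counts(posts, results)
  results.each do |doc|
    post = posts.fetch(doc['_id'])
    post['comment_count'] = doc['value']['comment_count'].to_i
  end
  posts
end

apply_counts(posts, map_reduce_results)
posts.each { |id, post| puts "#{id}: #{post['comment_count']}" }
```

Only comment_count is touched here, so the other fields of each post stay intact, which is the part the server-side {reduce: 'posts'} output could not do.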
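You can verify my answer with a little experimentation, or, if you like, I can post the completely non-working server-side attempt on request, complete with the fixed model. I hope this helps in understanding MongoDB map/reduce in Ruby.
test/unit/comment_test.rb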
require 'test_helper'

class CommentTest < ActiveSupport::TestCase
  def setup
    @map_reduce_results_name = 'map_reduce_results'
    delete_all
  end

  def delete_all
    Post.delete_all
    Comment.delete_all
    Mongoid.database.drop_collection(@map_reduce_results_name)
  end

  def dump(title = nil)
    yield
    puts title
    Post.all.to_a.each do |post|
      puts "#{post.to_json} #{post.comments.collect(&:text).to_json}"
    end
  end

  def generate
    (2+rand(2)).times do |p|
      post = Post.create(text: 'post_' + p.to_s)
      comments = (2+rand(3)).times.collect do |c|
        Comment.create(text: "post_#{p} comment_#{c}")
      end
      post.comments = comments
    end
  end

  def generate_and_migrate(title = nil)
    dump(title + ' generate:') { generate }
    dump(title + ' migrate:') { yield }
  end

  test "map reduce migration" do
    generate_and_migrate('programmatic') do
      Post.all.each do |p|
        p.update_attribute :comment_count, p.comments.count
      end
    end
    delete_all
    generate_and_migrate('map/reduce') do
      map = "function() { emit( this.post_id, {comment_count: 1} ); }"
      reduce = <<-EOF
        function(key, values) {
          var result = {comment_count: 0};
          values.forEach(function(value) { result.comment_count += value.comment_count; });
          return result;
        }
      EOF
      out = @map_reduce_results_name #{reduce: 'posts'}
      result_coll = Comment.collection.map_reduce(map, reduce, out: out)
      puts "#{@map_reduce_results_name}:"
      result_coll.find.each do |doc|
        p doc
        Post.find(doc['_id']).update_attribute :comment_count, doc['value']['comment_count'].to_i
      end
    end
  end
end
Test output (sorry for mixing JSON and Ruby inspect)
Run options: --name=test_map_reduce_migration
# Running tests:
programmatic generate:
{"_id":"4fcae3bde4d30b21e2000001","comment_count":null,"text":"post_0"} ["post_0 comment_0","post_0 comment_1","post_0 comment_2"]
{"_id":"4fcae3bde4d30b21e2000005","comment_count":null,"text":"post_1"} ["post_1 comment_1","post_1 comment_0","post_1 comment_2","post_1 comment_3"]
{"_id":"4fcae3bde4d30b21e200000a","comment_count":null,"text":"post_2"} ["post_2 comment_1","post_2 comment_3","post_2 comment_0","post_2 comment_2"]
programmatic migrate:
{"_id":"4fcae3bde4d30b21e2000001","comment_count":3,"text":"post_0"} ["post_0 comment_0","post_0 comment_1","post_0 comment_2"]
{"_id":"4fcae3bde4d30b21e2000005","comment_count":4,"text":"post_1"} ["post_1 comment_1","post_1 comment_0","post_1 comment_2","post_1 comment_3"]
{"_id":"4fcae3bde4d30b21e200000a","comment_count":4,"text":"post_2"} ["post_2 comment_1","post_2 comment_3","post_2 comment_0","post_2 comment_2"]
map/reduce generate:
{"_id":"4fcae3bee4d30b21e200000f","comment_count":null,"text":"post_0"} ["post_0 comment_0","post_0 comment_1"]
{"_id":"4fcae3bee4d30b21e2000012","comment_count":null,"text":"post_1"} ["post_1 comment_2","post_1 comment_0","post_1 comment_1"]
{"_id":"4fcae3bee4d30b21e2000016","comment_count":null,"text":"post_2"} ["post_2 comment_0","post_2 comment_1","post_2 comment_2","post_2 comment_3"]
map_reduce_results:
{"_id"=>BSON::ObjectId('4fcae3bee4d30b21e200000f'), "value"=>{"comment_count"=>2.0}}
{"_id"=>BSON::ObjectId('4fcae3bee4d30b21e2000012'), "value"=>{"comment_count"=>3.0}}
{"_id"=>BSON::ObjectId('4fcae3bee4d30b21e2000016'), "value"=>{"comment_count"=>4.0}}
map/reduce migrate:
{"_id":"4fcae3bee4d30b21e200000f","comment_count":2,"text":"post_0"} ["post_0 comment_0","post_0 comment_1"]
{"_id":"4fcae3bee4d30b21e2000012","comment_count":3,"text":"post_1"} ["post_1 comment_2","post_1 comment_0","post_1 comment_1"]
{"_id":"4fcae3bee4d30b21e2000016","comment_count":4,"text":"post_2"} ["post_2 comment_0","post_2 comment_1","post_2 comment_2","post_2 comment_3"]
.
Finished tests in 0.072870s, 13.7231 tests/s, 0.0000 assertions/s.
1 tests, 0 assertions, 0 failures, 0 errors, 0 skips