我正忙于编写一个迁移,这将使我们能够将我们的 yamler 从 Syck 迁移到 Psych,并最终将我们的项目升级到 ruby 2。不过,这种迁移将非常耗费资源,因此我将需要使用分块。
我编写了以下方法来确认我计划使用的迁移结果产生了预期的结果,并且可以在不停机的情况下完成。为了避免活动记录自动执行序列化,我需要使用ActiveRecord::Base.connection.execute
我描述转换的方法如下
def show_summary(table, column_name)
a = ActiveRecord::Base.connection.execute <<-SQL
SELECT id, #{column_name} FROM #{table}
SQL
all_rows = a.to_a; ""
problem_rows = all_rows.select do |row|
original_string = Syck.dump(Syck.load(row[1]))
orginal_object = Syck.load(original_string)
new_string = Psych.dump(orginal_object)
new_object = Syck.load(new_string)
Syck.dump(new_object) != original_string rescue true
end
problem_rows.map do |row|
old_string = Syck.dump(Syck.load(row[1]))
new_string = Psych.dump(Syck.load(old_string)) rescue "Parse failure"
roundtrip_string = begin
Syck.dump(Syck.load(new_string))
rescue => e
e.message
end
new_row = {}
new_row[:id] = row[0]
new_row[:original_encoding] = old_string
new_row[:new_encoding] = roundtrip_string
new_row
end
end
使用时如何使用批处理 ActiveRecord::Base.connection.execute
?
为了完整起见,我的更新功能如下
# Migrate the given serialized YAML column from Syck to Psych
# (if any).
def migrate_to_psych(table, column)
table_name = ActiveRecord::Base.connection.quote_table_name(table)
column_name = ActiveRecord::Base.connection.quote_column_name(column)
fetch_data(table_name, column_name).each do |row|
transformed = ::Psych.dump(convert(Syck.load(row[column])))
ActiveRecord::Base.connection.execute <<-SQL
UPDATE #{table_name}
SET #{column_name} = #{ActiveRecord::Base.connection.quote(transformed)}
WHERE id = #{row['id']};
SQL
end
end
def fetch_data(table_name, column_name)
ActiveRecord::Base.connection.select_all <<-SQL
SELECT id, #{column_name}
FROM #{table_name}
WHERE #{column_name} LIKE '---%'
SQL
end
我从http://fossies.org/linux/openproject/db/migrate/migration_utils/legacy_yamler.rb得到的