
I wrote a script to copy S3 objects from my production S3 bucket to my development bucket, but it takes a very long time to run because I check each object individually for existence before copying it. Is there a way to diff the two buckets and copy only the objects I need? Or to copy the bucket as a whole?

Here's what I have so far:

count = 0
puts "COPYING FROM #{prod_bucket} to #{dev_bucket}"
bm = Benchmark.measure do
  AWS::S3.new.buckets[prod_bucket].objects.each do |o|
    # one HEAD request per object -- this is what makes the loop so slow
    exists = AWS::S3.new.buckets[dev_bucket].objects[o.key].exists?

    if exists
      puts "Skipping: #{o.key}"
    else
      puts "Copy: #{o.key} (#{count})"
      o.copy_to(o.key, :bucket_name => dev_bucket, :acl => :public_read)
      count += 1
    end
  end
end
puts "Copied #{count} objects in #{bm.real}s"

1 Answer


I've never used that gem, but from your code it looks like you can retrieve an array of all the keys stored in a bucket. Load that list for both buckets and determine the missing files with simple array operations. That should be much faster.
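To make the array-operations idea concrete, here is a minimal, self-contained sketch of the diff logic on plain key lists (no AWS calls; the file names are made up for illustration):

```ruby
# Keys currently in each bucket (illustrative values)
source_files = ["logo.png", "app.css", "index.html"]
target_files = ["app.css", "old.html"]

# In prod but not in dev -> these need copying
files_to_copy = source_files - target_files
puts files_to_copy.inspect   # => ["logo.png", "index.html"]

# In dev but no longer in prod -> these are stale
files_to_remove = target_files - source_files
puts files_to_remove.inspect # => ["old.html"]
```

Ruby's `Array#-` keeps the receiver's order and needs no sorting, so two listings plus two subtractions replace the per-object existence checks.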

# load file lists (the SDK pages through objects in batches of 1000)
s3 = AWS::S3.new
source_files = s3.buckets[prod_bucket].objects.map(&:key)
target_files = s3.buckets[dev_bucket].objects.map(&:key)

# determine files missing in dev
files_to_copy = source_files - target_files
files_to_copy.each_with_index do |file_name, i|
  puts "Copying #{i + 1}/#{files_to_copy.size}: #{file_name}"

  s3.buckets[prod_bucket].objects[file_name]
    .copy_to(file_name, :bucket_name => dev_bucket, :acl => :public_read)
end

# determine files on dev that no longer exist on prod
files_to_remove = target_files - source_files
files_to_remove.each_with_index do |file_name, i|
  puts "Removing #{i + 1}/#{files_to_remove.size}: #{file_name}"

  s3.buckets[dev_bucket].objects[file_name].delete
end
answered 2013-10-01T06:49:38.983