import contextlib
import gzip
import s3fs
AWS_S3 = s3fs.S3FileSystem(anon=False) # AWS env must be set up correctly
source_file_path = "/tmp/your_file.txt"
s3_file_path = "my-bucket/your_file.txt.gz"
with contextlib.ExitStack() as stack:
    source_file = stack.enter_context(open(source_file_path, mode="rb"))
    destination_file = stack.enter_context(AWS_S3.open(s3_file_path, mode="wb"))
    destination_file_gz = stack.enter_context(gzip.GzipFile(fileobj=destination_file))
    while True:
        chunk = source_file.read(1024)
        if not chunk:
            break
        destination_file_gz.write(chunk)
I'm trying to run something similar in an AWS Lambda function, but it raises an error because the s3fs module can't be installed there. Also, I'm using boto in the rest of my code, so I'd like to stick with boto. How can I do the same thing with boto?
Basically, I'm opening/reading a file from "/tmp/path", gzipping it, and then saving it to an S3 bucket.
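For reference, here is a minimal sketch of what a boto3-based version might look like; the bucket name my-bucket, the key your_file.txt.gz, and the local path are placeholders, and the compressed data is staged in an in-memory io.BytesIO buffer before being passed to upload_fileobj:
import gzip
import io
import boto3

source_file_path = "/tmp/your_file.txt"
bucket_name = "my-bucket"        # placeholder bucket name
s3_key = "your_file.txt.gz"      # placeholder object key

s3_client = boto3.client("s3")

# Compress the local file into an in-memory buffer.
buffer = io.BytesIO()
with open(source_file_path, mode="rb") as source_file:
    with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
        while True:
            chunk = source_file.read(1024)
            if not chunk:
                break
            gz.write(chunk)

# Rewind the buffer and upload it as a single S3 object.
buffer.seek(0)
s3_client.upload_fileobj(buffer, bucket_name, s3_key)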
Edit:
import contextlib
import gzip
import io
import boto3

s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket('testunzipping')
s3_filename = 'samplefile.csv.'
for i in testList:
    # zip_ref.open(i, 'r')
    with contextlib.ExitStack() as stack:
        source_file = stack.enter_context(open(i, mode="rb"))
        destination_file = io.BytesIO()
        destination_file_gz = stack.enter_context(gzip.GzipFile(fileobj=destination_file, mode='wb'))
        while True:
            chunk = source_file.read(1024)
            if not chunk:
                break
            destination_file_gz.write(chunk)
    # Leaving the ExitStack closes the gzip member and flushes its trailer into
    # destination_file; only after that is the buffer safe to rewind and upload.
    destination_file.seek(0)
    fileName = i.replace("/tmp/DataPump_10000838/", "")
    bucket.upload_fileobj(destination_file, fileName)
Each item in testList looks like this: "/tmp/your_file.txt"
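For illustration, with a hypothetical entry such as "/tmp/DataPump_10000838/samplefile.csv", the key derivation above behaves like this; os.path.basename gives the same result without hard-coding the directory prefix:
import os

# Hypothetical testList entry, assuming files live under the prefix used in replace().
item = "/tmp/DataPump_10000838/samplefile.csv"

fileName = item.replace("/tmp/DataPump_10000838/", "")  # -> "samplefile.csv"
fileName = os.path.basename(item)                       # equivalent, no hard-coded prefix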