python - Writing files and objects to Amazon S3

Question

I am using Amazon S3 to distribute dynamically generated files.

On the local server, I can use

destination = open(VIDEO_DIR + newvideo.name, 'wb+')

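to store the generated video at the location VIDEO_DIR.newvideo.name

Is there a viable way to change VIDEO_DIR to an S3 endpoint location, so that the dynamically generated video can be written directly to the S3 server?

A second question: is there a viable way to write an object directly to S3? For example, given chunklet = Chunklet(), how can I write this chunklet object directly to the S3 server?

I can first create a local file and then use the S3 API. For example,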

mime = mimetypes.guess_type(filename)[0]
k = Key(b)
k.key = filename
k.set_metadata("Content-Type", mime)
k.set_contents_from_filename(filename)
k.set_acl('public-read')

But I would like something more efficient. I am using Python.

score 4 · Accepted Answer

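Use the boto library to access your S3 storage. However, you still have to write your data to a (temporary) file first before you can send it, because the stream-writing methods have not yet been implemented.

I would use a context manager to work around this limitation: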

import tempfile
from contextlib import contextmanager

@contextmanager
def s3upload(key):
    with tempfile.SpooledTemporaryFile(max_size=1024*10) as buffer:  # Size in bytes
        yield buffer  # After this, the file is typically written to
        buffer.seek(0)  # So that reading the file starts from its beginning
        key.set_contents_from_file(buffer)

Use it as a context-managed file object:

k = Key(b)
k.key = filename
k.set_metadata("Content-Type", mime)

with s3upload(k) as out:
    out.write(chunklet)
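
The out.write(chunklet) call above only works if chunklet is already a byte string. For an arbitrary object, one option is to serialize it before writing. The following is a minimal Python 3 sketch under stated assumptions: FakeKey is a hypothetical stand-in for boto.s3.key.Key so that the example runs without S3 access, the Chunklet class and its attributes are invented for illustration (the question does not define them), and JSON is just one possible serialization format (pickle would be another).

```python
import json
import tempfile
from contextlib import contextmanager


class FakeKey:
    """Hypothetical stand-in for boto.s3.key.Key; it only records the upload."""

    def __init__(self):
        self.uploaded = None

    def set_contents_from_file(self, fp):
        self.uploaded = fp.read()


class Chunklet:
    """Hypothetical object type; the question does not define Chunklet."""

    def __init__(self, index, frames):
        self.index = index
        self.frames = frames


@contextmanager
def s3upload(key):
    # Same helper as in the answer above.
    with tempfile.SpooledTemporaryFile(max_size=1024 * 10) as buffer:
        yield buffer
        buffer.seek(0)  # rewind so the upload reads from the start
        key.set_contents_from_file(buffer)


chunklet = Chunklet(index=0, frames=[1, 2, 3])
key = FakeKey()

with s3upload(key) as out:
    # Assumption: the object's attributes are JSON-serializable.
    out.write(json.dumps(vars(chunklet)).encode("utf-8"))

restored = json.loads(key.uploaded.decode("utf-8"))
```

With a real boto Key in place of FakeKey, the bytes written inside the with block would be what ends up in the S3 object.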

score 1 · Answer

Martijn's solution is great but it forces you to use the file in a context manager (you can't do out = s3upload(…) and print >> out, "Hello"). The following solution works similarly (in-memory storage up until a certain size), but works both as a context manager and as a regular file (you can do both with S3WriteFile(…) and out = S3WriteFile(…); print >> out, "Hello"; out.close()):

import tempfile
import os

class S3WriteFile(object):
    """
    File-like context manager that can be written to (and read from),
    and which is automatically copied to Amazon S3 upon closing and deletion.
    """

    def __init__(self, item, max_size=10*1024**2):
        """
        item -- boto.s3.key.Key for writing the file (upon closing). The
        name of the object is set to the item's name (key).

        max_size -- maximum size in bytes of the data that will be
        kept in memory when writing to the file. If more data is
        written, it is automatically rolled over to a file.
        """

        self.item = item

        temp_file = tempfile.SpooledTemporaryFile(max_size)

        # It would be useless to set the .name attribute of this
        # object: when using it as a context manager, the temporary
        # file is returned, which has a None name:
        temp_file.name = os.path.join(
            "s3://{}".format(item.bucket.name),
            item.name if item.name is not None else "<???>")

        self.temp_file = temp_file

    def close(self):
        self.temp_file.seek(0)
        self.item.set_contents_from_file(self.temp_file)
        self.temp_file.close()

    def __del__(self):
        """
        Write the file contents to S3.
        """
        # The file may have been closed before being deleted:
        if not self.temp_file.closed:
            self.close()

    def __enter__(self):
        return self.temp_file

    def __exit__(self, *args, **kwargs):
        self.close()
        return False

    def __getattr__(self, name):
        """
        Everything not specific to this class is delegated to the
        temporary file, so that objects of this class behave like a
        file.
        """
        return getattr(self.temp_file, name)

(Implementation note: instead of delegating many things to self.temp_file so that the resulting class behaves like a file, inheriting from SpooledTemporaryFile would in principle work. However, this is an old-style class, so __new__() is not called, and, as far as I can see, a non-default in-memory size for the temporary data cannot be set.)
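
For readers on Python 3, the two usage styles can be tried out with the simplified sketch below. It is an adaptation, not the code above: FakeKey is a hypothetical stand-in for boto.s3.key.Key, the .name assignment and __del__ hook are left out for brevity, and close() is guarded so that calling it twice is harmless.

```python
import tempfile


class FakeKey:
    """Hypothetical stand-in for boto.s3.key.Key; it only records the upload."""

    def __init__(self):
        self.uploaded = None

    def set_contents_from_file(self, fp):
        self.uploaded = fp.read()


class S3WriteFile:
    """Simplified variant: uploads the buffered data when closed."""

    def __init__(self, item, max_size=10 * 1024**2):
        self.item = item
        self.temp_file = tempfile.SpooledTemporaryFile(max_size=max_size)

    def close(self):
        if self.temp_file.closed:  # already uploaded and closed
            return
        self.temp_file.seek(0)
        self.item.set_contents_from_file(self.temp_file)
        self.temp_file.close()

    def __enter__(self):
        return self.temp_file

    def __exit__(self, *exc_info):
        self.close()
        return False

    def __getattr__(self, name):
        # Delegate write(), read(), seek(), ... to the temporary file.
        return getattr(self.temp_file, name)


# Style 1: as a context manager.
key_a = FakeKey()
with S3WriteFile(key_a) as out:
    out.write(b"Hello")

# Style 2: as a regular file object that is closed explicitly.
key_b = FakeKey()
out = S3WriteFile(key_b)
out.write(b"Hello")
out.close()
```

Both styles hand the same bytes to the key's set_contents_from_file method; the upload happens at the end of the with block in the first case and at the explicit close() in the second.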
