If I have a file, say an image, and I try to divide it into 14xx-byte chunks by reading 1 byte at a time through a generator and joining the bytes together in a variable, why is the resulting variable not 14xx bytes? Is it because of how Python handles the variable internally? If so, what are some possible ways to test whether I actually have 14xx bytes of data, besides having my create_data function return another indicator?

def split_file(self, filename):
    with open(filename, "rb") as f:
        while True:
            byte = f.read(1)
            if not byte:
                break
            yield(byte)


def create_data(self):
    for x in range (1, 1472):
        next_byte = split_file.filename

        if not next_byte :
            break
        else: 
            msg = msg + split_file(self.filename)
    return msg

curr_data = self.create_data
    while sys.getsizeof(curr_data) == 1472: 
        # do something with curr_data

Thanks in advance

3 Answers

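You want len(), not sys.getsizeof(). sys.getsizeof() includes the overhead of the Python object. You will also notice that it gives "strange" behavior (that is, probably not what you expect) on containers such as lists: it counts the memory used by the container, not the objects in it.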
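
A minimal sketch of the difference, assuming CPython (the exact overhead reported by sys.getsizeof() varies by implementation and version, so only the comparison is shown). The names chunk and parts are illustrative and not from the question:

```python
import sys

# Stand-in for 1472 bytes read from a file.
chunk = b"\x00" * 1472

payload_size = len(chunk)            # number of bytes of data
object_size = sys.getsizeof(chunk)   # data plus per-object bookkeeping

print(payload_size)                  # 1472
print(object_size > payload_size)    # True on CPython

# For a container, getsizeof() measures the list itself and does not
# follow the references to the objects stored in it.
parts = [chunk, chunk]
container_size = sys.getsizeof(parts)
print(container_size < 2 * len(chunk))
```

So a loop condition such as `len(curr_data) == 1472` tests the amount of data, while the sys.getsizeof() comparison tests something else.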

answered 2013-10-27T18:03:01.360

You may want to use a generator that actually reads the file in chunks of the size you want:

def split_file(self, filename, size=1472):
    with open(filename, "rb") as f:
        while True:
            buf = f.read(size)
            if not buf:
                break
            yield buf

If you do this, you no longer need the 1472 calls to split_file and the 1472 string appends that you currently have in create_data.

Then you can do this:

for chunk in split_file(self.filename, self.size):
    # if you want to discard the last chunk if len is less than size:
    if len(chunk)<self.size:
        break

    #otherwise, deal with a chunk:
    ...
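
To check the chunk sizes, here is a self-contained sketch of the same idea as a plain function. The name read_chunks and the 3000-byte temporary file are made up for the demonstration; they are not from the question:

```python
import os
import tempfile


def read_chunks(path, size=1472):
    """Yield successive blocks of at most `size` bytes from the file at `path`."""
    with open(path, "rb") as f:
        while True:
            buf = f.read(size)
            if not buf:
                break
            yield buf


# Build a throwaway 3000-byte file to read back in chunks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 3000)
    demo_path = tmp.name

try:
    sizes = [len(chunk) for chunk in read_chunks(demo_path)]
finally:
    os.remove(demo_path)

print(sizes)  # [1472, 1472, 56]
```

Only the last chunk can be shorter than the requested size, which is why a single len() check is enough to detect the tail.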
answered 2013-10-27T19:42:14.827

For cases like this I just use a function called super_len, which works on everything. I took it from the utils.py file of requests:

import io
import os

def super_len(o):
    if hasattr(o, '__len__'):
        return len(o)

    if hasattr(o, 'len'):
        return o.len

    if hasattr(o, 'fileno'):
        try:
            fileno = o.fileno()
        except io.UnsupportedOperation:
            pass
        else:
            return os.fstat(fileno).st_size

    if hasattr(o, 'getvalue'):
        # e.g. BytesIO, cStringIO.StringI
        return len(o.getvalue())

As kindall said, you need to use len rather than sys.getsizeof. super_len has worked for every case I have run into.
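
As a rough illustration of the three kinds of size lookup that super_len relies on, the sketch below applies each one directly to 1472 bytes of made-up data. The variable names are hypothetical; only standard-library calls are used:

```python
import io
import os
import tempfile

data = b"x" * 1472

# bytes objects define __len__, so len() is enough.
n_bytes = len(data)

# An in-memory stream has no real file descriptor, so its size
# comes from the buffer returned by getvalue().
stream = io.BytesIO(data)
n_stream = len(stream.getvalue())

# A real file reports its size through os.fstat() on its descriptor.
with tempfile.TemporaryFile() as f:
    f.write(data)
    f.flush()  # make sure the bytes reach the file before asking for its size
    n_file = os.fstat(f.fileno()).st_size

print(n_bytes, n_stream, n_file)  # 1472 1472 1472
```

For the chunks in the question, which are plain bytes objects, the first case applies and len() alone is sufficient.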

answered 2013-10-27T18:09:31.767