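I have this coroutine, which is meant to accept data and then send it on to the next coroutine in the chain, but in chunks of length blocksize. Since strings are immutable, I think the string appending I'm doing is inefficient, because every append creates a new string object.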

Since this is just "glue" between the parts of these chains that do the real work, it would be nice to make it as slick as possible.

def chunker(target, blocksize=DEFAULT_BLOCK_SIZE):
    buffer = ""
    target_send = target.send
    while True:
        try:
            input_data = yield
            buffer += input_data  # creates new string object every time
            buffer_len = len(buffer)
            if buffer_len >= blocksize:
                chunks, leftover = divmod(buffer_len, blocksize)
                for i in xrange(0, chunks*blocksize, blocksize):
                    target_send(buffer[i:i+blocksize])
                buffer = buffer[-leftover:] if leftover else ""
        except CleanUp:
            if buffer:
                target_send(buffer)
            target_send("")

How can I improve this? Or better yet, is there a simpler way to achieve it?

1 Answer

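One option is to keep a list of the incoming pieces and only ''.join() them once the accumulated length reaches blocksize, which should be more efficient than repeated string concatenation. For example (untested):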

def chunker(target, blocksize=DEFAULT_BLOCK_SIZE):
    data = []
    buffer = ''
    buffer_len = 0
    target_send = target.send
    while True:
        try:
            input_data = yield
            data.append(input_data)
            buffer_len += len(input_data)
            if buffer_len >= blocksize:
                buffer = ''.join(data)
                chunks, leftover = divmod(buffer_len, blocksize)
                for i in xrange(0, chunks*blocksize, blocksize):
                    target_send(buffer[i:i+blocksize])
                buffer = buffer[-leftover:] if leftover else ""
                buffer_len = len(buffer)
                data = [buffer] if buffer else []
        except CleanUp:
            buffer = ''.join(data)
            if buffer:
                target_send(buffer)
            target_send("")
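
For anyone who wants to try the list-and-join idea end to end, here is a self-contained Python 3 sketch of it. It is not the code above verbatim: `CleanUp`, the `coroutine` priming decorator, the `collector` sink and the tiny block size are stand-ins invented for the demo (the question does not show its own definitions), and this version also clears its state after a `CleanUp`, which is an assumption about the intended behavior.

```python
class CleanUp(Exception):
    """Hypothetical end-of-stream signal, thrown into the coroutine."""


def coroutine(func):
    """Prime a generator-based coroutine so it is ready for send()."""
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first yield
        return gen
    return start


@coroutine
def collector(out):
    """Demo sink: append everything it receives to the list `out`."""
    while True:
        out.append((yield))


@coroutine
def chunker(target, blocksize=4):
    parts = []  # pieces received since the last flush
    size = 0    # total length of the pieces in `parts`
    while True:
        try:
            data = yield
            parts.append(data)
            size += len(data)
            if size >= blocksize:
                buffer = "".join(parts)  # one join instead of many +=
                full = size - size % blocksize
                for i in range(0, full, blocksize):
                    target.send(buffer[i:i + blocksize])
                leftover = buffer[full:]
                parts = [leftover] if leftover else []
                size = len(leftover)
        except CleanUp:
            buffer = "".join(parts)
            if buffer:
                target.send(buffer)  # flush the short final block
            target.send("")          # empty string marks end of data
            parts, size = [], 0      # assumption: start fresh afterwards


received = []
pipe = chunker(collector(received), blocksize=4)
for piece in ["ab", "cdefg", "hij"]:
    pipe.send(piece)
pipe.throw(CleanUp)
# received == ['abcd', 'efgh', 'ij', '']
```

The join only happens when at least one full block is available, so a stream of many small inputs costs one list append each rather than one string copy each.
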
answered 2012-12-12T19:45:42.403