python - Why does TextIOWrapper close the given BytesIO stream?

Question

If I run the following code in Python 3

from io import BytesIO
import csv
from io import TextIOWrapper


def fill_into_stringio(input_io):
    writer = csv.DictWriter(TextIOWrapper(input_io, encoding='utf-8'), fieldnames=['ids'])
    for i in range(100):
        writer.writerow({'ids': str(i)})

with BytesIO() as input_i:
    fill_into_stringio(input_i)
    input_i.seek(0)

I get an error:

ValueError: I/O operation on closed file.

Whereas if I don't use TextIOWrapper, the io stream stays open. For example, if I modify my function to

def fill_into_stringio(input_io):
    for i in range(100):
        input_io.write(b'erwfewfwef')

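I no longer get any error, so for some reason TextIOWrapper is closing the stream I want to read from afterwards. Is this the intended behavior, and is there a way to achieve what I am trying to do without writing the csv writer myself?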

score 8 · Accepted Answer

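The csv module is the odd one out here: a csv writer does not take ownership of the file object it is given, whereas most file-like objects that wrap other objects do assume ownership of the wrapped object and close it when they themselves are closed (or otherwise cleaned up). In your code the TextIOWrapper is a temporary referenced only by the writer, so once the function returns it is cleaned up and takes the BytesIO down with it.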
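The ownership behavior can be seen without csv at all. The following is a minimal sketch that assumes CPython, where dropping the last reference finalizes the wrapper immediately; other interpreters may defer the cleanup. The variable names are illustrative only.

```python
from io import BytesIO, TextIOWrapper

raw = BytesIO()
wrapper = TextIOWrapper(raw, encoding='utf-8')
wrapper.write('abc')

# Dropping the last reference lets CPython finalize the wrapper right away;
# the finalizer closes the wrapper, which in turn closes the wrapped BytesIO.
del wrapper

closed_after_cleanup = raw.closed
print(closed_after_cleanup)
```

This is the same thing that happens at the end of fill_into_stringio in the question, where the wrapper goes out of scope together with the writer.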

One way to avoid the problem is to explicitly detach from the TextIOWrapper before allowing it to be cleaned up:

def fill_into_stringio(input_io):
    # write_through=True prevents TextIOWrapper from buffering internally;
    # you could replace it with explicit flushes, but you want something 
    # to ensure nothing is left in the TextIOWrapper when you detach
    text_input = TextIOWrapper(input_io, encoding='utf-8', write_through=True)
    try:
        writer = csv.DictWriter(text_input, fieldnames=['ids'])
        for i in range(100):
            writer.writerow({'ids': str(i)})
    finally:
        text_input.detach()  # Detaches input_io so it won't be closed when text_input cleaned up
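To check that detaching really leaves the buffer usable, here is a small self-contained sketch. The name write_ids and the count parameter are hypothetical, and newline='' is an addition that is not in the snippet above; it is assumed here so that the csv line endings come out the same on every platform.

```python
import csv
from io import BytesIO, TextIOWrapper


def write_ids(binary_io, count):
    # Hypothetical helper: wrap the caller's binary stream only temporarily.
    text_io = TextIOWrapper(binary_io, encoding='utf-8', newline='',
                            write_through=True)
    try:
        writer = csv.DictWriter(text_io, fieldnames=['ids'])
        for i in range(count):
            writer.writerow({'ids': str(i)})
    finally:
        # Hand the stream back so cleanup of text_io cannot close it.
        text_io.detach()


with BytesIO() as buf:
    write_ids(buf, 3)
    still_open = not buf.closed
    buf.seek(0)
    data = buf.read()

print(still_open, data)
```

After the call the caller still owns an open BytesIO and can seek and read as usual.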

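The only other built-in way to avoid this applies to real file objects: you can pass open() a file descriptor together with closefd=False, and the resulting file object won't close the underlying file descriptor when it is closed or otherwise cleaned up.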
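As a rough illustration of closefd=False, the sketch below uses a temporary file created with tempfile.mkstemp (an assumption made only to obtain a real file descriptor) and shows that the descriptor is still usable after the text file object has been closed.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    # closefd=False: closing f flushes its buffer but leaves fd open.
    with open(fd, 'w', encoding='utf-8', newline='', closefd=False) as f:
        f.write('ids\n')

    # The descriptor is still valid, so it can be rewound and read directly.
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)
finally:
    os.close(fd)      # the caller remains responsible for closing fd
    os.remove(path)

print(data)
```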

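Of course, in your particular case there is a simpler way: just make your function expect text-based file-like objects and use them without rewrapping. Your function really shouldn't be responsible for imposing an encoding on the caller's output file (what if the caller wants UTF-16 output?).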

Then you can do:

import csv
from io import StringIO

def fill_into_stringio(input_io):
    writer = csv.DictWriter(input_io, fieldnames=['ids'])
    for i in range(100):
        writer.writerow({'ids': str(i)})

# newline='' is the Python 3 way to prevent line-ending translation
# while continuing to operate as text, and it's recommended for any file
# used with the csv module
with StringIO(newline='') as input_i:
    fill_into_stringio(input_i)
    input_i.seek(0)
    # If you really need UTF-8 bytes as output, you can make a BytesIO at this point with:
    # BytesIO(input_i.getvalue().encode('utf-8'))
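To make the point about leaving the encoding to the caller concrete, here is a minimal sketch in which the same text output is encoded two different ways after the fact. The helper name fill_ids and its count parameter are hypothetical and not part of the original code.

```python
import csv
from io import BytesIO, StringIO


def fill_ids(text_io, count):
    # Hypothetical helper: writes to whatever text stream the caller provides.
    writer = csv.DictWriter(text_io, fieldnames=['ids'])
    for i in range(count):
        writer.writerow({'ids': str(i)})


with StringIO(newline='') as text_buf:
    fill_ids(text_buf, 3)
    text = text_buf.getvalue()

# The caller, not the helper, decides how the text becomes bytes.
utf8_bytes = BytesIO(text.encode('utf-8'))
utf16_bytes = BytesIO(text.encode('utf-16'))

print(utf8_bytes.getvalue())
```

Because the helper never wraps or encodes anything, no stream is closed behind the caller's back.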
