python-3.x - Python encode/decode error with 'cp866'

Question

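I'm using Python 3.6.5 and trying to extract some information from a CSV file. The file is written in Russian, so I need to decode it with 'cp866'. However, I can't get the right output.

Here is the code I'm using: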

def printcsv():
    with open('vocabulary.csv',newline='') as f:
      reader = csv.reader(f)
      for row in reader:
          #store in array
          print(row.decode('cp866'))

Here is the error I get:

"/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position 0: ordinal not in range(128)

score 0 · Accepted Answer

Well, that is not the right way to read an encoded CSV file. Here is what your code tries to do:

with open('vocabulary.csv',newline='') as f: # open the file with default system encoding
  reader = csv.reader(f)                     # declare a reader on it
  for row in reader:                         # here comes the problem

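I assume your system uses ASCII as its default encoding. So when the reader tries to load a row, a line is read from the file as bytes and decoded into a string with that default ASCII encoding. The byte 0xa7 is outside the ASCII range, which is why the UnicodeDecodeError is raised inside the loop header, before your own decode call is ever reached.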
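The failure in the traceback can be reproduced without any file. This is a minimal sketch, assuming the offending byte is 0xa7 as the error message reports; the variable names are only illustrative.

```python
import locale

# Encoding that open() falls back to when no encoding= argument is given.
default_encoding = locale.getpreferredencoding(False)
print("default encoding:", default_encoding)

raw = b"\xa7"  # the byte reported in the traceback

# Decoding as ASCII fails because 0xa7 is outside the 0-127 range.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc
print("ascii:", ascii_error)

# Decoding as cp866 succeeds: 0xa7 is a lowercase Cyrillic letter there.
decoded = raw.decode("cp866")
# ascii() keeps the output printable even on an ASCII-only terminal.
print("cp866:", ascii(decoded))
```

If the first line prints something other than cp866, any Russian text read through a plain open() call will be decoded with the wrong table or fail outright.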

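In any case, row is a list, not a string, so row.decode would raise an AttributeError if you ever reached that line.

The correct way is to specify the file encoding when opening the file: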
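To make this concrete: csv.reader yields each row as a list of str fields, and in Python 3 neither list nor str has a decode method. A small sketch using an in-memory file with made-up sample data:

```python
import csv
import io

# Made-up sample data standing in for one line of vocabulary.csv.
sample = io.StringIO("кот,cat\n")
row = next(csv.reader(sample))

print(type(row).__name__)         # list
print(hasattr(row, "decode"))     # False
print(hasattr(row[0], "decode"))  # False
```

Decoding therefore has to happen when the file is opened, not on the rows the reader returns.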

import csv

def printcsv():
    with open('vocabulary.csv',newline='', encoding='cp866') as f:
      reader = csv.reader(f)
      for row in reader:
          # store in array

But I'm not sure about this line:

          print(row)

Depending on the encoding used by sys.stdout, you may have to explicitly encode each string in the row:

          print([ field.encode(encoding) for field in row ])
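Putting the pieces together, here is a self-contained sketch that runs without the original file. It writes a small cp866-encoded CSV into a temporary directory, reads it back with encoding='cp866', and prints each row in a form the current sys.stdout can display. The sample rows and the read_rows helper are made up for illustration, and the encode/decode round trip is a variation on the line above, not the answer's exact code.

```python
import csv
import os
import sys
import tempfile


def read_rows(path, encoding="cp866"):
    """Return all rows of a CSV file decoded with the given encoding."""
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


# Create a throwaway cp866 file; the rows are invented sample data.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "vocabulary.csv")
    with open(path, "w", newline="", encoding="cp866") as f:
        csv.writer(f).writerows([["кот", "cat"], ["дом", "house"]])
    rows = read_rows(path)

# sys.stdout.encoding can be None when stdout is replaced; fall back to ASCII.
out_encoding = sys.stdout.encoding or "ascii"
for row in rows:
    # Round-trip through the terminal encoding so characters it cannot
    # represent become '?' instead of raising UnicodeEncodeError.
    safe = [field.encode(out_encoding, errors="replace").decode(out_encoding)
            for field in row]
    print(safe)
```

Calling field.encode(...) on its own produces bytes objects, which print as b'...' literals. Decoding them back with the same encoding keeps the output readable while still avoiding a crash on terminals that cannot show Cyrillic.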
