
When parsing an XML file:

from lxml import etree

with open('cortex_full.xml', 'r') as infile:
    root = etree.parse(infile)

I get the UnicodeDecodeError below. However, this only happens on my Mac: if I parse the same file with the same script on my work PC, everything works fine.

File "/Users/Desktop/CPET/xml_test2.py", line 5, in <module>
    root = etree.parse(infile)
  File "src/lxml/lxml.etree.pyx", line 3442, in lxml.etree.parse (src/lxml/lxml.etree.c:81701)
  File "src/lxml/parser.pxi", line 1832, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:118888)
  File "src/lxml/parser.pxi", line 1852, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:119171)
  File "src/lxml/parser.pxi", line 1747, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:117959)
  File "src/lxml/parser.pxi", line 1162, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:112686)
  File "src/lxml/parser.pxi", line 595, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:105881)
  File "src/lxml/parser.pxi", line 702, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:107548)
  File "src/lxml/lxml.etree.pyx", line 324, in lxml.etree._ExceptionContext._raise_if_stored (src/lxml/lxml.etree.c:12152)
  File "src/lxml/parser.pxi", line 373, in lxml.etree._FileReaderContext.copyToBuffer (src/lxml/lxml.etree.c:103210)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 783: ordinal not in range(128)

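Judging by the number of threads here, this seems to be a common problem, but none of the suggested fixes appear to apply in this case. Any ideas on how to get it working? The full XML file is here.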


1 Answer


Posting the answer that worked for me for future reference. Credit goes to @Burhan Khalid's answer.

The encoding needs to be set to utf-8 when opening the XML file.

with open('cortex_full.xml', 'r', encoding='utf-8') as infile:
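
For what it's worth, here is a small sketch of why the explicit encoding seems to matter. In text mode, `open()` falls back to the platform's preferred encoding when none is given, and the traceback suggests that on the Mac this resolved to ASCII. The sample document and temporary file below are made up for illustration and are not the real cortex_full.xml; the non-breaking space is just one example of a character whose UTF-8 encoding starts with the byte 0xc2.

```python
import locale
import os
import tempfile

# Hypothetical sample: a tiny XML document containing U+00A0 (non-breaking space),
# which UTF-8 encodes as the two bytes 0xC2 0xA0.
sample = '<?xml version="1.0" encoding="UTF-8"?>\n<root>a\u00a0b</root>\n'

# Write the sample to a temporary file as raw UTF-8 bytes.
fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "wb") as f:
    f.write(sample.encode("utf-8"))

# This is the encoding open() uses in text mode when no encoding is passed.
print("default text encoding:", locale.getpreferredencoding(False))

# Forcing ASCII reproduces the kind of error shown in the traceback.
ascii_failed = False
try:
    with open(path, "r", encoding="ascii") as infile:
        infile.read()
except UnicodeDecodeError as exc:
    ascii_failed = True
    print("ascii failed:", exc)

# Naming the file's actual encoding reads it cleanly.
with open(path, "r", encoding="utf-8") as infile:
    text = infile.read()

os.remove(path)
```

Running the `print` line on both machines should show whether their default encodings differ, which would explain why the same script behaves differently.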
answered 2017-09-30T02:51:24.820
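
A possible alternative, sketched here under the assumption that the file carries a correct XML declaration: open the file in binary mode (or pass the filename straight to `etree.parse`) so that the parser reads the encoding from the document itself instead of relying on the platform default. The sample document and temporary file are hypothetical, and the fallback import is only there so the snippet also runs where lxml is not installed.

```python
import os
import tempfile

try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Fallback for illustration only: the stdlib parser offers the same parse() call.
    import xml.etree.ElementTree as etree

# Hypothetical sample document, stored as UTF-8 bytes with a matching declaration.
sample_bytes = (
    '<?xml version="1.0" encoding="UTF-8"?>\n<root>a\u00a0b</root>\n'
).encode("utf-8")

fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "wb") as f:
    f.write(sample_bytes)

# Binary mode hands the parser raw bytes, so it decodes them according to
# the encoding named in the XML declaration rather than the locale default.
with open(path, "rb") as infile:
    tree = etree.parse(infile)

root = tree.getroot()
os.remove(path)
```

With the real file this would amount to `etree.parse('cortex_full.xml')` or `open('cortex_full.xml', 'rb')`, which avoids depending on each machine's locale settings.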