python - How does the Python XML parser detect the encoding (utf-8 vs utf-16)?

Translated from: https://stackoverflow.com/questions/65301395 2020-12-15T07:07:04.747

Viewed 181 times

The Python XML parser can parse byte strings in various encodings (even when no encoding is specified in the XML declaration):

from xml.etree import ElementTree as ET

xml_string = '<doc>Glück</doc>'

xml_utf_8 = xml_string.encode('utf-8')
xml_utf_16 = xml_string.encode('utf-16')

print(ET.fromstring(xml_utf_8).text)
print(ET.fromstring(xml_utf_16).text)

Output:

Glück
Glück
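
One detail that may matter for the result above: `str.encode('utf-16')` prepends a byte order mark (BOM), while `str.encode('utf-8')` does not, so the two byte strings differ from their very first bytes. The sketch below only inspects those bytes. It does not show what expat does internally, and the names `has_utf16_bom` and `has_utf8_bom` are introduced here for illustration.

```python
import codecs
from xml.etree import ElementTree as ET

xml_string = '<doc>Glück</doc>'

xml_utf_8 = xml_string.encode('utf-8')
xml_utf_16 = xml_string.encode('utf-16')

# The 'utf-16' codec writes a BOM in the platform's native byte order,
# which is what codecs.BOM_UTF16 holds.
has_utf16_bom = xml_utf_16.startswith(codecs.BOM_UTF16)

# The plain 'utf-8' codec writes no BOM ('utf-8-sig' would).
has_utf8_bom = xml_utf_8.startswith(codecs.BOM_UTF8)

print(has_utf16_bom, has_utf8_bom)

# Show the leading bytes of each input for comparison.
print(xml_utf_8[:6])
print(xml_utf_16[:6])

# Both inputs still parse to the same text.
print(ET.fromstring(xml_utf_16).text == ET.fromstring(xml_utf_8).text)
```

So the UTF-16 input in this test carries an explicit marker at its start. Whether input without a BOM is handled the same way is a separate question that this sketch does not cover.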

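Questions:

Is it safe to let the parser detect the correct encoding (utf-8 vs. utf-16; other encodings fail unless they are specified to the parser)?
The detection seems to be done in the expat C library. How does it reliably detect the correct encoding?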
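
To illustrate the "other encodings fail" remark, here is a minimal sketch that uses ISO-8859-1 as the example encoding (an assumption, since the question does not name one). It shows the undeclared input being rejected, then two ways of supplying the encoding: an XML declaration, or the `encoding` argument of `ET.XMLParser`. The variable names are chosen for illustration.

```python
from xml.etree import ElementTree as ET

xml_string = '<doc>Glück</doc>'
xml_latin_1 = xml_string.encode('iso-8859-1')

# No BOM and no declaration: the bytes are treated as UTF-8, and the
# byte 0xFC ('ü' in ISO-8859-1) followed by 'c' is not valid UTF-8.
try:
    ET.fromstring(xml_latin_1)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# Option 1: state the encoding in the XML declaration.
declared = b'<?xml version="1.0" encoding="iso-8859-1"?>' + xml_latin_1
text_declared = ET.fromstring(declared).text

# Option 2: override the encoding on the parser itself.
parser = ET.XMLParser(encoding='iso-8859-1')
text_override = ET.fromstring(xml_latin_1, parser=parser).text

print(undeclared_ok)
print(text_declared)
print(text_override)
```

Both options give the parser the encoding explicitly, so nothing has to be guessed from the bytes.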
