python - Fixing French accents that show up as =C3=A9 in Python

Question

In Python, I have run into some French strings whose accented characters I cannot convert back to normal, for example:

word1 = 'install=C3=A9' # should be installé
word2 = 'transf=E9r=E9' # should be transféré
word3 = 'bient=C3=B4t'  # should be bientôt

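Most of the documentation I have read says to read the file with something like encoding='utf-8', but here I am stuck with the actual strings. Is there a way to decode the strings, or should I build one giant .replace() function?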

score 4 · Accepted Answer

The encoding appears to be Quoted-Printable.

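import quopri

word1 = 'install=C3=A9'
# Quoted-printable decoding yields raw bytes: b'install\xc3\xa9'
byteString = quopri.decodestring(word1)
# Interpret those bytes as UTF-8 to get the text back
string = byteString.decode('utf-8')
print(string)  # installé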

Strictly speaking, the function expects bytes as input, so it would be better to declare the word as bytes:

word1 = b'install=C3=A9'
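Note that the second example in the question, 'transf=E9r=E9', is not UTF-8: a lone 0xE9 byte is how é is written in Latin-1 (ISO-8859-1), so decoding it as UTF-8 raises UnicodeDecodeError. The sketch below is one possible way to handle both cases. The helper name decode_qp and the fallback order are assumptions made for illustration, not part of the original answer, and the helper assumes the input str contains only ASCII characters.

```python
import quopri


def decode_qp(text, encodings=("utf-8", "latin-1")):
    """Decode a quoted-printable str, trying each candidate encoding in order.

    Hypothetical helper: the name and the default fallback order are
    assumptions for illustration only.
    """
    # quopri works on bytes; quoted-printable text is plain ASCII,
    # so encoding the str as ASCII is safe for well-formed input.
    raw = quopri.decodestring(text.encode("ascii"))
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue
    raise ValueError(f"could not decode {text!r} with any of {encodings}")


for word in ("install=C3=A9", "transf=E9r=E9", "bient=C3=B4t"):
    print(decode_qp(word))
# installé
# transféré
# bientôt
```

UTF-8 is tried first because Latin-1 accepts any byte sequence and would silently turn 'install=C3=A9' into 'installÃ©'. If the source of the strings is known to use a single encoding, passing just that one is the safer choice.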
