python - Removing special quotes and other characters

Asked 2017-06-28T13:50:31.160

Viewed 759 times

1 answer

Using the unidecode package will usually replace these characters with their closest ASCII equivalents.

from unidecode import unidecode
text = unidecode(text)

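One drawback, however, is that it also changes characters you may want to keep (accented characters, for example). If that is a problem, another option is to use a regular expression to remove (or replace) only a set of special characters you have identified in advance: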

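import re
exotic_quotes = ['\x92']  # fill this up with the other characters you want to replace
# re.sub needs a string pattern, so build a character class from the list
pattern = '[' + ''.join(re.escape(q) for q in exotic_quotes) + ']'
text = re.sub(pattern, "'", text)  # change the second argument to the quote you want to use instead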
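If each exotic character needs its own replacement (curly double quotes to a straight double quote, curly single quotes to an apostrophe), a translation table may be easier to maintain than a single regex. The following is only a sketch using the standard library: the names QUOTE_MAP, QUOTE_TABLE and normalize_quotes are hypothetical, and the characters listed are an assumed starting set rather than anything taken from the question.

```python
# Hypothetical mapping; extend it with whatever characters show up in your data.
QUOTE_MAP = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\x92": "'",    # the character used in the regex example above
}

# str.maketrans accepts a dict whose keys are single characters.
QUOTE_TABLE = str.maketrans(QUOTE_MAP)


def normalize_quotes(text):
    """Replace only the characters listed in QUOTE_MAP; leave everything else alone."""
    return text.translate(QUOTE_TABLE)


sample = "It\u2019s a \u201ccaf\u00e9\u201d"
cleaned = normalize_quotes(sample)
print(cleaned)
```

Unlike unidecode, this leaves the accented "é" untouched, because only characters present in the mapping are changed.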

I hope this helps!

Answered 2017-06-28T14:04:01.240
