Python: Replace non-ASCII characters in a list of strings

Question

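I know there are plenty of questions about non-ASCII characters on Stack Overflow, but as a complete beginner I haven't had any luck applying them, and I find the whole 'unicode' concept hard to grasp.

So I have a list: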

mylist = ["apple", "samsung", "toshiba", "Don’t know", "Can’t recall"]

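I want to get at the curly single quotes (’) in the items at indices 3 and 4 and replace them with plain apostrophes (').

I tried this: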

# -*- coding: utf-8 -*-
mylist = ["hello", "don't know", "Don’t know", "Can't recall"]
for word in mylist:
    word.replace(u"’", "'")
print mylist

I get the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 3: ordinal not in range(128)

Not sure whether this is relevant, but I'm using Python 2.x. I understand this problem might not occur if I were on version 3.

Thanks!

score 1 · Accepted Answer

>>> mylist = ["apple", "samsung", "toshiba", "Don’t know", "Can’t recall"]
>>> [item.replace('\xe2\x80\x99',"'") for item in mylist]
['apple', 'samsung', 'toshiba', "Don't know", "Can't recall"]
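
For context, here is a minimal sketch of where the bytes '\xe2\x80\x99' come from, assuming the source file is UTF-8 encoded: they are the UTF-8 encoding of U+2019 (RIGHT SINGLE QUOTATION MARK). The error in the question reports byte 0xe2 at position 3 because that is the first byte of ’ in "Don’t know", and Python 2 tries to decode the byte string as ASCII when it is combined with a unicode argument. The variable names below are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Assumption: the text is stored as UTF-8 bytes (as in a Python 2 str literal
# inside a UTF-8 source file).
RIGHT_SINGLE_QUOTE = u"\u2019"  # the curly apostrophe

# Encoding the character to UTF-8 yields the three bytes used in the answer.
utf8_bytes = RIGHT_SINGLE_QUOTE.encode("utf-8")

# A byte string containing the curly apostrophe; 0xe2 sits at index 3.
raw = b"Don\xe2\x80\x99t know"

# Replacing bytes with bytes avoids any implicit ASCII decoding.
fixed = raw.replace(utf8_bytes, b"'")
```

Keeping both sides of replace() the same type (bytes with bytes, or unicode with unicode) is what sidesteps the UnicodeDecodeError.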

If all the items are already unicode:

>>> mylist = [u"apple", u"samsung", u"toshiba", u"Don’t know", u"Can’t recall"]
>>> [item.replace(u'’',u"'") for item in mylist]
[u'apple', u'samsung', u'toshiba', u"Don't know", u"Can't recall"]
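
Note that str.replace returns a new string rather than modifying the original, so the result has to be stored somewhere; the loop in the question discards it. If the list might mix byte strings and unicode strings, one possible approach is to decode everything first and then replace. This is only a sketch: the helper name replace_curly_apostrophes and the UTF-8 default are assumptions, not part of the original answer.

```python
# -*- coding: utf-8 -*-
def replace_curly_apostrophes(items, encoding="utf-8"):
    """Return a new list with U+2019 replaced by a plain apostrophe.

    Hypothetical helper: byte strings are decoded with `encoding`
    (assumed to be UTF-8) so that every result is a unicode string.
    """
    cleaned = []
    for item in items:
        if isinstance(item, bytes):  # `bytes` is an alias of `str` on Python 2
            item = item.decode(encoding)
        cleaned.append(item.replace(u"\u2019", u"'"))
    return cleaned


mylist = [u"apple", b"Don\xe2\x80\x99t know", u"Can\u2019t recall"]
# Rebind the name to the new list; the original strings are left unchanged.
mylist = replace_curly_apostrophes(mylist)
```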
