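What character set is é from?
In Windows Notepad, an ANSI text file containing this character saves fine. Insert certain other characters and you get an error. é also seems to work fine in an ASCII terminal in PuTTY (are CP437 and IBM437 the same?), whereas those other characters do not.
I can see that those are Unicode, not ASCII. But what is é? It does not give the error I get in Notepad with Unicode, yet Python threw SyntaxError: Non-ASCII character '\xc3' in file on line , but no encoding declared; until I added the "magic comment" suggested in "Python NLTK: SyntaxError: Non-ASCII character '\xc3' in file (Sentiment Analysis -NLP)".
I added the "magic comment" and no longer get that error, but os.path.isfile() says that a file name containing é does not exist. Ironically, the character é appears in Marc-André Lemburg, one of the authors of the PEP that the error links to.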
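For reference, a minimal sketch of how é is stored under the encodings mentioned above. It is written for Python 3, which is an assumption (the scripts in this post are Python 2), and it assumes that "ANSI" in Notepad means Windows-1252, which is typical on Western-locale Windows but not guaranteed. The helper name encode_or_none is hypothetical.

```python
# Python 3 sketch (assumption: the post itself uses Python 2).
import codecs

E_ACUTE = "\u00e9"  # é, LATIN SMALL LETTER E WITH ACUTE


def encode_or_none(text, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# cp1252 stands in for Notepad's "ANSI" (assumption, see above).
encodings = ["ascii", "cp1252", "cp437", "utf-8"]
table = {name: encode_or_none(E_ACUTE, name) for name in encodings}
for name, raw in table.items():
    print(name, raw)

# Python's codec registry treats IBM437 as an alias of cp437.
same_codec = codecs.lookup("IBM437").name == codecs.lookup("cp437").name
print(same_codec)
```

The character is not in ASCII at all, is a single byte in both code pages (a different byte in each), and is two bytes in UTF-8, the first of which is the '\xc3' from the SyntaxError.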
EDIT: If I print the file's path, the accented e shows up as ├⌐, but I can copy and paste é into the command prompt.
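One plausible reading of the ├⌐ output, sketched in Python 3 (an assumption, since the scripts here are Python 2): the two UTF-8 bytes of é are being displayed by a console that uses code page 437. Whether the console in question really uses CP437 is also an assumption.

```python
# Python 3 sketch: UTF-8 bytes of é rendered through code page 437.
utf8_bytes = "\u00e9".encode("utf-8")   # b'\xc3\xa9'
shown = utf8_bytes.decode("cp437")      # what a CP437 console would draw

# ascii() keeps the output printable on any terminal encoding.
print(ascii(utf8_bytes), ascii(shown))

# Reversing the mistake recovers the original character.
recovered = shown.encode("cp437").decode("utf-8")
print(ascii(recovered))
```

Byte 0xC3 is ├ and byte 0xA9 is ⌐ in CP437, which matches the two characters printed for the path.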
EDIT2: See below.
Private > cat scratch.py ### LOL cat scratch :3
# coding=utf-8
file_name = r"Filéname"
file_name = unicode(file_name)
Private > python scratch.py
Traceback (most recent call last):
  File "scratch.py", line 3, in <module>
    file_name = unicode(file_name)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
Private >
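A sketch of what this traceback appears to show, written in Python 3 (an assumption; the script above is Python 2). In Python 2, unicode(some_bytes) without an encoding argument decodes with the default ASCII codec, so the first UTF-8 byte of é fails. The byte string below is a stand-in for what the Python 2 literal contains once the file is saved as UTF-8.

```python
# Python 3 sketch of the Python 2 failure above.
raw_name = b"Fil\xc3\xa9name"  # UTF-8 bytes of the file name

try:
    raw_name.decode("ascii")   # roughly what unicode(file_name) does in Python 2
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Naming the real encoding of the bytes decodes cleanly.
decoded = raw_name.decode("utf-8")

print(ascii_error)
print(ascii(decoded), len(decoded))
```

The failing position is 3, the index of byte 0xc3, as in the traceback.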
EDIT3:
Private > PS1="Private > " ; echo code below ; cat scratch.py ; echo ======= ; echo output below ; python scratch.py
code below
# -*- coding: utf-8 -*-
file_name = r"Filéname"
file_name = unicode(file_name, encoding="utf-8")
# I have code here to determine a path depending on the hostname of the
# machine, the folder paths contain no Unicode characters, for my debug
# version of the script, I will hardcode the redacted hostname.
hostname = "One"
if hostname == "One":
    folder = "C:/path/folder_one"
elif hostname == "Two":
    folder = "C:/path/folder_two"
else:
    folder = "C:/path/folder_three"
path = "%s/%s" % (folder, file_name)
path = unicode(path, encoding="utf-8")
print path
=======
output below
Traceback (most recent call last):
  File "scratch.py", line 18, in <module>
    path = unicode(path, encoding="utf-8")
TypeError: decoding Unicode is not supported
Private >
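A sketch of why the second decode may fail, again in Python 3 (an assumption; the script above is Python 2). In Python 2, formatting a byte string with a unicode argument already yields a unicode object, so path is text before the second unicode(...) call, and decoding text is rejected. Python 3 raises the analogous TypeError from str(text, encoding=...). The helper name to_text is hypothetical.

```python
# Python 3 sketch: decode bytes exactly once, leave text untouched.
def to_text(value, encoding="utf-8"):
    """Decode bytes to text; return text unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


file_name = to_text(b"Fil\xc3\xa9name")   # decoded once, at the boundary
folder = "C:/path/folder_one"
path = "%s/%s" % (folder, file_name)       # already text, no further decode needed
path_again = to_text(path)                 # harmless: text passes through

try:
    str(path, encoding="utf-8")            # analogue of unicode(path, encoding=...)
    double_decode_error = None
except TypeError as exc:
    double_decode_error = exc

print(ascii(path))
print(double_decode_error)
```

Under this reading, dropping the second decode (or guarding it as to_text does) avoids the TypeError. Whether os.path.isfile() then finds the file is not shown by this sketch.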