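I am following the Intro to Data Sci course on Coursera, but I have run into a problem while parsing the JSON response from Twitter.
I am trying to retrieve the text field from JSON in the following format.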
{u'delete': {u'status': {u'user_id_str': u'702327198', u'user_id': 702327198, u'id': 332772178690981889L, u'id_str': u'332772178690981889'}}}, {u'delete': {u'status': {u'user_id_str': u'864736118', u'user_id': 864736118, u'id': 332770710667792384L, u'id_str': u'332770710667792384'}}}, {u'contributors': None, u'truncated': False, **u'text'**: u'RT @afgansyah_reza: Lagi ngantri. Ada ibu2 & temennya. "Ih dia mukanya mirip banget sama Afgan.", trus ngedeketin gw, "Tuh kan.. Mirip bang\u2026', u'in_reply_to_status_id': None, u'id': 332772350640668672L, u'favorite_count': 0, ....... ]
This is the code I am using:
import json

def hw():
    data = []
    count = 0
    with open('output.txt') as f:
        for line in f:
            encoded_string = line.strip().encode('utf-8')
            data.append(json.loads(encoded_string))
    print data  # generates the input to the next block
    for listval in data:  # individual block
        if "text" in listval:
            print listval["text"]
        else:
            continue
However, when I run it, I get the following output and error:
RT @afgansyah_reza: Lagi ngantri. Ada ibu2 & temennya. "Ih dia mukanya mirip banget sama Afgan.", trus ngedeketin gw, "Tuh kan.. Mirip bang…
RT @Dimaz_CSIX: Kolor pakek pita #laguharlemshake
Traceback (most recent call last):
  File "F:\ProgrammingPoint\workspace-new\PyTest\tweet_sentiment.py", line 41, in <module>
    main()
  File "F:\ProgrammingPoint\workspace-new\PyTest\tweet_sentiment.py", line 36, in main
    hw()
  File "F:\ProgrammingPoint\workspace-new\PyTest\tweet_sentiment.py", line 23, in hw
    print listval["text"]
  File "C:\Python27\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 13-63: character maps to <undefined>
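The last frames of the traceback are in the cp1252 codec, which suggests the failure happens when print encodes the tweet text for the Windows console, not while the JSON is parsed. Below is a minimal sketch of that behaviour, assuming the console encoding really is cp1252. The helper name `safe_for_console` and the sample string are hypothetical and are not taken from the script above; the sketch is written to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function


def safe_for_console(text, encoding="cp1252"):
    """Replace characters the target encoding cannot represent with '?'.

    `encoding` defaults to cp1252 only because the traceback mentions it;
    this is an assumption about the console, not something read from it.
    """
    return text.encode(encoding, "replace").decode(encoding)


# Hypothetical tweet text containing characters that cp1252 has no mapping for.
sample = u"RT: \u3053\u3093\u306b\u3061\u306f"

# Strict encoding is what an implicit console encode does, and it fails here.
try:
    sample.encode("cp1252")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(strict_ok)
print(safe_for_console(sample))
```

This would also explain why the first two tweets print before the error appears: their characters, including the trailing ellipsis, all have a cp1252 mapping, so only a later tweet triggers the exception.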
I am new to Python, so any help would be greatly appreciated.