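I know this question looks like an exact duplicate of at least a dozen others. I am posting it reluctantly, only after making sure that those questions do not solve my problem.
Basically, I want to fetch content from websites that contain characters in various languages and insert it into the datastore. But no matter what I try, the error does not change.
My sample code: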
class URLEntry(db.Model):
    content = db.TextProperty()

class ViewURL(webapp2.RequestHandler):
    def get(self):
        import urllib2
        url = "http://iitk.ac.in/"
        try:
            result = urllib2.urlopen(url)
        except urllib2.URLError, e:
            handleError(e)
        content = result.read()
        e = URLEntry(key_name=url,content=content)
        URLEntry.get_or_insert(url,content=content) #Probably this line generates the error.
The error thrown:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 25554: ordinal not in range(128)
Traceback:
'ascii' codec can't decode byte 0xe0 in position 25554: ordinal not in range(128)
Traceback (most recent call last):
  File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~govt-jobs/1.368125505627581007/checkforurls.py", line 83, in get
    URLEntry.get_or_insert(url,content=result.content)
  File "/python27_runtime/python27_lib/versions/1/google/appengine/ext/db/__init__.py", line 1362, in get_or_insert
    return run_in_transaction(txn)
  File "/python27_runtime/python27_lib/versions/1/google/appengine/api/datastore.py", line 2461, in RunInTransaction
    return RunInTransactionOptions(None, function, *args, **kwargs)
  File "/python27_runtime/python27_lib/versions/1/google/appengine/api/datastore.py", line 2599, in RunInTransactionOptions
    ok, result = _DoOneTry(new_connection, function, args, kwargs)
  File "/python27_runtime/python27_lib/versions/1/google/appengine/api/datastore.py", line 2621, in _DoOneTry
    result = function(*args, **kwargs)
  File "/python27_runtime/python27_lib/versions/1/google/appengine/ext/db/__init__.py", line 1359, in txn
    entity = cls(key_name=key_name, **kwds)
  File "/python27_runtime/python27_lib/versions/1/google/appengine/ext/db/__init__.py", line 970, in __init__
    prop.__set__(self, value)
  File "/python27_runtime/python27_lib/versions/1/google/appengine/ext/db/__init__.py", line 614, in __set__
    value = self.validate(value)
  File "/python27_runtime/python27_lib/versions/1/google/appengine/ext/db/__init__.py", line 2798, in validate
    value = self.data_type(value)
  File "/python27_runtime/python27_lib/versions/1/google/appengine/api/datastore_types.py", line 1163, in __new__
    return super(Text, cls).__new__(cls, arg, encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 25554: ordinal not in range(128)
Also, as other StackOverflow answers suggested, I tried adding the following before inserting into the datastore:
content = content.decode("ISO-8859-1") # The encoding of the page is ISO-8859-1
content = content.encode("utf-8")
But the error persists. Please help.
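
For reference, the failure can be reproduced without App Engine. This is only a minimal sketch: it assumes, based on the last frame of the traceback, that the datastore text type decodes any byte string it receives with the ASCII codec. The sample string and the names `raw` and `ascii_decode_fails` are made up for illustration and are not part of the code above.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for the downloaded page: ISO-8859-1 bytes containing 0xe0.
raw = u"\xe0 la carte".encode("iso-8859-1")


def ascii_decode_fails(data):
    """Return True if decoding the byte string as ASCII raises UnicodeDecodeError."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return True
    return False


# Decoding turns the bytes into text (unicode in Python 2, str in Python 3).
text = raw.decode("iso-8859-1")

# Encoding to UTF-8 turns the text back into bytes, which still hold
# non-ASCII values (0xc3 0xa0 here), so an implicit ASCII decode fails again.
round_tripped = text.encode("utf-8")

fails_raw = ascii_decode_fails(raw)
fails_round_tripped = ascii_decode_fails(round_tripped)
```

If that assumption holds, the `encode("utf-8")` step in my attempt undoes the `decode`, which would explain why the error did not change. Passing the decoded text itself as `content`, without re-encoding it, may be the variant worth trying.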