2

要求:Python 2.7,没有外部库,如 Requests 或 BeautifulSoup :(

当我调用这个 url 时,我从 retrieveUrl 的回溯中得到错误:
u'http://%E7%9F%A5%E3%81%A3%E5%BE%97%E8%A2%8B.biz/wp-content/uploads/2016/10/104743-300x225.jpg'

如您所见,我的服务器已经为我提供了很好的 uriencoded 准备好的 url,但它仍然崩溃。

def retrieveUrl(url):
    req = urllib2.Request(url, None, {'User-Agent': 'Mozilla/5.0 (compatible; Anki)'})
    filecontents = urllib2.urlopen(req).read()
    path = unicode(urllib2.unquote(url.encode("utf8")), "utf8")
    filename, file_extension = os.path.splitext(path)
    return filename, file_extension, filecontents

错误回溯

    Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 431, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 449, in _open
    '_open', req)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1227, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "C:\Python27\lib\urllib2.py", line 1194, in do_open
    h.request(req.get_method(), req.get_selector(), req.data, headers)
  File "C:\Python27\lib\httplib.py", line 1057, in request
    self._send_request(method, url, body, headers)
  File "C:\Python27\lib\httplib.py", line 1097, in _send_request
    self.endheaders(body)
  File "C:\Python27\lib\httplib.py", line 1053, in endheaders
    self._send_output(message_body)
  File "C:\Python27\lib\httplib.py", line 897, in _send_output
    self.send(msg)
  File "C:\Python27\lib\httplib.py", line 859, in send
    self.connect()
  File "C:\Python27\lib\httplib.py", line 836, in connect
    self.timeout, self.source_address)
  File "C:\Python27\lib\socket.py", line 557, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "C:\Python27\lib\encodings\idna.py", line 164, in encode
    result.append(ToASCII(label))
  File "C:\Python27\lib\encodings\idna.py", line 76, in ToASCII
    label = nameprep(label)
  File "C:\Python27\lib\encodings\idna.py", line 38, in nameprep
    raise UnicodeError("Invalid character %r" % c)
UnicodeError: Invalid character u'\x9f'

我什至没有弄清楚 u'\x9f' 是什么字符。有什么想法可以修复该功能以获取文件内容吗?

4

0 回答 0