eduardo@camizao:/$ python2.7 
Python 2.7.3 (default, Sep 26 2013, 20:03:06) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> url1 = 'http://www.google.com'
>>> url2 = 'https://www.google.com'
>>> f = urllib.urlopen(url1) 
>>> f = urllib.urlopen(url2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 211, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 355, in open_http
    'got a bad status line', None)
IOError: ('http protocol error', 0, 'got a bad status line', None)
>>> 

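When I try to connect to an HTTPS site with urllib, I get the error above. The proxy is configured correctly. While debugging the Python code, I noticed that the import of the ssl library in urllib.py is never executed, so the HTTPS call is never made either. Can anyone help me? I have to use urllib, not urllib2 or anything else. Thanks in advance.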

1 Answer

At least there is nothing wrong with the way you wrote it:

$ python
Python 2.7.4 (default, Sep 26 2013, 03:20:26) 
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> url1 = 'http://www.google.com'
>>> url2 = 'https://www.google.com'
>>> f = urllib.urlopen(url1)
>>> f = urllib.urlopen(url2)
>>> f.read()[:15]
'<!doctype html>'
>>>

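So the code itself is not the problem. It must have something to do with your environment or configuration. You said you are using a proxy?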
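One way to check the environment is to look at which proxies urllib picks up on its own. This is only a minimal sketch: `proxy_for` is a hypothetical helper (not part of urllib), it assumes the proxy is configured through the usual system settings or environment variables, and it ignores bypass rules such as `no_proxy`.

```python
try:
    from urllib import getproxies          # Python 2
except ImportError:
    from urllib.request import getproxies  # Python 3


def proxy_for(url, proxies=None):
    """Return the proxy configured for the URL's scheme, or None.

    Hypothetical helper: it only mirrors the scheme lookup in the
    proxies mapping and does not apply any proxy-bypass rules.
    """
    if proxies is None:
        # Proxies detected from the environment / system settings.
        proxies = getproxies()
    scheme = url.split(':', 1)[0].lower()
    return proxies.get(scheme)


# Show what would be used for the failing URL in this environment.
print(proxy_for('https://www.google.com'))
```

If this prints an `http://` proxy for the HTTPS URL, that may be why the traceback ends in `open_http` rather than in an HTTPS code path.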

Edit:

I can open it through an open proxy (I am not including the actual proxy, since who knows whether it is sketchy; substitute your own):

$ python
Python 2.7.4 (default, Sep 26 2013, 03:20:26) 
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> proxy_handler = urllib2.ProxyHandler({'http': 'http://some-sketchy-open-proxy'})
>>> opener = urllib2.build_opener(proxy_handler)
>>> opener.open('https://www.google.com')
<addinfourl at 140512985881056 whose fp = <socket._fileobject object at 0x7fcbba9b1ed0>>
>>> _.read()[:15]
'<!doctype html>'
>>> 

Try it this way with your own proxy URL (note that I am using urllib2, not urllib). Hope that helps!
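One caveat worth checking: `ProxyHandler` takes a mapping keyed by URL scheme, so a mapping with only an `'http'` entry should leave `https://` requests unproxied. The sketch below registers the proxy for both schemes. The proxy address is a made-up placeholder and `build_proxy_opener` is a hypothetical helper name.

```python
try:
    import urllib2 as request_lib         # Python 2
except ImportError:
    import urllib.request as request_lib  # Python 3


def build_proxy_opener(proxy_url):
    """Build an opener that routes both http and https via proxy_url."""
    # One entry per scheme; a scheme without an entry is not proxied.
    proxies = {'http': proxy_url, 'https': proxy_url}
    handler = request_lib.ProxyHandler(proxies)
    return request_lib.build_opener(handler)


# Placeholder address (assumption): replace with your own proxy.
opener = build_proxy_opener('http://proxy.example.com:3128')
# opener.open('https://www.google.com') would then go through the proxy.
```

Passing the handler explicitly also keeps the opener from falling back to whatever proxy settings are detected from the environment.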

Edit 2

Using only urllib:

$ python
Python 2.7.4 (default, Sep 26 2013, 03:20:26) 
[GCC 4.7.3] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> import urllib
>>> proxies = {'http': '189.112.3.87:3128'}
>>> url = 'https://www.google.com'
>>> filehandle = urllib.urlopen(url,proxies=proxies)
>>> filehandle.read()[:15]
'<!doctype html>'
>>> 
Answered 2013-10-31T19:11:06.523