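Is there currently anything in Python that supports HTTPS proxies for web scraping? I am currently using Python 2.7 on Windows, but I could switch to Python 3 if it supports the HTTPS proxy protocol.
I have tried both mechanize and requests, but both fail on the HTTPS proxy protocol.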
This is the mechanize version:
import mechanize
br = mechanize.Browser()
br.set_debug_http(True)
br.set_handle_robots(False)
br.set_proxies({
    "http": "ncproxy1.uk.net.intra:8080",
    "https": "ncproxy1.uk.net.intra:8080",
})
br.add_proxy_password("uname", "pass")
br.open("http://www.google.co.jp/") # OK
br.open("https://www.google.co.jp/") # Proxy Authentication Required
Or using requests:
import requests
from requests.auth import HTTPProxyAuth
proxyDict = {
    'http': 'ncproxy1.uk.net.intra:8080',
    'https': 'ncproxy1.uk.net.intra:8080',
}
auth = HTTPProxyAuth('uname', 'pass')
r = requests.get("https://www.google.com", proxies=proxyDict, auth=auth)
print r.text
I get the following message:
Traceback (most recent call last):
  File "D:\SRC\NuffieldLogger\NuffieldLogger\nuffieldrequests.py", line 10, in <module>
    r = requests.get("https://www.google.com", proxies=proxyDict, auth=auth)
  File "C:\Python27\lib\site-packages\requests\api.py", line 55, in get
    return request('get', url, **kwargs)
  File "C:\Python27\lib\site-packages\requests\api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 335, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 438, in send
    r = adapter.send(request, **kwargs)
  File "C:\Python27\lib\site-packages\requests\adapters.py", line 331, in send
    raise SSLError(e)
requests.exceptions.SSLError: [Errno 1] _ssl.c:504: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
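
One variant that may be worth trying is sketched here. It is untested against this proxy and is not a confirmed fix. It assumes the proxy itself is reached over plain HTTP and tunnels HTTPS traffic with CONNECT, and that it accepts Basic credentials embedded in the proxy URL (the `http://user:password@host:port` form described in the requests documentation). `build_proxies` and `fetch` are hypothetical helper names introduced only for illustration; the host, port, and credentials are the placeholders from the code above.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2.7


def build_proxies(host, port, user, password):
    """Return a requests-style proxies dict with credentials in the proxy URL.

    The "http://" scheme describes how to reach the proxy itself. It is used
    for both keys on the assumption that the proxy speaks plain HTTP and
    tunnels HTTPS requests via CONNECT.
    """
    # Percent-encode the credentials so characters such as "@" or ":"
    # cannot break parsing of the proxy URL.
    credentials = "{0}:{1}".format(quote(user, safe=""), quote(password, safe=""))
    proxy_url = "http://{0}@{1}:{2}".format(credentials, host, port)
    return {"http": proxy_url, "https": proxy_url}


def fetch(url, proxies):
    """Issue a GET through the given proxies (hypothetical helper)."""
    # Imported lazily so build_proxies can be used without requests installed.
    import requests
    return requests.get(url, proxies=proxies, timeout=30)


proxies = build_proxies("ncproxy1.uk.net.intra", 8080, "uname", "pass")
print(proxies["https"])
```

Calling `fetch("https://www.google.com", proxies)` would then send the request through the proxy. Whether this avoids the SSLError above depends on the installed requests version and on how the proxy is configured, neither of which is known here.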