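I'm trying to scrape a web page using Python 2.7 and BeautifulSoup, but I can't get past a protocol error that doesn't make much sense to me. It only happens on the one site I need this for: https://edd.telstra.com/telstra
The code I'm using for basic testing: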
#! /usr/bin/python
from urllib import urlopen
from BeautifulSoup import BeautifulSoup
import re
# Copy all of the content from the provided web page
webpage = urlopen("https://edd.telstra.com/telstra/").read()
I get the following error (running on Ubuntu 12.10):
Traceback (most recent call last):
  File "e.py", line 8, in <module>
    webpage = urlopen("https://edd.telstra.com/telstra/").read()
  File "/usr/lib/python2.7/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 436, in open_https
    h.endheaders(data)
  File "/usr/lib/python2.7/httplib.py", line 958, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 818, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 780, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 1165, in connect
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
  File "/usr/lib/python2.7/ssl.py", line 381, in wrap_socket
    ciphers=ciphers)
  File "/usr/lib/python2.7/ssl.py", line 143, in __init__
    self.do_handshake()
  File "/usr/lib/python2.7/ssl.py", line 305, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] [Errno 1] _ssl.c:504: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
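Can anyone tell me whether I need to specify some parameter to get this page to download in Python? It seems to be a problem with this page only, since the code above (plus plenty of other code I have tried) works fine on the other HTTPS/SSL pages I have tried.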
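Since the traceback ends inside the handshake, one thing I could do to narrow this down is try the handshake by hand with the protocol version pinned, to see whether the failure depends on which version is negotiated. Below is a minimal sketch of that idea, not a confirmed fix. The helper names candidate_protocols and try_handshake are my own and purely hypothetical, and I am only assuming that the protocol version is the relevant parameter. It uses the ssl_version argument of ssl.wrap_socket on older Pythons and falls back to ssl.SSLContext where that exists.

```python
import socket
import ssl


def candidate_protocols():
    """Return (name, constant) pairs for the protocol constants this ssl module offers."""
    # Which constants exist depends on the Python and OpenSSL builds,
    # so look them up by name instead of assuming they are all present.
    names = ["PROTOCOL_TLSv1", "PROTOCOL_SSLv3", "PROTOCOL_SSLv23"]
    return [(name, getattr(ssl, name)) for name in names if hasattr(ssl, name)]


def try_handshake(host, port, protocol, timeout=10):
    """Attempt a single handshake with a pinned protocol; return (succeeded, detail)."""
    sock = None
    try:
        sock = socket.create_connection((host, port), timeout)
        if hasattr(ssl, "SSLContext"):
            # Newer Pythons. A context built this way does not verify
            # certificates, so this is a diagnostic probe only.
            context = ssl.SSLContext(protocol)
            sock = context.wrap_socket(sock)
        else:
            # Older Pythons without SSLContext: pin the version directly.
            sock = ssl.wrap_socket(sock, ssl_version=protocol)
        return True, "handshake completed"
    except (socket.error, ValueError) as exc:
        # ssl.SSLError is a subclass of socket.error; ValueError covers a
        # protocol constant the local OpenSSL build refuses to use.
        return False, str(exc)
    finally:
        if sock is not None:
            sock.close()
```

Calling try_handshake("edd.telstra.com", 443, proto) for each (name, proto) pair from candidate_protocols() and printing the results should show whether any single protocol version completes the handshake against this server.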
Thanks for your help!