from urllib import urlopen
from bs4 import BeautifulSoup
import re

# Copy all of the content from the provided web page
webpage = urlopen('http://stats.espncricinfo.com/indian-premier-league-2012/engine/records/averages/batting.html?id=6680;type=tournament').read()
soup = BeautifulSoup(webpage)
commentary = soup.find_all("tr", "data2")
for i in range(10):
    for stat in commentary[i].stripped_strings:
        print stat,
    print ""
I am running this Python program in Eclipse. I have changed the proxy entries under Network Connections, but I still get an IOError like the following:
IOError: [Errno socket error] [Errno -2] Name or service not known
Traceback (most recent call last):
  File "/home/sumanth/workspace/python/scraping.py", line 22, in <module>
    webpage = urlopen('http://stats.espncricinfo.com/indian-premier-league-2012/engine/records/averages/batting.html?id=6680;type=tournament').read()
  File "/usr/lib/python2.7/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 344, in open_http
    h.endheaders(data)
  File "/usr/lib/python2.7/httplib.py", line 958, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 818, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 780, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 761, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
IOError: [Errno socket error] [Errno 110] Connection timed out
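
Both errors (name resolution failing, then the connection timing out) fit a script that is trying to reach the site directly instead of through the proxy. Proxy entries set in Eclipse may apply only to Eclipse itself and not to the Python process it launches. One way to test this is to give the script the proxy explicitly. The following is only a sketch: the proxy address `http://proxy.example.com:8080` and the helper name `make_proxy_opener` are placeholders, not values from the question.

```python
try:
    # Python 2, as used in the question
    from urllib2 import ProxyHandler, build_opener
except ImportError:
    # Python 3 moved these into urllib.request
    from urllib.request import ProxyHandler, build_opener


def make_proxy_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = ProxyHandler({'http': proxy_url, 'https': proxy_url})
    return build_opener(handler)


# Hypothetical proxy address; replace with the real proxy host and port.
opener = make_proxy_opener('http://proxy.example.com:8080')

# The page would then be fetched through the opener rather than urlopen:
# webpage = opener.open(url, timeout=10).read()
```

If the request succeeds with an explicit proxy, the original failure was most likely the proxy setting not reaching the script. Setting the `http_proxy` environment variable in the Eclipse run configuration should have a similar effect, since `urllib` reads it when no proxy is passed explicitly.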