from urllib import urlopen
from bs4 import BeautifulSoup
import re

# Copy all of the content from the provided web page
webpage = urlopen('http://stats.espncricinfo.com/indian-premier-league-2012/engine/records/averages/batting.html?id=6680;type=tournament').read()

soup = BeautifulSoup(webpage)

# Each row of batting statistics is a <tr class="data2"> element
commentary = soup.find_all("tr", "data2")

# Print the first ten rows, one row per line
for i in range(10):
    for stat in commentary[i].stripped_strings:
        print stat,
    print ""

I am running this Python program in Eclipse. I have changed the proxy entries in the network connection settings, but I still get an IOError, as follows:

IOError: [Errno socket error] [Errno -2] Name or service not known

Traceback (most recent call last):
  File "/home/sumanth/workspace/python/scraping.py", line 22, in <module>
    webpage = urlopen('http://stats.espncricinfo.com/indian-premier-league-2012/engine/records/averages/batting.html?id=6680;type=tournament').read()
  File "/usr/lib/python2.7/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 344, in open_http
    h.endheaders(data)
  File "/usr/lib/python2.7/httplib.py", line 958, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 818, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 780, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 761, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
IOError: [Errno socket error] [Errno 110] Connection timed out

1 Answer

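It looks like you have an unstable Internet connection. The two errors come from different stages of the request: "Name or service not known" means the DNS lookup for the page's host failed, while "Connection timed out" means the DNS lookup succeeded but the remote server could not be reached.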
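To see which of the two stages is failing on a given run, the lookup and the connection can be tried separately. The following is only a minimal sketch using the standard `socket` module; the helper names `classify_error` and `check_host`, the port 80, and the 5-second timeout are assumptions made for illustration and are not part of the original script. It is written to work on both Python 2.7 and Python 3.

```python
import errno
import socket


def classify_error(exc):
    """Map a socket-level exception to a rough failure stage (hypothetical helper)."""
    if isinstance(exc, socket.gaierror):
        # Address lookup failed, e.g. [Errno -2] Name or service not known
        return "dns"
    if isinstance(exc, socket.timeout) or getattr(exc, "errno", None) == errno.ETIMEDOUT:
        # The name resolved, but the server did not answer in time,
        # e.g. [Errno 110] Connection timed out
        return "timeout"
    return "other"


def check_host(host, port=80, timeout=5.0):
    """Resolve the host, then open a TCP connection; report where it fails."""
    try:
        socket.getaddrinfo(host, port)  # DNS stage
        conn = socket.create_connection((host, port), timeout)  # connect stage
        conn.close()
    except socket.error as exc:
        return classify_error(exc)
    return "ok"


if __name__ == "__main__":
    print(check_host("stats.espncricinfo.com"))
```

Running it a few times in a row should give a hint: if the result alternates between "dns", "timeout", and "ok", the connection (or the proxy in front of it) is probably flaky rather than the script being wrong.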

Answered 2013-04-19T11:49:34.747