
I'm trying to catch error 60 and continue running my script. This is what I'm currently doing:

import urllib2
import csv
from bs4 import BeautifulSoup


matcher = csv.reader(open('matcher.csv', "rb" ))

for i in matcher:
    url = i[1]
    if len(list(url)) > 0:
        print url
        try:
            soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))   

        except urllib2.URLError, e:
            print ("There was an error: %r" % e)

It returns this:

Traceback (most recent call last):
  File "debug.py", line 13, in <module>
    soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1180, in do_open
    r = h.getresponse(buffering=True)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 447, in readline
    data = self._sock.recv(self._rbufsize)
socket.timeout: timed out

How would I catch this error and "continue"?

2 Answers


You can import the socket module and add an except clause for socket.timeout:

import socket

try:
    soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))   

except urllib2.URLError as e:
    print ("There was an error: %r" % e)
except socket.timeout as e: # <-------- this block here
    print "We timed out"

Update: Well, I learned something new. I just found a reference to the .reason attribute of URLError:

except urllib2.URLError as e:
    if isinstance(e.reason, socket.timeout):
        pass # ignore this one
    else:
        # do stuff re other errors if you can...
        raise # otherwise propagate the error
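
Putting the two variants together, a minimal sketch of a loop that skips timed-out URLs and keeps going might look like the following. The names fetch_all, opener, and failures are illustrative assumptions, not part of the original script, and the import fallback is only there so the sketch also runs on Python 3, where urllib2 was split into urllib.request and urllib.error.

```python
import socket

try:  # Python 2, as in the question
    from urllib2 import URLError, urlopen
except ImportError:  # Python 3 equivalent locations
    from urllib.error import URLError
    from urllib.request import urlopen


def fetch_all(urls, opener=urlopen, timeout=10):
    """Return (pages, failures); any timeout or URL error skips that URL.

    `opener` is a hypothetical hook so the loop can be exercised
    without touching the network; it defaults to urlopen.
    """
    pages, failures = {}, []
    for url in urls:
        if not url:
            continue  # skip empty cells from the CSV
        try:
            pages[url] = opener(url, timeout=timeout).read()
        except socket.timeout:
            # Raised directly when reading the response times out,
            # which is the case shown in the traceback above.
            failures.append((url, "timed out"))
            continue
        except URLError as e:
            # A timeout while connecting can arrive wrapped in e.reason.
            if isinstance(e.reason, socket.timeout):
                failures.append((url, "timed out"))
            else:
                failures.append((url, repr(e.reason)))
            continue
    return pages, failures
```

In the original loop the same effect comes from handling both exception types inside the for body; once an except block finishes, the loop moves on to the next row, so an explicit continue is optional there.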
answered 2012-11-08T07:43:07.133

You can try except Exception as e: to catch all errors. Keep in mind, though, that this catches every exception, so avoid it if you only want to handle specific ones.

Edit: You can check the exception type by doing the following:

import os
import sys

# ... following the try block inside the loop:
except Exception as e:
    exc_type, exc_obj, exc_tb = sys.exc_info()
    fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
    print(exc_type, fname, exc_tb.tb_lineno)
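
As a rough, self-contained illustration of the catch-all approach, the sketch below records the exception type, file name, and line number for each failure and then moves on. The helpers describe_current_exception and run_all are hypothetical names introduced here; only sys.exc_info() and the traceback attributes come from the snippet above.

```python
import os
import sys


def describe_current_exception():
    """Return (type name, file name, line number) for the exception being handled."""
    exc_type, exc_obj, exc_tb = sys.exc_info()
    fname = os.path.basename(exc_tb.tb_frame.f_code.co_filename)
    return exc_type.__name__, fname, exc_tb.tb_lineno


def run_all(tasks):
    """Call each task in turn; report any exception and continue with the next."""
    report = []
    for task in tasks:
        try:
            task()
        except Exception:
            # Catch-all: every ordinary error ends up here, wanted or not.
            report.append(describe_current_exception())
            continue
    return report
```

Because this swallows every exception, including ones caused by bugs in the script itself, it is probably best kept for diagnostics and replaced with specific except clauses once the error types are known.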
answered 2012-11-08T07:54:59.210