python - Catching error 60 (timeout) with urllib2

Question

I am trying to catch error 60 and continue running my script. This is what I am currently doing:

import urllib2
import csv
from bs4 import BeautifulSoup


matcher = csv.reader(open('matcher.csv', "rb" ))

for i in matcher:
    url = i[1]
    if len(list(url)) > 0:
        print url
        try:
            soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))   

        except urllib2.URLError, e:
            print ("There was an error: %r" % e)

It returns this:

Traceback (most recent call last):
  File "debug.py", line 13, in <module>
    soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1180, in do_open
    r = h.getresponse(buffering=True)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 447, in readline
    data = self._sock.recv(self._rbufsize)
socket.timeout: timed out

How would I catch this error and "continue"?

score 4 · Accepted Answer

You can import the module that defines the exception (socket) and add a matching clause to your except block:

import socket

try:
    soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))   

except urllib2.URLError as e:
    print ("There was an error: %r" % e)
except socket.timeout as e: # <-------- this block here
    print "We timed out"

Update: Well, I learned something new. I just found a reference to the .reason attribute of URLError:

except urllib2.URLError as e:
    if isinstance(e.reason, socket.timeout):
        pass # ignore this one
    else:
        # do stuff re other errors if you can...
        raise # otherwise propagate the error
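
Putting the two snippets above together, here is a minimal sketch of a helper that treats both kinds of timeout the same way. Judging by the traceback in the question, the timeout there was raised directly as socket.timeout while the response was being read, whereas a timeout during connection setup usually arrives wrapped in a URLError whose .reason is the socket.timeout. The name fetch and its opener parameter are my own, added for illustration so the function can be exercised without a network. The import fallback is also an assumption, there so the sketch runs on Python 3 as well as on the Python 2 used in the question.

```python
import socket

try:
    # Python 2, as used in the question
    import urllib2
    URLError = urllib2.URLError
    urlopen = urllib2.urlopen
except ImportError:
    # Python 3 equivalents (assumption: not part of the original post)
    from urllib.error import URLError
    from urllib.request import urlopen


def fetch(url, opener=urlopen, timeout=10):
    """Return the response body, or None if the request timed out.

    `opener` is a hypothetical hook for illustration; it defaults to urlopen.
    """
    try:
        return opener(url, timeout=timeout).read()
    except URLError as e:
        if isinstance(e.reason, socket.timeout):
            return None  # timeout wrapped by urllib in a URLError
        raise  # any other URL error is propagated
    except socket.timeout:
        return None  # timeout raised directly, e.g. while reading the response
```

In the loop from the question, you could call fetch(url), and when it returns None, print a message and continue with the next row.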

score 1 · Accepted Answer

You can use "except Exception as e:" to catch all errors. Keep in mind, though, that this catches every error, so avoid it if you only want to handle specific ones.

Edit: You can check the exception type by doing the following:

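import os   # at the top of the script
import sys

except Exception as e:
    # sys.exc_info() returns the type, value, and traceback of the active exception
    exc_type, exc_obj, exc_tb = sys.exc_info()
    fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
    print(exc_type, fname, exc_tb.tb_lineno)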
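
To show how this fits into a loop that should keep going, here is a small self-contained sketch. It is only an illustration: the describe_current_exception helper and the sample values are made up, and int() stands in for the network call so the example runs anywhere.

```python
import os
import sys


def describe_current_exception():
    """Return (type name, file name, line number) for the exception being handled."""
    exc_type, exc_obj, exc_tb = sys.exc_info()
    fname = os.path.basename(exc_tb.tb_frame.f_code.co_filename)
    return exc_type.__name__, fname, exc_tb.tb_lineno


failed = []
for value in ["1", "x", "3"]:  # stand-in inputs; "x" makes int() fail
    try:
        int(value)
    except Exception:
        # Record what went wrong, then move on to the next item
        failed.append((value,) + describe_current_exception())
        continue
```

Only the failing item ends up in failed, and the loop still processes everything after it.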
