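I have a long-running process that crashes roughly once every three days because httplib raises BadStatusLine during an HTTP connection, after the connection is established but before any data is received. I have tried wrapping the call in try/except, but the exception still surfaces in a stack trace and stops the process anyway.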

#supporting code included for clarity
import urllib2
from httplib import BadStatusLine, HTTPException
import eventlet
sem = eventlet.semaphore.Semaphore(SIMULTENEOUS)

#problem code, running in one of many qthreads downloading various pages.
try:
    sem.acquire()
    eventlet.sleep(HIT_DELAY)
    lphtml = urllib2.urlopen(list_page_url).read()
    sem.release()
except (urllib2.URLError, urllib2.HTTPError, HTTPException, BadStatusLine) as e:
    sem.release()
    pipe.log.error("Could not download product list page %s\n%s" % (str(e), list_page_url))
    continue

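I'm using a semaphore because I don't want my code to hit the site more than once per second (and, for other reasons in the code, I don't want to get rid of eventlet).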

Eventually, the call to urllib2.urlopen raises BadStatusLine, but it is not caught, and the semaphore is never released. Here is the resulting stack trace.

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.16-py2.6.egg/eventlet/greenpool.py", line 80, in _spawn_n_impl
    func(*args, **kwargs)
  File "/home/myself/secret_filename.py", line 52, in poll_feed_hourly
    lphtml = urllib2.urlopen(list_page_url).read()
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1170, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1143, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 990, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 355, in _read_status
    raise BadStatusLine(line)
BadStatusLine

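Could my odd use of qthreads be preventing BadStatusLine from ever reaching the except clause? Is there somewhere I can insert a timeout so that the except block is eventually reached?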

2 Answers

If the only problem is releasing the semaphore, why not use try/finally?

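# Acquire before entering try, so release() only runs if acquire() succeeded.
sem.acquire()
try:
    eventlet.sleep(HIT_DELAY)
    lphtml = urllib2.urlopen(list_page_url).read()
finally:
    # Runs whether or not urlopen raises, so the semaphore is always released.
    sem.release()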
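A minimal sketch of how try/finally can be combined with an except clause so the error is still logged and the semaphore is still released. It is only an illustration under stated assumptions: `threading.Semaphore` stands in for `eventlet.semaphore.Semaphore`, and `fetch_guarded`, `failing_fetch`, and the example URL are hypothetical names that do not appear in the question. Catching `Exception` broadly is also an assumption made for the sketch, not a recommendation from the original answer.

```python
import threading

try:
    from httplib import BadStatusLine  # Python 2
except ImportError:
    from http.client import BadStatusLine  # Python 3

# Assumption: a plain threading semaphore stands in for eventlet's.
sem = threading.Semaphore(1)


def fetch_guarded(fetch, url):
    """Return fetch(url), or None on failure; sem is released either way."""
    sem.acquire()
    try:
        return fetch(url)
    except Exception as e:
        # Hypothetical logging; the question uses pipe.log.error instead.
        print("Could not download %s: %r" % (url, e))
        return None
    finally:
        sem.release()


def failing_fetch(url):
    # Hypothetical fetcher that simulates the failure from the traceback.
    raise BadStatusLine("")


result = fetch_guarded(failing_fetch, "http://example.invalid/list")
# Non-blocking acquire succeeds only if the semaphore was released above.
released = sem.acquire(False)
```

Because the release lives in the finally clause, it runs on the success path, on the handled-error path, and on any exception the except clause does not match.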
answered 2013-01-23T19:46:08.280

Try using

from eventlet.green.httplib import HTTPException

instead of

from httplib import BadStatusLine, HTTPException

Note: httplib.BadStatusLine is a subclass of httplib.HTTPException ( http://docs.python.org/2/library/httplib.html#httplib.BadStatusLine ), so BadStatusLine will be caught as well.
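The subclass relationship can be checked directly. The sketch below uses the standard-library module only (falling back to `http.client`, the Python 3 name for `httplib`), so it says nothing about whether eventlet's green module exposes the same class objects; the `caught` flag is a hypothetical name used for illustration.

```python
try:
    from httplib import BadStatusLine, HTTPException  # Python 2
except ImportError:
    from http.client import BadStatusLine, HTTPException  # Python 3

# BadStatusLine derives from HTTPException in the standard library.
is_subclass = issubclass(BadStatusLine, HTTPException)

# So an except clause naming only HTTPException also handles BadStatusLine.
caught = False
try:
    raise BadStatusLine("")
except HTTPException:
    caught = True
```

An except clause matches by class identity, so this only holds when the class named in the except clause is the same object (or a base of the object) that the raising code uses.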

answered 2014-01-30T10:02:12.773