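I am using tornado to fetch many web pages asynchronously through an HTTP proxy. Many of my fetches fail with errors (my proxies are unreliable), and I want to retry them immediately with a different proxy. Here is an example: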

from tornado import ioloop
from tornado import httpclient

def handle_request(response):
    if response.error:
        print "Error:", response.error
        # HERE I want to put my retry with another proxy
    else:
        print response.body
    ioloop.IOLoop.instance().stop()

http_client = httpclient.AsyncHTTPClient()
http_client.fetch("http://www.google.com/", handle_request)
ioloop.IOLoop.instance().start()

But how can I add a new fetch event to the current loop from inside handle_request? Also, how can I pass variables to handle_request (a list of all my proxies)?

1 Answer

You asked two questions.

I would consider using partial: http://docs.python.org/library/functools.html#partial-objects

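from functools import partial

PROXIES = [A, B, C, D]  # As appropriate
...
def handle_request(proxies, response):
    # proxies is bound by partial(); it holds the proxies that have not been tried yet
    if response.error and proxies:
        # Retry the same URL and hand the remaining proxies to the next callback
        return http_client.fetch(response.request.url, partial(handle_request, proxies[1:]))
    # Now handle the case that you have a good result or you're out of proxies

# Pass a copy so the module-level list is never modified
http_client.fetch("http://www.google.com/", partial(handle_request, PROXIES[:]))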
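
To see the partial-based retry chain run without a network or tornado installed, here is a minimal sketch. fake_fetch, fetch_with_retries, the dict-shaped response and the "proxy-a"/"proxy-b"/"proxy-c" names are all made up for illustration; only the use of functools.partial to carry the remaining proxies into the callback comes from the answer above.

```python
from functools import partial

def fake_fetch(url, callback, proxy, bad_proxies=("proxy-a", "proxy-b")):
    """Hypothetical stand-in for an async fetch: any proxy in bad_proxies fails."""
    error = "connection failed" if proxy in bad_proxies else None
    callback({"url": url, "proxy": proxy, "error": error})

results = []

def handle_request(proxies, response):
    # proxies is bound by partial(): the proxies that have not been tried yet
    if response["error"] and proxies:
        # Retry the same URL with the next proxy, passing the rest along
        return fake_fetch(response["url"],
                          partial(handle_request, proxies[1:]),
                          proxies[0])
    # Either a good response, or the list of proxies is exhausted
    results.append(response)

def fetch_with_retries(url, proxies):
    # First attempt uses proxies[0]; the callback remembers the remainder
    fake_fetch(url, partial(handle_request, proxies[1:]), proxies[0])

fetch_with_retries("http://www.google.com/", ["proxy-a", "proxy-b", "proxy-c"])
print(results[-1])
```

Each callback only ever sees the proxies still left, so there is no shared mutable state between concurrent requests.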

Of course, another option is to make it an object.

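class ProxyRequest(object):
    PROXIES = [A, B, C]  # assumed here to be (host, port) pairs

    def __init__(self, url):
        self.url = url
        self.proxies = self.PROXIES[:]  # per-request copy of the proxies left to try
        self.fetch()

    def fetch(self):
        # Take the next proxy off the front of the list
        p, self.proxies = self.proxies[0], self.proxies[1:]

        # proxy_host/proxy_port are request options honoured by tornado's curl_httpclient
        http_client.fetch(self.url, self.handle, proxy_host=p[0], proxy_port=p[1])

    def handle(self, response):
        if response.error:
            if self.proxies:
                return self.fetch()
            else:
                ...error case...

        ...stop the ioloop if you want...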
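
For what it's worth, newer Tornado releases dropped the callback argument, and fetch returns something you await instead, so the retry turns into a plain loop. Below is a rough sketch of that shape using only asyncio. fetch_via_proxy, FetchError, fetch_with_fallback and the "bad-1"/"good-1" proxy names are hypothetical stand-ins, not Tornado APIs; in real code the body of fetch_via_proxy would be the actual proxied fetch.

```python
import asyncio

class FetchError(Exception):
    """Hypothetical error type raised when a proxied fetch fails."""

async def fetch_via_proxy(url, proxy):
    # Stand-in for a real proxied fetch: proxies named "bad..." always fail
    await asyncio.sleep(0)
    if proxy.startswith("bad"):
        raise FetchError("proxy %s failed" % proxy)
    return "body of %s via %s" % (url, proxy)

async def fetch_with_fallback(url, proxies):
    # Try each proxy in order and return the first successful body
    last_error = None
    for proxy in proxies:
        try:
            return await fetch_via_proxy(url, proxy)
        except FetchError as exc:
            last_error = exc
    raise FetchError("all proxies failed for %s (last error: %s)" % (url, last_error))

async def main():
    urls = ["http://www.google.com/", "http://www.example.com/"]
    proxies = ["bad-1", "good-1"]
    # Each URL retries independently while the fetches run concurrently
    return await asyncio.gather(*(fetch_with_fallback(u, proxies) for u in urls))

pages = asyncio.run(main())
```

The proxy list is just a local argument here, which answers the "how do I pass variables" part without partial or a class.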
answered 2012-06-12T18:05:50.347