Is there a way to have persistent request metadata in a spider? request.meta only lasts until the next callback, so I have to do something like this:

def method1(self, response):
    request = Request(url, callback=self.method2)
    request.meta['persist'] = ...

    yield request

def method2(self, response):
    ...

    request = Request(url, callback=self.method3)
    request.meta['persist'] = response.meta['persist']

    yield request

I also wrote a decorator, but I would really like a cleaner solution:

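from functools import wraps

from scrapy import Request


def persist_meta(callback):
    @wraps(callback)
    def inner(self, *args, **kwargs):
        for result in callback(self, *args, **kwargs):
            if isinstance(result, Request):
                response = args[0]

                # Copy the inherited values so sibling requests do not share
                # one dict, then let the new request's own entries override.
                persist = dict(response.meta.get('persist', {}))
                persist.update(result.meta.get('persist', {}))

                result.meta['persist'] = persist

            yield result

    return inner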

Any help is appreciated.

1 Answer

Create a new middleware and put your code in process_spider_input.
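A minimal sketch of what such a middleware could look like. Note that it uses the process_spider_output hook rather than process_spider_input, on the assumption that the merge needs both the response and the requests the callback yields, and only process_spider_output receives both. The class name PersistMetaMiddleware is hypothetical, and the request check is duck-typed so the sketch runs without Scrapy installed; in a real project isinstance(result, scrapy.Request) would be the more usual test.

```python
class PersistMetaMiddleware:
    """Spider middleware sketch (hypothetical name): copies
    response.meta['persist'] into every request a callback yields."""

    def process_spider_output(self, response, result, spider):
        # Values carried by the response that triggered this callback.
        inherited = response.meta.get('persist', {})

        for item_or_request in result:
            # Assumption: only requests expose a dict-valued `meta` attribute;
            # items (dicts or Item objects) pass through untouched.
            meta = getattr(item_or_request, 'meta', None)
            if isinstance(meta, dict):
                merged = dict(inherited)                 # copy, never share
                merged.update(meta.get('persist', {}))   # request's own values win
                meta['persist'] = merged

            yield item_or_request
```

To try it, the class would be registered in the project's SPIDER_MIDDLEWARES setting under whatever module path it lives in; callbacks then only set meta['persist'] once, on the first request, and no longer copy it by hand.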

answered 2013-03-11T16:25:25.453