I added the following code to middlewares.py. I am trying to change my Tor IP address on every request.

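import random

from stem import Signal
from stem.control import Controller


def _set_new_ip():
    # Ask Tor for a new identity through its control port.
    with Controller.from_port(port=9051) as controller:
        controller.authenticate(password='tor_password')
        controller.signal(Signal.NEWNYM)


class RandomUserAgentMiddleware(object):
    def process_request(self, request, spider):
        # Read the list from the crawler settings rather than an undefined module-level `settings`.
        ua_list = spider.settings.getlist('USER_AGENT_LIST')
        if ua_list:
            request.headers.setdefault('User-Agent', random.choice(ua_list))


class ProxyMiddleware(object):
    def process_request(self, request, spider):
        _set_new_ip()
        # Route the request through the local HTTP proxy on port 8118.
        request.meta['proxy'] = 'http://127.0.0.1:8118'
        spider.log('Proxy : %s' % request.meta['proxy'])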

But when I start crawling with Scrapy, it keeps returning the following output:

2017-09-10 22:36:44 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-09-10 22:36:44 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0004)
2017-09-10 22:36:44 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
2017-09-10 22:36:44 [IT] DEBUG: Proxy : http://127.0.0.1:8118
2017-09-10 22:36:44 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 1 times): Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:44 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0003)
2017-09-10 22:36:44 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
2017-09-10 22:36:44 [IT] DEBUG: Proxy : http://127.0.0.1:8118
2017-09-10 22:36:52 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 2 times): Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:52 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0004)
2017-09-10 22:36:52 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
2017-09-10 22:36:52 [IT] DEBUG: Proxy : http://127.0.0.1:8118
2017-09-10 22:36:56 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 3 times): Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:56 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology>: Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:56 [scrapy.core.engine] INFO: Closing spider (finished)
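
The log shows "Connection refused" for requests routed through 127.0.0.1:8118 and a SocketClosed error from stem on the control connection, so one thing that may be worth checking first is whether anything is actually listening on those two local ports. The following is only a minimal diagnostic sketch using the standard library; `is_port_open` is a hypothetical helper name, not part of Scrapy or stem, and the ports are simply the ones used in the middleware above.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9051 and 8118 are the ports taken from the middleware in the question.
    for name, port in (("Tor control port", 9051), ("HTTP proxy", 8118)):
        state = "open" if is_port_open("127.0.0.1", port) else "closed"
        print("%s (127.0.0.1:%d): %s" % (name, port, state))
```

If either port is reported as closed, the corresponding service is probably not running or is configured on a different port. An open port only proves that something accepts TCP connections there, not that it is the expected service.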