How do I scrape multiple URLs with Scrapy?
Am I forced to make multiple spiders?
class TravelSpider(BaseSpider):
    name = "speedy"
    allowed_domains = ["example.com"]
    start_urls = ["http://example.com/category/top/page-%d/" % i for i in xrange(4),"http://example.com/superurl/top/page-%d/" % i for i in xrange(55)]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        items = []
        item = TravelItem()
        item['url'] = hxs.select('//a[@class="out"]/@href').extract()
        out = "\n".join(str(e) for e in item['url'])
        print out
Python says:
NameError: name 'i' is not defined
But it works fine when I use a single URL!
start_urls = ["http://example.com/category/top/page-%d/" % i for i in xrange(4)]
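A likely cause of the NameError: two list comprehensions cannot share one pair of brackets. With the comma, Python parses the first `for` clause's iterable as the tuple `xrange(4), "http://example.com/superurl/top/page-%d/" % i`, and that `% i` is evaluated in the enclosing scope, where `i` does not exist yet. A minimal sketch of one way to combine both page ranges, building the two lists separately and concatenating them with `+` (URLs taken from the question; `range` substituted for the Python 2 `xrange` so it also runs under Python 3):

```python
# Build each URL list in its own comprehension, then join them with `+`.
start_urls = (
    ["http://example.com/category/top/page-%d/" % i for i in range(4)]
    + ["http://example.com/superurl/top/page-%d/" % i for i in range(55)]
)
```

The resulting list has 4 + 55 = 59 entries and can be assigned to `start_urls` as usual, so a single spider can cover both URL patterns.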