python - How to parse RSS links from a page with the Python framework Scrapy (get the URL of the RSS feed)?

Question

I want to parse Google search results and, for each item in the results, get the link to its RSS feed. I am using Scrapy. I tried this construction:

...
def parse_second(self, response):
    hxs = HtmlXPathSelector(response)
    qqq = hxs.select('/html/head/link[@type=application/rss+xml]/@href').extract()
    print qqq
    item = response.request.meta['item']
    if len(qqq) > 0:
        item['rss'] = qqq.pop()
    else:
        item['rss'] = ''    
    yield item
...

But "print qqq" gives me

[]

score 1 · Accepted Answer

I found the mistake:

qqq = hxs.select("/html/head/link[@type='application/rss+xml']/@href").extract()

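This works. The attribute value in the XPath predicate has to be a quoted string literal. Without the quotes, application/rss+xml is read as an XPath expression instead of a string, so the predicate matches nothing and extract() returns an empty list.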
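For readers who want to try the quoted predicate without a running spider, here is a minimal sketch using only the standard library's xml.etree.ElementTree as a stand-in for Scrapy's selector. The sample page, the example.com URL, and the helper name find_rss_links are assumptions made for illustration, and the markup is deliberately well-formed so the XML parser accepts it; real-world HTML usually needs Scrapy's own selector instead. In current Scrapy versions the equivalent call would be response.xpath("...").getall(), since HtmlXPathSelector has long been deprecated.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; well-formed so the stdlib XML parser accepts it.
SAMPLE_PAGE = """<html>
  <head>
    <title>Example blog</title>
    <link rel="stylesheet" type="text/css" href="/style.css" />
    <link rel="alternate" type="application/rss+xml" href="http://example.com/feed.rss" />
  </head>
  <body><p>Hello</p></body>
</html>"""


def find_rss_links(page_source):
    """Return the href of every <link> in <head> whose type is RSS."""
    root = ET.fromstring(page_source)  # root is the <html> element
    # The attribute value is quoted, as in the accepted answer.
    links = root.findall("./head/link[@type='application/rss+xml']")
    return [link.get("href") for link in links if link.get("href")]


rss_links = find_rss_links(SAMPLE_PAGE)
# Same fallback as in the question: empty string when nothing is found.
rss = rss_links[-1] if rss_links else ""
print(rss)
```

The fallback mirrors the question's if/else around qqq.pop(): take the last match if there is one, otherwise store an empty string.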
