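I'm just starting to learn Scrapy and would like to know how to collect information about every school, state by state, into an Excel file. Each state is a link to another page, and I'm not sure how to write the XPath for that. Please advise.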
https://www.raise.me/high-school
import scrapy


class RaisemeSpider(scrapy.Spider):
    name = 'raiseme'
    # allowed_domains takes bare domain names, not URLs or paths
    allowed_domains = ['raise.me']
    start_urls = ['http://raise.me/high-school/']

    def parse(self, response):
        # Grab the text of the first <h1> on the page
        h1_tag = response.xpath('//h1/text()').extract_first()
        yield {'H1 Tag': h1_tag}