I am trying to extract data from https://www.marinetraffic.com/en/ais/details/ships/imo:9829069/ with the following Scrapy spider, which saves the response to file.html:
# -*- coding: utf-8 -*-
import scrapy
from fake_useragent import UserAgent


class MarinetrafficSpider(scrapy.Spider):
    name = 'marinetraffic'
    allowed_domains = ['marinetraffic.com']
    ua = UserAgent()
    ua.update()

    def start_requests(self):
        urls = [
            'https://www.marinetraffic.com/en/ais/details/ships/imo:9829069/'
        ]
        headers = {'User-Agent': self.ua['google chrome']}
        for url in urls:
            yield scrapy.Request(url, callback=self.parse, headers=headers)

    def parse(self, response):
        with open('file.html', 'wb') as f:
            f.write(response.body)
        self.log('Saved file')
But I do not get the expected response. The response that was returned is in file.html.
Please check the debug output.
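To make the debug output easier to compare with what the browser receives, a small helper along these lines could be called from parse. This is only a sketch: summarize_response is a hypothetical name that is not part of Scrapy, and it assumes the status code, the Content-Type header, and the raw body are enough to tell a block page from the real page.

```python
import hashlib
from typing import Optional, Union


def summarize_response(status: int,
                       content_type: Optional[Union[bytes, str]],
                       body: bytes) -> str:
    """Return a one-line summary of a response for the debug log.

    Hypothetical helper, not a Scrapy API. Scrapy header values are
    bytes, so they are decoded before being formatted.
    """
    if isinstance(content_type, bytes):
        content_type = content_type.decode('latin-1')
    # A short hash makes it easy to see whether two bodies differ.
    digest = hashlib.sha256(body).hexdigest()[:12]
    return 'status=%d content_type=%s bytes=%d sha256=%s' % (
        status, content_type or 'unknown', len(body), digest)


# Possible use inside the spider's parse method:
#     self.log(summarize_response(response.status,
#                                 response.headers.get('Content-Type'),
#                                 response.body))
```

A status other than 200, or a body much shorter than the page saved from the browser, would suggest that the server is answering the spider differently from the browser.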
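What changes do I need to make to the code above so that the returned response is the same as the one I get from the browser?
I would appreciate your comments.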