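I'm trying to learn web scraping with Scrapy and am using a car classifieds website as the subject, to look at its countermeasures. I know the X-AjaxPro-Method exists, because Chrome Developer Tools shows the header being passed and a correct response coming back. But when I do the same thing in the Scrapy shell, I get "This method is either not marked with an AjaxMethod or is not available."
Here are the shell commands used: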
>>> from scrapy.http import FormRequest
>>> request = FormRequest(url='https://www.carwale.com/ajaxpro/CarwaleAjax.AjaxClassifiedBuyer,Carwale.ashx',headers={"X-AjaxPro-Method":"ProcessUsedCarPurchaseInquiry","Content-Type":"application/x-www-form-urlencoded; charset=UTF-8","X-Requested-With":"XMLHttpRequest"},formdata={"profileId":"D1249107","buyerName":"","buyerEmail":"","buyerMobile":"9938223299","carModel":"","makeYear":"","pageUrl":"https://www.carwale.com/used/cars-in-karnal/chevrolet-enjoy-d1249107/?rk","isP":"False","transToken":"","ltsrc":"","buyerSourceId":"4","comments":"","cwc":"buJNfItyQKBP8a3OahoJsOOmg","utma":"\"52149691.1076750176.1492103717.1492447801.1492447801.8\"","utmz":"\"52149691.1492103720.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)\"","originId":"3","isFromCaptcha":"","isGSDClick":"","isRecommended":"","isCertificationDownload":""})
>>> fetch(request)
2017-04-18 08:45:32 [scrapy.core.engine] DEBUG: Crawled (200) <POST https://www.carwale.com/ajaxpro/CarwaleAjax.AjaxClassifiedBuyerCarwale,Carwale.ashx> (referer: None)
>>> print(response.body)
{"error":{"Message":"This method is either not marked with an AjaxMethod or is not available.","Type":"System.NotSupportedException"}}
>>>
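The original page is at https://www.carwale.com/used/cars-in-karnal/chevrolet-enjoy-d1249107/?rk=69&isP=false, and a mobile number has to be entered to get the "Seller Details".
So, I have dug a little deeper and have more information to share. I was able to export the XHR as a curl command using the browser's developer tools and then trim it down. As far as I can tell, the only header required is X-AjaxPro-Method, because the curl command works with just this header and the data.
I can also get it to work using the Python requests library.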
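For reference, here is a minimal sketch of roughly what the requests version looks like. It assumes the same URL and form fields as the Scrapy shell session above and sends only the X-AjaxPro-Method header. The names `URL`, `HEADERS`, `FORM_DATA`, `send_inquiry` and the `timeout` value are just illustrative choices, not anything the site requires. requests form-encodes a dict passed as `data` and sets the Content-Type itself, so that header is left out here.

```python
# Sketch only: mirrors the Scrapy shell request, but sent with requests.
URL = "https://www.carwale.com/ajaxpro/CarwaleAjax.AjaxClassifiedBuyer,Carwale.ashx"

# The single header that the trimmed curl command still needed.
HEADERS = {"X-AjaxPro-Method": "ProcessUsedCarPurchaseInquiry"}

# Form fields copied from the FormRequest in the shell session.
FORM_DATA = {
    "profileId": "D1249107",
    "buyerName": "",
    "buyerEmail": "",
    "buyerMobile": "9938223299",
    "carModel": "",
    "makeYear": "",
    "pageUrl": "https://www.carwale.com/used/cars-in-karnal/chevrolet-enjoy-d1249107/?rk",
    "isP": "False",
    "transToken": "",
    "ltsrc": "",
    "buyerSourceId": "4",
    "comments": "",
    "cwc": "buJNfItyQKBP8a3OahoJsOOmg",
    # These two values carry literal double quotes, as in the original request.
    "utma": '"52149691.1076750176.1492103717.1492447801.1492447801.8"',
    "utmz": '"52149691.1492103720.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)"',
    "originId": "3",
    "isFromCaptcha": "",
    "isGSDClick": "",
    "isRecommended": "",
    "isCertificationDownload": "",
}


def send_inquiry(timeout=10):
    """POST the form with requests and return the response body as text."""
    # Third-party dependency, imported here so the constants above can be
    # inspected even where requests is not installed.
    import requests

    response = requests.post(URL, headers=HEADERS, data=FORM_DATA, timeout=timeout)
    response.raise_for_status()
    return response.text
```

Running `print(send_inquiry())` performs the actual POST, so it needs network access and depends on the site still behaving as it did when I tested.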