I am trying to crawl this page https://www.stickyguide.com/dispensaries/leaf-lab/ using scrapy. I am now having trouble crawling reviews from this page for a long time. If any one has any experience dealing with Ajax or Javascript, please share your thoughts.
1) I can easily get the Xpath for the review:
response.xpath('//*[@id="reviews_section"]/div')
However, I believe the review part of the page is loaded by javascript. Every time when I crawled this page, I got the following value of Xpath:
<Selector xpath='//*[@id="reviews_section"]/div' data=u'<div id="loader">\n<div class="loader"></'>
If there any method I can use to assure that scrapy crawls before javascript has been loaded? When I looked up the method online, using selenium package may be a solution, but it may be not efficient.
2) Another problem I met is that I only want to crawl the data from dispensaries. I need to choose the option "VIEW: Dispensary Only" from the dropdown menu next to the Review module. I took a look at the HTML code and it tends out to be an Ajax object.
<select id="sort" name="sort" onchange="new Ajax.Request('/update_reviews_section/2487', {asynchronous:true, evalScripts:true, parameters:'sort_by=' + $('sort').value + '&authenticity_token=' + encodeURIComponent('69BgJpnnj0tx/0lYwjIk75iFj0/l2R9EDj1No1FJX9o=')})">
If there any method I can use to request the content of the option "VIEW: Dispensary Only"? I have tried a lot of methods on stackoverflow but I still can't work this out.
Thank you in advance