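Case: I am trying to extract the number of result pages from a website. I use the following code to run a search and apply filters on the page: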

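 import time
 import urllib2

 from bs4 import BeautifulSoup  # assuming BeautifulSoup 4
 from selenium import webdriver
 from selenium.webdriver.common.keys import Keys

 # `url` (the site's search page) is defined elsewhere
 fp = webdriver.FirefoxProfile()
 fp.set_preference("javascript.enabled", True)
 b = webdriver.Firefox(firefox_profile=fp)
 b.get(url)
 time.sleep(10)  # wait for the page to load
 # search for "dove"
 search = b.find_element_by_name("rb")
 search.clear()
 search.send_keys('dove')
 search.send_keys(Keys.ESCAPE)
 search.submit()
 # tick the two filters (shampoo and conditioner)
 shampoo_sel = b.find_element_by_id('flt-46')
 shampoo_sel.click()
 conditioner_sel = b.find_element_by_id('flt-47')
 conditioner_sel.click()
 time.sleep(5)  # wait for the filtered results to render
 # download the current URL again and look for the pagination element
 search_url = b.current_url
 dp = urllib2.urlopen(search_url).read()
 dp_soup = BeautifulSoup(dp, "html.parser")
 search_page_num = dp_soup.find("li", {"id": "pagContinue"})
 print search_page_num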

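I tried downloading the page from the current URL, but the URL is identical before and after the filters are applied, so the downloaded page does not contain the page count for the filtered results. What should I do in this case?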
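
One direction that may be worth trying, sketched here under assumptions: since the filters do not change the URL, a fresh download of `b.current_url` presumably returns the unfiltered page, whereas the browser's own `b.page_source` typically reflects the page as it is currently rendered. The sketch below is Python 3 and uses only the standard library so that it runs on its own. `ElementTextById` and `pagination_text` are hypothetical helper names, not part of Selenium or BeautifulSoup, and the same HTML string could just as well be handed to BeautifulSoup as in the code above.

```python
from html.parser import HTMLParser


class ElementTextById(HTMLParser):
    """Collect the text inside the first element whose id matches target_id."""

    # Void elements never get a closing tag, so they must not change depth.
    VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
            "link", "meta", "source", "track", "wbr"}

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # greater than 0 while inside the target element
        self.found = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        if self.depth:
            self.depth += 1
        elif not self.found and dict(attrs).get("id") == self.target_id:
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in self.VOID:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def pagination_text(html, element_id="pagContinue"):
    """Return the whitespace-normalised text of the element, or None if absent."""
    parser = ElementTextById(element_id)
    parser.feed(html)
    parser.close()
    if not parser.found:
        return None
    return " ".join("".join(parser.chunks).split())


# Hypothetical usage with the driver from the question:
#     html = b.page_source          # HTML as rendered in the browser
#     print(pagination_text(html))
```

If only that one element is needed, calling `b.find_element_by_id("pagContinue")` on the driver and reading its `.text` might be enough, with no separate HTML parsing step. Whether either approach shows the filtered page count depends on how the site updates its pagination, which I have not verified.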
