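Intention:
My goal is to scrape football data from whoscored.com. The match pages (see an example here) contain a timeline view (div id = match-centre-timeline), which in turn contains the timeline handles (div class = timeline-handle). The handles can be dragged and dropped to set the time window for which match statistics are shown. For example, I would like to set the lower bound of the timeline to 10 minutes and the upper bound to 30 minutes.
Setup and assumptions:
I am using selenium with chrome on OSX 10. The code below moves the lower timeline handle, but throws a StaleElementReferenceException. My belief is that the page application gets confused by the abrupt drag and drop.
Questions:
Is there any way to simulate a slower mouse drag? Is this even the problem? How can I overcome it? Thanks in advance!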
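A minimal sketch of what a slower drag could look like: instead of a single drag_and_drop_by_offset call, the press, a series of small moves with pauses, and the release are queued separately. The helper names split_offset and slow_drag, the step count and the pause length are illustrative assumptions and not part of Selenium; whether this avoids the StaleElementReferenceException on this page is untested.

```python
def split_offset(total, steps):
    """Split an integer pixel offset into `steps` integer parts that sum to `total`."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    base, extra = divmod(abs(total), steps)
    sign = 1 if total >= 0 else -1
    # The first `extra` parts get one additional pixel so the parts add up exactly.
    return [sign * (base + (1 if i < extra else 0)) for i in range(steps)]


def slow_drag(actions, element, x_offset, y_offset=0, steps=10, pause=0.05):
    """Queue press, small moves with pauses, and release on an ActionChains-like object.

    `actions` is expected to offer click_and_hold, move_by_offset, pause and
    release, as selenium's ActionChains does. Nothing is executed until the
    caller invokes perform() on the returned object.
    """
    actions.click_and_hold(element)
    x_parts = split_offset(x_offset, steps)
    y_parts = split_offset(y_offset, steps)
    for dx, dy in zip(x_parts, y_parts):
        actions.move_by_offset(dx, dy)
        actions.pause(pause)
    actions.release()
    return actions


# Hypothetical usage with the names from the script in this post:
#   slow_drag(ActionChains(driver), tlh, 100).perform()
```

A fresh ActionChains object is passed in here so that nothing queued by an earlier perform() can be replayed along with the drag.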
Python 3 code:
from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException, TimeoutException
from selenium.webdriver import ActionChains, ChromeOptions
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

url = "https://www.whoscored.com/Matches/1190514/Live"
time_to_wait = 20
wanted_element_id = "match-centre-timeline"

opts = ChromeOptions()
opts.add_experimental_option("detach", True)
opts.add_argument("disable-infobars")
opts.add_argument("disable-notifications")
driver = webdriver.Chrome(executable_path="/Users/david/Dropbox/Code/gitCode/driver/chromedriver", chrome_options=opts)
driver.set_page_load_timeout(time_to_wait)

# keep reloading the page until the timeline element is present
example_loaded = True
while example_loaded:
    try:
        driver.get(url)
        WebDriverWait(driver, time_to_wait).until(EC.presence_of_element_located((By.ID, wanted_element_id)))
        print("[STATUS]\tFound " + wanted_element_id + ".")
        example_loaded = False
    except TimeoutException:
        print("[WARNING]\tCannot find the " + wanted_element_id + ". Retrying.")

# tlh = lower timeline handle, tuh = upper timeline handle
tlh = driver.find_element_by_xpath("//*[@id='match-centre-timeline']/div[1]/div[2]/div[2]/div[2]/div[1]/span")
tuh = driver.find_element_by_xpath("//*[@id='match-centre-timeline']/div[1]/div[2]/div[2]/div[2]/div[2]/span")
driver.execute_script("arguments[0].scrollIntoView();", tlh)

actions = ActionChains(driver)
# move to the lower timeline handle
actions.move_to_element(tlh)
actions.perform()
print("[NOTE]\t\tLower handle stats: x:{}, y:{} and h:{}px, w:{}px.".format(tlh.location['x'], tlh.location['y'], tlh.size['height'], tlh.size['width']))
print("[NOTE]\t\tUpper handle stats: x:{}, y:{} and h:{}px, w:{}px.".format(tuh.location['x'], tuh.location['y'], tuh.size['height'], tuh.size['width']))

# let's click on the lower timeline handle
try:
    actions.click_and_hold(tlh)
    actions.drag_and_drop_by_offset(tlh, 100, 0)
    actions.perform()
    print("[NOTE]\t\tLower handle stats: x:{}, y:{} and h:{}px, w:{}px.".format(tlh.location['x'], tlh.location['y'], tlh.size['height'], tlh.size['width']))
    print("[NOTE]\t\tUpper handle stats: x:{}, y:{} and h:{}px, w:{}px.".format(tuh.location['x'], tuh.location['y'], tuh.size['height'], tuh.size['width']))
except StaleElementReferenceException:
    print("[WARN]\tThrew and caught a StaleElementReferenceException.")
Console errors:
[STATUS] Found match-centre-timeline.
[NOTE] Lower handle stats: x:118, y:1341 and h:41px, w:19px.
[NOTE] Upper handle stats: x:869, y:1341 and h:41px, w:30px.
[WARN] Threw and caught a StaleElementReferenceException
See anything?
Traceback (most recent call last):
  File "whoscored-scrape.py", line 442, in <module>
    print("[NOTE]\t\tLower handle stats: x:{}, y:{} and h:{}px, w:{}px.".format( tlh.location['x'], tlh.location['y'], tlh.size['height'], tlh.size['width'] ))
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/remote/webelement.py", line 404, in location
    old_loc = self._execute(Command.GET_ELEMENT_LOCATION)['value']
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/remote/webelement.py", line 493, in _execute
    return self._parent.execute(command, params)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
  (Session info: chrome=65.0.3325.181)
  (Driver info: chromedriver=2.36.540469 (1881fd7f8641508feb5166b7cae561d87723cfa8),platform=Mac OS X 10.12.6 x86_64)
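The traceback shows the exception being raised when tlh.location is read, with the message "element is not attached to the page document". That would be consistent with the page re-rendering the handle, so that the old reference no longer points at an attached node. A minimal sketch under that assumption: look the element up again whenever the reference turns out to be stale. The helper name read_fresh and its parameters are hypothetical, and the exception type is passed in by the caller so the helper does not depend on selenium itself.

```python
def read_fresh(locate, read, stale_exceptions, attempts=3):
    """Return read(locate()), looking the element up again if it went stale.

    locate: zero-argument callable returning a fresh element reference.
    read: callable taking that element and returning the wanted value.
    stale_exceptions: exception class (or tuple of classes) that signals a
        stale reference, e.g. selenium's StaleElementReferenceException.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return read(locate())
        except stale_exceptions as error:
            # The reference was invalidated; the next pass locates it again.
            last_error = error
    raise last_error


# Hypothetical usage (LOWER_HANDLE_XPATH stands for the xpath string in the script):
#   from selenium.common.exceptions import StaleElementReferenceException
#   location = read_fresh(
#       lambda: driver.find_element_by_xpath(LOWER_HANDLE_XPATH),
#       lambda element: element.location,
#       StaleElementReferenceException,
#   )
```

If every attempt fails, the last exception is re-raised, so a handle that never reappears still surfaces as an error instead of being silently ignored.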
Supporting images: