
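I am trying to scrape the country information from https://www.morningstar.com/etfs/xnas/vnqi/portfolio. This requires selecting 'Country' in the Exposure section and then moving through pages 1, 2, 3, etc. using the arrows at the bottom of that section. Nothing I have tried seems to work. Is there a way to do this with Selenium in Python?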

Thank you very much!

Here is the code I used:

    urlpage   = 'https://www.morningstar.com/etfs/xnas/vnqi/portfolio'
    driver = webdriver.Chrome(options=options, executable_path='D:\Python\Python38\chromedriver.exe')
    driver.get(urlpage)
    elements=WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//a[text()='Country']")))
    for elem in elements:
        elem.click()

Here is the error message:

TimeoutException                          

Traceback (most recent call last)  
<ipython-input-3-bf16ea3f65c0> in <module>  
    23 driver = webdriver.Chrome(options=options, executable_path='D:\Python\Python38\chromedriver.exe')  
     24 driver.get(urlpage)  
---> 25 elements=WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//a[text()='Country']")))  
     26 for elem in elements:  
     27      elem.click()  
D:\Anaconda\lib\site-packages\selenium\webdriver\support\wait.py in until(self, method, message)  
     78             if time.time() > end_time:  
     79                 break  
---> 80         raise TimeoutException(message, screen, stacktrace)  
     81   
     82     def until_not(self, method, message=''):  
TimeoutException: Message: 

Sorry, I don't know how to format the error message any better. Thanks again.


1 Answer


It seems you didn't check the HTML you actually get, so you skipped the most important step.

There is no <a> with the text Country on this page.

There is an <input> with value="Country".
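
The difference between the two locators can be checked offline. This is only a sketch: the markup below is a hypothetical, heavily simplified stand-in for the real page (which is much larger and may be structured differently), and it uses the standard library's ElementTree rather than Selenium.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page may differ.
SAMPLE = """
<div>
  <label>
    <input type="radio" name="exposure" value="Region" />
    <span>Region</span>
  </label>
  <label>
    <input type="radio" name="exposure" value="Country" />
    <span>Country</span>
  </label>
</div>
"""

root = ET.fromstring(SAMPLE)

# Equivalent of the question's //a[text()='Country']: anchor elements only.
links = [a for a in root.iter("a") if (a.text or "").strip() == "Country"]

# Equivalent of //input[@value="Country"]: input elements with that value.
inputs = root.findall(".//input[@value='Country']")

print(len(links), len(inputs))
```

With markup like this, the first locator matches nothing, which is why the wait in the question runs until it times out, while the second locator finds the radio input.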


This code works for me:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://www.morningstar.com/etfs/xnas/vnqi/portfolio'

driver = webdriver.Chrome()
driver.get(url)

time.sleep(2)  # crude wait for the page to render

# the "Country" option is an <input>, not an <a>
country = driver.find_element(By.XPATH, '//input[@value="Country"]')
country.click()

time.sleep(1)
next_page = driver.find_element(By.XPATH, '//a[@aria-label="Go to Next Page"]')

while True:

    # get data
    table_rows = driver.find_elements(By.XPATH, '//table[@class="sal-country-exposure__country-table"]//tr')
    for row in table_rows[1:]:  # skip header
        elements = row.find_elements(By.XPATH, './/span')  # relative xpath with `.//`
        print(elements[0].text, elements[1].text, elements[2].text)

    # check if there is next page
    # (get_attribute returns a string or None, so compare with 'true')
    disabled = next_page.get_attribute('aria-disabled')
    #print('disabled:', disabled)
    if disabled == 'true':
        break

    # go to next page
    next_page.click()

    time.sleep(1)
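
A note on the stop condition: get_attribute returns a string (or None when the attribute is absent), so a plain truthiness test would also treat the string "false" as "disabled". The sketch below shows the difference without a browser. The helper name is_disabled is hypothetical, and it assumes the button uses the standard ARIA values "true" and "false".

```python
def is_disabled(attr_value):
    """Return True only when an aria-disabled value is the string 'true'."""
    return attr_value is not None and attr_value.strip().lower() == "true"


# Compare a plain truthiness test with the explicit check.
for value in (None, "false", "true"):
    print(repr(value), bool(value), is_disabled(value))
```

On this page the attribute appears to be absent while more pages remain, so both tests would behave the same here. The explicit comparison is just the safer habit.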

Result:

Japan 22.08 13.47
China 10.76 1.45
Australia 9.75 6.05
Hong Kong 9.52 6.04
Germany 8.84 5.77
Singapore 6.46 4.33
United Kingdom 6.22 5.77
Sweden 3.48 2.00
France 3.18 2.58
Canada 2.28 2.92
Switzerland 1.78 0.69
Belgium 1.63 1.31
Philippines 1.53 0.15
Israel 1.47 0.16
Thailand 0.98 0.09
India 0.87 0.11
South Africa 0.87 0.21
Taiwan 0.83 0.08
Mexico 0.80 0.33
Spain 0.62 0.84
Malaysia 0.54 0.08
Brazil 0.52 0.06
Austria 0.51 0.16
New Zealand 0.41 0.21
Indonesia 0.37 0.02
Norway 0.37 0.29
United States 0.29 44.09
Netherlands 0.24 0.19
Chile 0.21 0.01
Ireland 0.16 0.19
South Korea 0.15 0.00
Turkey 0.08 0.02
Russia 0.08 0.00
Finland 0.06 0.16
Poland 0.05 0.00
Greece 0.05 0.00
Italy 0.02 0.05
Argentina 0.00 0.00
Colombia 0.00 0.00
Czech Republic 0.00 0.00
Denmark 0.00 0.00
Estonia 0.00 0.00
Hungary 0.00 0.00
Latvia 0.00 0.00
Lithuania 0.00 0.00
Pakistan 0.00 0.00
Peru 0.00 0.00
Portugal 0.00 0.00
Slovakia 0.00 0.00
Venezuela 0.00 0.00
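
If the values are needed as numbers rather than printed text, each output line can be split from the right, so that country names containing spaces stay intact. This is a minimal sketch: parse_row is a hypothetical helper, and the two numeric fields are left unnamed because the column headers are not shown above.

```python
def parse_row(line):
    """Split a line such as 'Hong Kong 9.52 6.04' into (name, float, float)."""
    # Split at most twice from the right; the remainder is the country name.
    name, first, second = line.rsplit(maxsplit=2)
    return name, float(first), float(second)


sample = ["Japan 22.08 13.47", "Hong Kong 9.52 6.04", "United States 0.29 44.09"]
rows = [parse_row(line) for line in sample]
for row in rows:
    print(row)
```
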
Answered 2021-02-06T14:16:49.120