url - Scan all the links (URLs) in a website and use each of them in the Selenium get() method

Question

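I need to get all the URLs in a website so that I can open each page with the Selenium get() method. After opening a page, I plan to extract a small amount of data from it and then move on to the next link.

Can you help me find the best way to do this, and provide some sample code?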

score 0 · Accepted Answer

There may be other ways to do this, but this is the first approach that came to mind.

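    // Imports this snippet needs (place them at the top of your class file).
    import java.util.ArrayList;
    import java.util.List;
    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.WebElement;
    import org.openqa.selenium.firefox.FirefoxDriver;

    // Create your driver of choice
    WebDriver fDriver = new FirefoxDriver();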

    // Direct the driver to the site that you want to get all the links from
    fDriver.get("Site URL here...");

    // Grab all the anchor tags on the page you're currently on.
    List<WebElement> anchors = fDriver.findElements(By.tagName("a"));

    // Create a 2nd List to hold the URLs of the anchor tags.
    List<String> allURLs = new ArrayList<String>();

    // Iterate through all the anchors that you got.
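    for (WebElement a : anchors) {

        // Read the href once; anchors without an href attribute return null.
        String href = a.getAttribute("href");
        if (href == null) {
            continue;
        }

        // Print out the URL of the anchor.
        System.out.println(href);

        // Store the URL in the list.
        allURLs.add(href);
    }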

    // Now just get the URL you want to use from the list...
    String siteURL = allURLs.get(0);
    // and enter it into the get() method of the driver.
    fDriver.get(siteURL);

The second part, using every URL with the get() method, is up to you, but I think this gives you a good start. Hope it helps!
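
For that second part, here is a minimal sketch of one way the loop could look. It is not a definitive implementation. `LinkVisitor`, `cleanUrls` and `visitAll` are hypothetical names made up for this example, not part of Selenium. The sketch assumes you only want http(s) links, want each URL opened once, and would rather skip a page that fails to load than stop the whole run. The page-opening step is passed in as a callback so the helper itself does not depend on a driver.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of the Selenium API.
class LinkVisitor {

    // Drop null hrefs, keep only http(s) links, and remove duplicates
    // while preserving the order in which the links appeared on the page.
    static List<String> cleanUrls(List<String> hrefs) {
        Set<String> unique = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null) {
                continue;
            }
            String url = href.trim();
            if (url.startsWith("http://") || url.startsWith("https://")) {
                unique.add(url);
            }
        }
        return new ArrayList<>(unique);
    }

    // Open each URL in turn through the supplied callback and return how many
    // were opened without an exception. A failing URL is logged and skipped.
    static int visitAll(List<String> urls, Consumer<String> open) {
        int opened = 0;
        for (String url : urls) {
            try {
                open.accept(url);
                // Scrape the data you need from the loaded page here.
                opened++;
            } catch (RuntimeException e) {
                System.err.println("Skipping " + url + ": " + e.getMessage());
            }
        }
        return opened;
    }
}
```

With the `allURLs` list and `fDriver` from the snippet above, the call would be `LinkVisitor.visitAll(LinkVisitor.cleanUrls(allURLs), fDriver::get);`. Collecting the URLs as strings before navigating matters here: the `WebElement` objects found on the first page become stale once the driver loads another page, while the strings stay valid.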
