
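I want to share how to retrieve the content of an HTML page after it has been modified by AJAX.

The following code returns the old page, without the AJAX changes.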

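import java.io.IOException;
import java.net.MalformedURLException;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Test {

    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException {
        String url = "valid html page";
        WebClient client = new WebClient(BrowserVersion.FIREFOX_17);
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setRedirectEnabled(true);
        client.getOptions().setThrowExceptionOnScriptError(true);
        client.getOptions().setCssEnabled(true);
        client.getOptions().setUseInsecureSSL(true);
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        client.setAjaxController(new NicelyResynchronizingAjaxController());
        HtmlPage page = client.getPage(url);
        // Prints the page without the changes made by AJAX.
        System.out.println(page.getWebResponse().getContentAsString());
    }

}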

What is happening here?

1 Answer

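The answer is that page.getWebResponse() gives you the initial page: the content exactly as the server returned it, before any script changed it.

To get the updated content, we have to use the page variable itself (for example page.asXml()):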

package utils;

import java.io.IOException;
import java.net.MalformedURLException;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Test {

    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException {
        String url = "valid html page";
        WebClient client = new WebClient(BrowserVersion.FIREFOX_17);
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setRedirectEnabled(true);
        client.getOptions().setThrowExceptionOnScriptError(true);
        client.getOptions().setCssEnabled(true);
        client.getOptions().setUseInsecureSSL(true);
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        client.setAjaxController(new NicelyResynchronizingAjaxController());
        HtmlPage page = client.getPage(url);
        // Updated content: the page object serialized in its current state.
        System.out.println(page.asXml());
        // Initial content: the response body as received from the server (kept for comparison).
        System.out.println(page.getWebResponse().getContentAsString());
    }

}
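
To see why the two calls can print different things, here is a small analogy that uses only the JDK's XML DOM classes, not HtmlUnit. It is just a sketch: the class name RawVersusDom, the method renderAfterUpdate and the sample markup are made up for illustration, and the setTextContent call merely stands in for whatever an AJAX callback would do to the page.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class RawVersusDom {

    /**
     * Parses the raw markup into a DOM tree, changes the tree the way a script
     * might, and returns the serialized tree. The raw string is never modified.
     */
    static String renderAfterUpdate(String raw) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(raw.getBytes(StandardCharsets.UTF_8)));

            // Stand-in for an AJAX callback (hypothetical): update the first <div>.
            Element status = (Element) dom.getElementsByTagName("div").item(0);
            status.setTextContent("loaded");

            // Serialize the current state of the tree as XML, without a declaration.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or serialize the sample markup", e);
        }
    }

    public static void main(String[] args) {
        // Plays the role of the response body as it arrived from the server.
        String raw = "<html><body><div id=\"status\">loading</div></body></html>";

        System.out.println("raw response: " + raw);                     // still says "loading"
        System.out.println("current DOM:  " + renderAfterUpdate(raw));  // now says "loaded"
    }
}
```

In this analogy the raw string corresponds to what page.getWebResponse().getContentAsString() returns, and the serialized tree corresponds to page.asXml().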

I found the hint at the following link:

http://htmlunit.10904.n7.nabble.com/Not-expected-result-code-from-htmlunit-td28275.html

Ahmed Ashour wrote: Hi, you should not use WebResponse; it is meant to get the actual content from the server. You should use htmlPage.asText() or .asXml(). Yours, Ahmed

answered 2013-05-19T12:13:22.660