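I wrote a program that scrapes the source code of a web page after clicking a button. I can't get the correct page because I believe an AJAX request is being sent and I am not waiting for its response. My code is currently: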
import java.io.IOException;
import java.util.logging.Level;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;

public class Htmlunitscraper {
    private static String s = "http://cpdocket.cp.cuyahogacounty.us/SheriffSearch/results.aspx?q=searchType%3dSaleDate%26searchString%3d10%2f21%2f2013%26foreclosureType%3d%27NONT%27%2c+%27PAR%27%2c+%27COMM%27%2c+%27TXLN%27";

    public static String scrapeWebsite() throws IOException {
        // Silence HtmlUnit's logging output
        java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);
        System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog");
        final WebClient webClient = new WebClient();
        final HtmlPage page = webClient.getPage(s);
        final HtmlForm form = page.getForms().get(2);
        // Click the ">" (next page) button
        final HtmlSubmitInput button = form.getInputByValue(">");
        final HtmlPage page2 = button.click();
        String originalHtml = page2.refresh().getWebResponse().getContentAsString();
        return originalHtml;
    }
}
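After reading this link, I believe I can solve this by calling "webClient.waitForBackgroundJavaScript(10000)". The only problem is that I don't understand how to do this, because each time I click the button I get back an HtmlPage object, not a WebClient object. How can I incorporate this method to fix the problem?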
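
For illustration, here is a minimal sketch of one way the wait might be wired in. It is not a verified fix for this site. It relies on the observation that the WebClient created at the top of scrapeWebsite() is still in scope after button.click(), so waitForBackgroundJavaScript can be called on that same instance. The class name HtmlunitScraperWithWait, the method name scrapeAfterClick, and the constant JS_TIMEOUT_MS are hypothetical names chosen for this example. Dropping the refresh() call and using asXml() instead are assumptions too: refresh() would reload the page and likely discard whatever the AJAX response changed.

```java
import java.io.IOException;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;

public class HtmlunitScraperWithWait {
    // Timeout from the question (10 seconds); hypothetical constant name.
    static final long JS_TIMEOUT_MS = 10_000;

    public static String scrapeAfterClick(String url) throws IOException {
        // Client cleanup is omitted to keep the sketch short.
        final WebClient webClient = new WebClient();
        final HtmlPage page = webClient.getPage(url);
        final HtmlForm form = page.getForms().get(2);
        final HtmlSubmitInput button = form.getInputByValue(">");

        // Trigger the click; the returned page is not needed here.
        button.click();

        // Wait on the same client until background JavaScript (including
        // AJAX callbacks) has finished, or until the timeout elapses.
        webClient.waitForBackgroundJavaScript(JS_TIMEOUT_MS);

        // Re-read the page from the current window in case the click replaced it.
        final HtmlPage updated =
                (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();

        // Serialize the current DOM rather than the original HTTP response body.
        return updated.asXml();
    }
}
```

If only an HtmlPage is at hand, for example inside a helper method, page.getWebClient() returns the client that loaded it, so the same wait can be issued from there.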