
I have the following code:

WebClient webClient = new WebClient();
HtmlPage page = webClient.getPage("http://www.myland.co.il/%D7%9E%D7%97%D7%A9%D7%91-%D7%94%D7%A9%D7%A7%D7%99%D7%94");

The code fails with com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException: 404 Not Found for http://www.myland.co.il/Scripts/swfobject_modified.js

I do see the HTML page I am interested in in the console output. Is there a way to suppress the exception and get an HTML page after all? The page does load correctly in a real browser.


1 Answer


Yes, you can use setThrowExceptionOnFailingStatusCode to ignore failing status codes, something like:

WebClient webClient = new WebClient();
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
HtmlPage page = webClient.getPage("http://www.myland.co.il/%D7%9E%D7%97%D7%A9%D7%91-%D7%94%D7%A9%D7%A7%D7%99%D7%94");

The default is true, which is why the 404 on the script produces the exception you're describing.

EDIT: Just in case you're running an old version, with versions of HtmlUnit earlier than 2.11, setThrowExceptionOnFailingStatusCode can be called on the WebClient itself instead of the options returned by getOptions(). In 2.11 or later, you should use getOptions() as above.
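One thing to keep in mind: once the exception is switched off, a failing status on the main page itself is also swallowed, so you may want to check the status code yourself. Below is a minimal sketch of that, assuming HtmlUnit 2.11 or later with the com.gargoylesoftware.htmlunit packages; the class name LenientFetch and the method name fetch are made up for illustration.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class LenientFetch {

    // Hypothetical helper: loads a page without throwing on 4xx/5xx responses.
    public static HtmlPage fetch(WebClient webClient, String url) throws Exception {
        // Stop HtmlUnit from throwing FailingHttpStatusCodeException,
        // both for the page itself and for resources it pulls in (scripts etc.).
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        return webClient.getPage(url);
    }

    public static void main(String[] args) throws Exception {
        WebClient webClient = new WebClient();
        HtmlPage page = fetch(webClient,
                "http://www.myland.co.il/%D7%9E%D7%97%D7%A9%D7%91-%D7%94%D7%A9%D7%A7%D7%99%D7%94");

        // The exception is suppressed, so inspect the main page's status manually.
        int status = page.getWebResponse().getStatusCode();
        if (status >= 400) {
            System.err.println("Main page returned HTTP " + status);
        } else {
            System.out.println(page.asXml());
        }
    }
}
```

The status check only covers the main page; a missing script like swfobject_modified.js is simply skipped, which is what you want here.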

Answered 2013-02-09T08:39:57.283