I have been using web-harvest for 5 months, fetching web page content with the following syntax:
<var-def name="raw">
<html-to-xml outputtype="pretty" usecdata="false">
<http url="${URL.toString()}" />
</html-to-xml>
</var-def>
This used to return the content, but recently I started getting this error:
ERROR - IO error during HTTP execution for URL: http://google.com
org.webharvest.exception.HttpException: IO error during HTTP execution for URL: http://google.com
at org.webharvest.runtime.web.HttpClientManager.execute(Unknown Source)
at org.webharvest.runtime.processors.HttpProcessor.execute(Unknown Source)
at org.webharvest.runtime.processors.BaseProcessor.run(Unknown Source)
at org.webharvest.runtime.processors.BodyProcessor.execute(Unknown Source)
at org.webharvest.runtime.processors.BaseProcessor.getBodyTextContent(Unknown Source)
at org.webharvest.runtime.processors.BaseProcessor.getBodyTextContent(Unknown Source)
at org.webharvest.runtime.processors.BaseProcessor.getBodyTextContent(Unknown Source)
at org.webharvest.runtime.processors.HtmlToXmlProcessor.execute(Unknown Source)
at org.webharvest.runtime.processors.BaseProcessor.run(Unknown Source)
at org.webharvest.runtime.processors.BodyProcessor.execute(Unknown Source)
at org.webharvest.runtime.processors.VarDefProcessor.execute(Unknown Source)
at org.webharvest.runtime.processors.BaseProcessor.run(Unknown Source)
at org.webharvest.runtime.Scraper.execute(Unknown Source)
at org.webharvest.runtime.Scraper.execute(Unknown Source)
at org.webharvest.gui.ScraperExecutionThread.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:140)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
... 15 more
I tried it on another computer and it works fine, but on my own computer I get this error.