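I am trying to write an alerting system that periodically scrapes a complaints board website to look for any complaints about my product. I am using Jsoup. Below is the code snippet that gives me the error.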
doc = Jsoup.connect(finalUrl).timeout(10 * 1000).get();
This gives me the error:
java.net.SocketException: Unexpected end of file from server
When I copy and paste the same finalUrl string into a browser, it works. I then tried a plain URLConnection:
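BufferedReader br = null;
try {
    URL a = new URL(finalUrl);
    URLConnection conn = a.openConnection();
    // open the stream and wrap it in a BufferedReader
    br = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    // read the whole body; br.toString() would only return the reader's object name
    StringBuilder html = new StringBuilder();
    String line;
    while ((line = br.readLine()) != null) {
        html.append(line).append('\n');
    }
    doc = Jsoup.parse(html.toString());
} catch (IOException e) {
    e.printStackTrace();
}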
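But it turns out the connection itself fails, so br stays null. So the question is: why does the same string open the site without any error when I paste it into a browser?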
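One difference between the browser and the code above may be the request headers: by default, URLConnection identifies itself with a "Java/..." User-Agent rather than a browser-like one. Below is a minimal sketch for testing that idea, assuming (unconfirmed) that the server treats non-browser clients differently. The class name BrowserLikeRequest, the open helper, and the User-Agent string are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class BrowserLikeRequest {
    // Hypothetical browser-like value; any current browser string could be used instead.
    static final String USER_AGENT =
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36";

    // Prepares (but does not yet send) a request with browser-like headers.
    static HttpURLConnection open(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setRequestProperty("Accept", "text/html");
        conn.setConnectTimeout(10 * 1000); // same 10 s budget as the Jsoup call
        conn.setReadTimeout(10 * 1000);
        return conn;
    }
}
```

Calling BrowserLikeRequest.open(finalUrl).getInputStream() would then send the request. With Jsoup, the comparable change would be Jsoup.connect(finalUrl).userAgent(...).timeout(10 * 1000).get(). If the error persists with these headers, the cause is presumably something else.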
The full stack trace is below:
java.net.SocketException: Unexpected end of file from server
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:774)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:771)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
at ComplaintsBoardScraper.main(ComplaintsBoardScraper.java:46)
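The trace shows the failure inside parseHTTPHeader, which suggests the server closed the connection before sending any response headers. The sketch below tries to reproduce that situation locally, as an illustration only: a throwaway server on a random local port reads each request and then closes the socket without replying. The class name EofRepro and the helper fetchFromSilentServer are hypothetical, and whether the real site behaves this way is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URL;

public class EofRepro {
    // Returns a description of what the client saw when the server never replied.
    static String fetchFromSilentServer() throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread t = new Thread(() -> {
                try {
                    while (true) { // keep accepting, since the client may retry once
                        try (Socket s = server.accept()) {
                            BufferedReader in = new BufferedReader(
                                    new InputStreamReader(s.getInputStream()));
                            String line;
                            // consume the request headers, then close without answering
                            while ((line = in.readLine()) != null && !line.isEmpty()) { }
                        }
                    }
                } catch (IOException serverClosed) {
                    // the ServerSocket was closed; stop the loop
                }
            });
            t.setDaemon(true);
            t.start();

            URL url = new URL("http://127.0.0.1:" + server.getLocalPort() + "/");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            try {
                conn.getInputStream();
                return "no error";
            } catch (IOException e) {
                return e.getClass().getName() + ": " + e.getMessage();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetchFromSilentServer());
    }
}
```

If the printed exception matches the one in the trace above, that would support the idea that the remote server is dropping the request instead of answering it, rather than the URL string itself being wrong.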