I'm trying to rip a site's HTML page source to harvest email addresses. When I run the ripper/dumper (or whatever you want to call it), it grabs the source but stops at line 160. Yet if I manually go to the web page > right-click > View Page Source, I can see the whole thing and parse the text; the full source is only a little over 200 lines. The problem with visiting every page manually and right-clicking is that there are over 100k pages, which would take a while.
Here is the code I'm using to fetch the page source:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public static void main(String[] args) throws IOException {
    URL url = new URL("http://www.runelocus.com/forums/member.php?102786-wapetdxzdk&tab=aboutme#aboutme");
    URLConnection connection = url.openConnection();
    connection.setDoInput(true);
    // try-with-resources closes the stream even if reading fails
    try (BufferedReader input = new BufferedReader(
            new InputStreamReader(connection.getInputStream()))) {
        // Use a StringBuilder instead of += on a String, and re-append the
        // newline that readLine() strips, or the page collapses onto one line
        StringBuilder html = new StringBuilder();
        String line;
        while ((line = input.readLine()) != null) {
            html.append(line).append('\n');
        }
        System.out.println(html);
    }
}
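Since the end goal is pulling email addresses out of the fetched source, the parsing step could be sketched roughly like this. The class name, method, and regex below are my own illustration, not from the question, and the simple pattern will not cover every address allowed by RFC 5322:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailExtractor {
    // Simplified email pattern; real address grammar (RFC 5322) is more permissive
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Collects every email-looking substring found in the given page source
    public static List<String> extractEmails(String html) {
        List<String> emails = new ArrayList<>();
        Matcher m = EMAIL.matcher(html);
        while (m.find()) {
            emails.add(m.group());
        }
        return emails;
    }

    public static void main(String[] args) {
        String sample = "<p>Contact: foo@example.com or "
                + "<a href=\"mailto:bar@test.org\">bar</a></p>";
        System.out.println(extractEmails(sample));
    }
}
```

You would feed the `html` string built by the fetch loop above into `extractEmails` instead of printing it.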