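I'm writing an application that takes a page's HTML and extracts certain elements from it (tables, for example), returning the HTML for just those elements. I'm trying to use the Mozilla parser in Java to make navigating the page easier, but I can't figure out how to get the HTML for the elements I need.
Perhaps my whole approach, i.e. using the Mozilla parser, is wrong, so if there is a better solution I'm open to suggestions.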
String html = // whatever the HTML is
MozillaParser p = // instantiate parser
// pass in html to parse which creates a dom object
Document d = p.parse(html);
// get a list of all the form elements in the page
NodeList l = d.getElementsByTagName("form");
// iterate through all forms
for (int i = 0; i < l.getLength(); i++) {
    // get a form
    Node n = l.item(i);
    // print out the HTML for just this form.
    // This is the portion I haven't figured out.
    // I just made up the innerHTML method, but that's
    // the end result I want: a way to see
    // the HTML for a particular node
    System.out.println( n.innerHTML() );
}
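
For reference, here is a minimal sketch of the kind of result I'm after. It assumes the parser hands back a standard `org.w3c.dom.Document`, and it swaps in the JDK's own `DocumentBuilder` for the Mozilla parser so that it runs on its own (which means the sample input has to be well-formed XHTML). The class name `NodeMarkup` and the helper `outerHtml` are names made up for illustration; the serialization uses the JDK's identity `Transformer`, which may not be the best way to do this.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeMarkup {

    // Hypothetical helper: serialize one DOM node, including its subtree, to markup.
    static String outerHtml(Node node) throws TransformerException {
        // A Transformer created without a stylesheet copies its input unchanged.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Suppress the leading <?xml ...?> declaration; only the node's markup is wanted.
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.setOutputProperty(OutputKeys.INDENT, "no");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: well-formed XHTML, because the stand-in parser is an XML parser.
        String html = "<html><body>"
                + "<form action=\"/a\"><input name=\"x\"/></form>"
                + "<p>not a form</p>"
                + "</body></html>";

        // Stand-in for MozillaParser.parse(html): any parser producing a DOM Document would do.
        Document d = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(html)));

        // Print the markup of each form element, and nothing else from the page.
        NodeList l = d.getElementsByTagName("form");
        for (int i = 0; i < l.getLength(); i++) {
            System.out.println(outerHtml(l.item(i)));
        }
    }
}
```

One caveat with this sketch: the default output method is XML, so empty elements are written self-closed (for example `<input .../>`), which may or may not be acceptable for HTML output.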