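I am getting an error when parsing HTML. I think the problem comes from missing quotation marks: DJ Native Swing returns language=javascript, and the parser fails unless it is language="javascript". I have tried every version of the DJ Native Swing library.
[Fatal Error] :2:18: Open quote is expected for attribute "{1}" associated with an element type "language".
org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 18; Open quote is expected for attribute "{1}" associated with an element type "language".
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)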
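import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Parses the given markup with the standard JAXP DOM parser.
// This is a strict XML parser, so the input must be well-formed XML.
// Returns null if parsing fails.
private Document HTMLtoXML(String source)
{
    Document doc = null;
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder;
    try {
        builder = factory.newDocumentBuilder();
        InputSource src = new InputSource(new StringReader(source));
        doc = builder.parse(src);
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    } catch (SAXException e) {
        // The SAXParseException shown above is caught here
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return doc;
}

// dc and webbrowser are fields of the enclosing class (not shown)
public void StartTakip()
{
    String htmlSource = webbrowser.getHTMLContent();
    dc = HTMLtoXML(htmlSource);
}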
When I get the page source through DJ Native Swing, it looks like this:
<HTML>
<HEAD>
<SCRIPT language=javascript src="/medula/scripts/capFirstLetters.js"></SCRIPT>
<TITLE>deneme</TITLE>
</HEAD>
<BODY bgcolor=#233333>
</BODY>
</HTML>
If the source looks like the following instead, the HTML parsing works fine:
<HTML>
<HEAD>
<SCRIPT language="javascript" src="/medula/scripts/capFirstLetters.js"></SCRIPT>
<TITLE>deneme</TITLE>
</HEAD>
<BODY bgcolor="#233333">
</BODY>
</HTML>
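The difference between the two sources can be reproduced without DJ Native Swing at all. The following is only a minimal sketch, assuming the default JAXP parser that ships with the JDK; the class name QuoteRepro and the helper parsesAsXml are made up for illustration and are not part of my project. It feeds the parser one snippet with an unquoted attribute value and one with a quoted value.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, only for reproducing the quoting problem.
public class QuoteRepro {

    // Returns true if the JAXP DOM parser accepts the markup as well-formed XML,
    // false if it rejects it with a SAXException.
    static boolean parsesAsXml(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing "[Fatal Error]".
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Same attribute as in the page source, once unquoted and once quoted.
        String unquoted = "<HTML><BODY bgcolor=#233333></BODY></HTML>";
        String quoted = "<HTML><BODY bgcolor=\"#233333\"></BODY></HTML>";

        System.out.println("unquoted parses: " + parsesAsXml(unquoted));
        System.out.println("quoted parses:   " + parsesAsXml(quoted));
    }
}
```

If the quoting really is the cause, the first call should report a failure and the second a success, which matches what I see with the full page source.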