I'm using Java 6. I have an XML template that begins like this:
<?xml version="1.0" encoding="UTF-8"?>
However, I noticed that when I parse and output it with the following code (using Apache Commons IO 2.4) ...
Document doc = null;
InputStream in = this.getClass().getClassLoader().getResourceAsStream("my-template.xml");
try
{
    byte[] data = org.apache.commons.io.IOUtils.toByteArray( in );
    InputSource src = new InputSource(new StringReader(new String(data)));
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    doc = builder.parse(src);
}
finally
{
    in.close();
}
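the first line of the output is
<?xml version="1.0" encoding="UTF-16"?>
What do I need to do when parsing/outputting the file so that the header encoding stays "UTF-8"?
Edit: Following the suggestion given, I changed my code to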
Document doc = null;
InputStream in = this.getClass().getClassLoader().getResourceAsStream(name);
try
{
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    doc = builder.parse(in);
}
finally
{
    in.close();
}
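But even though the first line of my input template file is
<?xml version="1.0" encoding="UTF-8"?>
when I output the document as a string, it produces
<?xml version="1.0" encoding="UTF-16"?>
as the first line. This is the method I use to output the "doc" object as a string ...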
private String getDocumentString(Document doc)
{
    DOMImplementationLS domImplementation = (DOMImplementationLS)doc.getImplementation();
    LSSerializer lsSerializer = domImplementation.createLSSerializer();
    return lsSerializer.writeToString(doc);
}
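
For comparison, here is a minimal sketch of one variation on the method above that may be worth trying. It rests on an assumption not confirmed in this post: that writeToString reports UTF-16 because its result is a Java String, and that writing through an LSOutput with an explicit encoding changes the declaration. The class name Utf8DocumentWriter and the sample XML in main are hypothetical and only there for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical class name, used only for this sketch.
public class Utf8DocumentWriter
{
    // Serializes the document into a byte stream with an explicitly requested
    // encoding, then decodes those bytes back into a String.
    static String getDocumentString(Document doc) throws UnsupportedEncodingException
    {
        DOMImplementationLS domImplementation = (DOMImplementationLS) doc.getImplementation();
        LSSerializer lsSerializer = domImplementation.createLSSerializer();

        // Assumption: the encoding set on the LSOutput is what ends up in the XML declaration.
        LSOutput lsOutput = domImplementation.createLSOutput();
        lsOutput.setEncoding("UTF-8");

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        lsOutput.setByteStream(bytes);
        lsSerializer.write(doc, lsOutput);

        return bytes.toString("UTF-8");
    }

    public static void main(String[] args) throws Exception
    {
        // Illustrative stand-in for the real template file.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root><item>caf\u00e9</item></root>";

        // Parse from bytes so the parser reads the encoding from the declaration itself.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));

        System.out.println(getDocumentString(doc));
    }
}
```

If the String is later written to a file or socket, the bytes would also need to be written as UTF-8 for the declaration to remain accurate.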