12

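My problem is that my DOM parser fails to load the file when the XML contains UTF-8 characters. I know I have to tell the parser to read the file as UTF-8, but I don't know how to do that. My code is: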

File xmlFile = new File(fileName);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();

I know there is a setEncoding() method, but I don't know where to put it in my code...

4

3 Answers

31

Try this. It worked for me:

        InputStream inputStream= new FileInputStream(completeFileName);
        Reader reader = new InputStreamReader(inputStream,"UTF-8");
        InputSource is = new InputSource(reader);
        is.setEncoding("UTF-8");

        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(is);
answered 2014-10-09T13:55:23.510
7

Try using a Reader and pass the encoding as an argument:

InputStream inputStream = new FileInputStream(fileName);
documentBuilder.parse(new InputSource(new InputStreamReader(inputStream, "UTF-8")));
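
The two lines above leave `documentBuilder` undeclared and never close the stream. A fuller sketch of the same idea might look like the following; the class name `Utf8XmlLoader` and the method `load` are hypothetical names chosen for illustration, and the sketch assumes the file really is encoded as UTF-8.

```java
import java.io.File;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Utf8XmlLoader {

    // Hypothetical helper: parses the given file, decoding its bytes as UTF-8.
    public static Document load(File xmlFile) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // try-with-resources closes the reader and the underlying stream after parsing.
        try (InputStream in = Files.newInputStream(xmlFile.toPath());
             Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            // The Reader already yields decoded characters, so the parser
            // does not have to guess the byte encoding.
            Document doc = builder.parse(new InputSource(reader));
            doc.getDocumentElement().normalize();
            return doc;
        }
    }
}
```

Calling `Utf8XmlLoader.load(new File(fileName))` would then take the place of the `dBuilder.parse(xmlFile)` call in the question.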
answered 2013-05-06T14:35:36.137
-3

I took what Eugene did there and changed it slightly.

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();

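// parse(InputStream, String) treats its second argument as a system ID, not an encoding,
// so the encoding is declared on an InputSource instead.
InputSource source = new InputSource(new FileInputStream(new File("XML.xml")));
source.setEncoding("UTF-8");
Document doc = dBuilder.parse(source);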

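Although this will be read as UTF-8, printing it to the Eclipse console will not show any UTF-8 characters unless the Java file is saved as UTF-8. At least, that is what happened to me.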
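
If the document parses correctly but the printed text still looks wrong, one thing that may be worth trying is writing the output through a stream that encodes as UTF-8 explicitly. This is only a sketch: `Utf8Console` and `utf8Stream` are hypothetical names, and the console itself must also be set up to interpret its input as UTF-8 for the characters to display.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {

    // Hypothetical helper: wraps a target stream so that printed text is encoded as UTF-8.
    public static PrintStream utf8Stream(OutputStream target) throws UnsupportedEncodingException {
        // The second argument enables auto-flush; the third names the charset.
        return new PrintStream(target, true, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        PrintStream out = utf8Stream(System.out);
        // Unicode escapes keep this source file independent of its own encoding.
        out.println("h\u00e9llo");
    }
}
```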

answered 2014-07-18T00:13:50.400