xml - Parsing a UTF-8 XML file with XmlSlurper

Question

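I am trying to parse a Google Atom entry with XmlSlurper. My use case is as follows:

1) Send the Atom XML to the server with a REST client.

2) Handle the request and parse it on the server side.

I am developing the server in Groovy and using XmlSlurper as the parser, but parsing fails with a "Content is not allowed in prolog" exception. To find out why, I saved the Atom XML to a file encoded as UTF-8, then read the file and parsed it, and got the same exception. When I saved the Atom XML to a file encoded as ANSI instead, it parsed successfully. So I think the problem lies between XmlSlurper and UTF-8.

Do you have any idea about this limitation? My Atom XML must be UTF-8, so how can I parse it? Thanks for your help.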

XML:

<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns:atom='http://www.w3.org/2005/Atom'
    xmlns:gd='http://schemas.google.com/g/2005'>
  <category scheme='http://schemas.google.com/g/2005#kind'
    term='http://schemas.google.com/contact/2008#contact' />
  <title type='text'>Elizabeth Bennet</title>
  <content type='text'>Notes</content>
  <gd:email rel='http://schemas.google.com/g/2005#work'
    address='liz@gmail.com' />
  <gd:email rel='http://schemas.google.com/g/2005#home'
    address='liz@example.org' />
  <gd:phoneNumber rel='http://schemas.google.com/g/2005#work'
    primary='true'>
    (206)555-1212
  </gd:phoneNumber>
  <gd:phoneNumber rel='http://schemas.google.com/g/2005#home'>
    (206)555-1213
  </gd:phoneNumber>
  <gd:im address='liz@gmail.com'
    protocol='http://schemas.google.com/g/2005#GOOGLE_TALK'
    rel='http://schemas.google.com/g/2005#home' />
  <gd:postalAddress rel='http://schemas.google.com/g/2005#work'
    primary='true'>
    1600 Amphitheatre Pkwy Mountain View
  </gd:postalAddress>
</entry>

Reading the file and parsing it:

 String file = "C:\\Documents and Settings\\user\\Desktop\\create.xml";
 String line = "";
 StringBuilder sb = new StringBuilder();
 BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
 while ((line = br.readLine()) !=null) {
     sb.append(line);
 }
 System.out.println("sb.toString() = " + sb.toString());

 def xmlf = new XmlSlurper().parseText(sb.toString())
    .declareNamespace(gContact:'http://schemas.google.com/contact/2008',
        gd:'http://schemas.google.com/g/2005')

   println xmlf.title

score 3 · Accepted Answer

Try:

String file = "C:\\Documents and Settings\\user\\Desktop\\create.xml"

def xmlf = new XmlSlurper().parse( new File( file ) ).declareNamespace( 
        gContact:'http://schemas.google.com/contact/2008',
        gd:'http://schemas.google.com/g/2005' )
println xmlf.title

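You're taking the long way round: XmlSlurper can read the file directly, so there is no need to load it into a string first.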

score 1 · Accepted Answer

This is the problem:

BufferedReader br = new BufferedReader(
    new InputStreamReader(new FileInputStream(file)));
while ((line = br.readLine()) !=null) {
    sb.append(line);
}

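That reads the file using the platform default encoding. If that encoding is not the one the file was written in, the data will be decoded incorrectly.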
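To make the failure mode concrete, here is a small Java sketch of one plausible cause. It assumes the UTF-8 file starts with a byte order mark (many Windows editors write one; the question does not confirm this), and `BomDecodingDemo`, `withBom`, and `decode` are hypothetical names used only for illustration. Decoding those bytes into a string leaves characters in front of `<?xml`, which is the kind of input that triggers "Content is not allowed in prolog" when the string is parsed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomDecodingDemo {
    // UTF-8 byte order mark (assumption: the saved file begins with one).
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Builds the bytes an editor would write: BOM followed by the UTF-8 text.
    static byte[] withBom(String xml) {
        byte[] body = xml.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[UTF8_BOM.length + body.length];
        System.arraycopy(UTF8_BOM, 0, out, 0, UTF8_BOM.length);
        System.arraycopy(body, 0, out, UTF8_BOM.length, body.length);
        return out;
    }

    // Decodes bytes to a String, as InputStreamReader does with a given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] bytes = withBom("<?xml version=\"1.0\" encoding=\"UTF-8\"?><entry/>");

        // A single-byte charset stands in for a typical Windows default encoding.
        String wrong = decode(bytes, StandardCharsets.ISO_8859_1);
        String right = decode(bytes, StandardCharsets.UTF_8);

        // Three stray characters precede the XML declaration.
        System.out.println(wrong.indexOf("<?xml")); // 3
        // Even with the correct charset, Java keeps the BOM as U+FEFF.
        System.out.println(right.indexOf("<?xml")); // 1
    }
}
```

Either way the string no longer starts with the XML declaration, so specifying the charset on the reader would not be enough on its own if a BOM is present.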

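What you should do is let the XML parser handle it for you. It should be able to detect the encoding itself from the first line of the data.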

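I'm not familiar with XmlSlurper, but I'd expect it to be able to parse an input stream (in which case just give it the FileInputStream) or to accept the name of the file itself.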
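As a rough illustration of handing raw bytes to the parser, the Java sketch below uses the JDK's built-in DOM parser as a stand-in for XmlSlurper. The class `StreamParseDemo`, its methods, and the cut-down sample document are hypothetical, and the sample deliberately starts with a UTF-8 byte order mark, which is an assumption about the original file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class StreamParseDemo {
    // Hypothetical sample: a UTF-8 BOM followed by a shortened version of the entry.
    static byte[] sampleBytes() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<entry><title type='text'>Elizabeth Bennet</title></entry>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0xEF);
        out.write(0xBB);
        out.write(0xBF);
        byte[] body = xml.getBytes(StandardCharsets.UTF_8);
        out.write(body, 0, body.length);
        return out.toByteArray();
    }

    // Parses the byte stream directly, so the parser works out the encoding itself.
    static String titleOf(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in);
            return doc.getElementsByTagName("title").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titleOf(new ByteArrayInputStream(sampleBytes())));
        // prints "Elizabeth Bennet"
    }
}
```

With a real file, a `FileInputStream` would take the place of the `ByteArrayInputStream`. In Groovy, `new XmlSlurper().parse(new File(file))` from the other answer follows the same idea of passing the file rather than a pre-decoded string.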
