I am using XMLStreamReader to parse the following XML:
<root>
  <element>
    <attribute>level0</attribute>
    <element>
      <attribute>level1</attribute>
      <element>
        <attribute>level2</attribute>
      </element>
    </element>
  </element>
</root>
I construct my XMLStreamReader like this:
XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(
        new ByteArrayInputStream(document.getBytes()));
Unfortunately, when I reach the first closing element tag with reader.next(), I get the following exception:
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[7,14]
Message: XML document structures must start and end within the same entity.
Is there a way to override the default behavior of XMLStreamReader to work around this?
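One way to narrow this down is to run the same document through XMLStreamReader outside the job. The sketch below is only an assumption-laden check, not a diagnosis: the class name NestedElementsCheck, the helper countStartElements, and the TRUNCATED string (the document cut off right after the first </element>) are all made up for illustration. It counts START_ELEMENT events for the full document, then feeds the parser the cut-off text to see whether that fails.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class NestedElementsCheck {

    // The complete document from the question.
    static final String FULL =
            "<root>\n"
          + "<element>\n"
          + "<attribute>level0</attribute>\n"
          + "<element>\n"
          + "<attribute>level1</attribute>\n"
          + "<element>\n"
          + "<attribute>level2</attribute>\n"
          + "</element>\n"
          + "</element>\n"
          + "</element>\n"
          + "</root>\n";

    // Hypothetical truncated input: everything up to and including the
    // first </element>. This is an assumption about what a partial
    // document might look like, not something taken from the question.
    static final String TRUNCATED =
            FULL.substring(0, FULL.indexOf("</element>") + "</element>".length());

    // Returns the number of START_ELEMENT events, or -1 if parsing fails.
    static int countStartElements(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            System.out.println("parse failed: " + e.getMessage());
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("full document: " + countStartElements(FULL));
        System.out.println("truncated document: " + countStartElements(TRUNCATED));
    }
}
```

If the full document parses and only the cut-off text fails, the nesting of same-named elements is probably not the issue, and it would be worth checking exactly what string reaches the parser.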
Edit
Here is the code I am using:
@Override
protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, Text>.Context context)
        throws IOException, InterruptedException {
    String document = value.toString();
    System.out.println("'" + document + "'");
    try {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(
                new ByteArrayInputStream(document.getBytes()));
        String propertyName = "";
        String propertyValue = "";
        String currentElement = "";
        while (reader.hasNext()) {
            int code = reader.next();
            switch (code) {
                case START_ELEMENT:
                    currentElement = reader.getLocalName();
                    break;
                case CHARACTERS:
                    if (currentElement.equalsIgnoreCase("element")) {
                        propertyName += reader.getText();
                    } else if (currentElement.equalsIgnoreCase("attribute")) {
                        propertyValue += reader.getText();
                    }
                    break;
            }
        }
        reader.close();
        context.write(new Text(propertyName.trim()), new Text(propertyValue.trim()));
    } catch (Exception e) {
        e.printStackTrace();
    }
}
```
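For comparison, here is a standalone sketch of the same loop with the open elements kept on a stack, so text is attributed to the innermost element and the nesting level of each <attribute> is known. It assumes the parser receives the complete document. The class name NestedAttributeReader, the method readAttributes, and the "depth:value" output format are made up for illustration and are not part of the job above.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class NestedAttributeReader {

    // The complete document from the question.
    static final String FULL =
            "<root><element><attribute>level0</attribute>"
          + "<element><attribute>level1</attribute>"
          + "<element><attribute>level2</attribute></element>"
          + "</element></element></root>";

    // Returns one "depth:value" entry per <attribute>, where depth is the
    // number of enclosing <element> tags minus one.
    static List<String> readAttributes(String xml) {
        List<String> result = new ArrayList<>();
        Deque<String> path = new ArrayDeque<>();   // currently open elements
        StringBuilder text = new StringBuilder();  // text of the innermost element
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT: {
                        path.push(reader.getLocalName());
                        text.setLength(0);
                        break;
                    }
                    case XMLStreamConstants.CHARACTERS: {
                        // Only collect text directly inside <attribute>.
                        if ("attribute".equals(path.peek())) {
                            text.append(reader.getText());
                        }
                        break;
                    }
                    case XMLStreamConstants.END_ELEMENT: {
                        String closed = path.pop();
                        if ("attribute".equals(closed)) {
                            int depth = 0;
                            for (String name : path) {
                                if ("element".equals(name)) {
                                    depth++;
                                }
                            }
                            result.add((depth - 1) + ":" + text.toString().trim());
                        }
                        break;
                    }
                    default:
                        break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Wrapped so callers do not need to handle a checked exception.
            throw new IllegalStateException("could not parse document", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [0:level0, 1:level1, 2:level2]
        System.out.println(readAttributes(FULL));
    }
}
```

Popping on END_ELEMENT means whitespace after a closing tag is no longer appended to the previous element's value, which the currentElement variable in the code above does not guard against.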