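I am trying to use JAXB to unmarshal an XML file into objects, but I have run into some difficulties. The real project has a few thousand lines in the XML file, so I have reproduced the error on a smaller scale as follows:

The XML file: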

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<catalogue title="some catalogue title" 
           publisher="some publishing house" 
           xmlns="x-schema:TamsDataSchema.xml"/>

The XSD file used to generate the JAXB classes:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <xsd:element name="catalogue" type="catalogueType"/>

 <xsd:complexType name="catalogueType">
  <xsd:sequence>
   <xsd:element ref="journal"  minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
  <xsd:attribute name="title" type="xsd:string"/>
  <xsd:attribute name="publisher" type="xsd:string"/>
 </xsd:complexType>
</xsd:schema>

Code snippet 1:

final JAXBContext context = JAXBContext.newInstance(CatalogueType.class);
final Unmarshaller um = context.createUnmarshaller();
CatalogueType ct = (CatalogueType)um.unmarshal(new File("file output address"));

Which throws the error:

javax.xml.bind.UnmarshalException: unexpected element (uri:"x-schema:TamsDataSchema.xml", local:"catalogue"). Expected elements are <{}catalogue>
 at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(UnmarshallingContext.java:642)
 at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:247)
 at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:242)
 at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Loader.java:116)
 at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext$DefaultRootLoader.childElement(UnmarshallingContext.java:1049)
 at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(UnmarshallingContext.java:478)
 at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(UnmarshallingContext.java:459)
 at com.sun.xml.bind.v2.runtime.unmarshaller.SAXConnector.startElement(SAXConnector.java:148)
 at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
 at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown Source)
 at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
 at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source)
 at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
 at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
 at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
 at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
 at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    ...etc

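So the namespace in the XML document is causing the problem. If it is removed, everything works, but unfortunately the file is supplied by the client, so we are stuck with it. I have tried numerous ways of specifying the namespace in the XSD, but none of the permutations seem to work.

I also tried unmarshalling while ignoring the namespace, using the following code: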
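
The exception message names the element the parser saw, (uri:"x-schema:TamsDataSchema.xml", local:"catalogue"), and the one the generated classes expect, {}catalogue. A minimal sketch for confirming that mismatch outside JAXB, using only SAX: the class name RootNameCheck and the inlined copy of the sample document are illustrative assumptions, not part of the project.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class RootNameCheck {

    // The sample document from the question, inlined as a string.
    static final String XML =
        "<catalogue title=\"some catalogue title\" "
        + "publisher=\"some publishing house\" "
        + "xmlns=\"x-schema:TamsDataSchema.xml\"/>";

    // Returns the qualified name of the root element as a namespace-aware parser reports it.
    static QName rootName(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        final QName[] root = new QName[1];
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = new QName(uri, localName); // remember only the first (root) element
                }
            }
        });
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        QName actual = rootName(XML);
        // A schema without a targetNamespace declares the element in no namespace.
        QName expected = new QName("catalogue");
        System.out.println("actual:   " + actual);
        System.out.println("expected: " + expected);
        System.out.println("equal:    " + actual.equals(expected));
    }
}
```

The two names share a local part but differ in namespace URI, so they do not compare equal.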

Unmarshaller um = context.createUnmarshaller();
final SAXParserFactory sax = SAXParserFactory.newInstance();
sax.setNamespaceAware(false);
final XMLReader reader = sax.newSAXParser().getXMLReader();
final Source er = new SAXSource(reader, new InputSource(new FileReader("file location")));
CatalogueType ct = (CatalogueType)um.unmarshal(er);
System.out.println(ct.getPublisher());
System.out.println(ct.getTitle());

This runs without error, but it fails to unmarshal the element's attributes and prints:

null
null

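For reasons beyond our control we are limited to Java 1.5, and we are using JAXB 2.0. That is unfortunate, because the second code block works as required on Java 1.6.

Any suggestions would be greatly appreciated. The alternative is to strip the namespace declaration out of the file before parsing it, which seems inelegant.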

4 Answers

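Thank you for this post and your code snippet. It definitely put me on the right path, as I was also going nuts trying to deal with some vendor-provided XML that had xmlns="http://vendor.com/foo" all over the place.

My first solution (before I read your post) was to take the XML as a String and then call xmlString.replaceAll(" xmlns=", " ylmns="); (the horror, the horror). Besides offending my sensibilities, it was a pain when processing XML from an InputStream.

My second solution, after looking at your code snippet (I am using Java 7):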

// given an InputStream inputStream:
String packageName = docClass.getPackage().getName();
JAXBContext jc = JAXBContext.newInstance(packageName);
Unmarshaller u = jc.createUnmarshaller();

InputSource is = new InputSource(inputStream);
final SAXParserFactory sax = SAXParserFactory.newInstance();
sax.setNamespaceAware(false);
final XMLReader reader;
try {
    reader = sax.newSAXParser().getXMLReader();
} catch (SAXException | ParserConfigurationException e) {
    throw new RuntimeException(e);
}
SAXSource source = new SAXSource(reader, is);
@SuppressWarnings("unchecked")
JAXBElement<T> doc = (JAXBElement<T>)u.unmarshal(source);
return doc.getValue();

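But now I have found a third solution that I like much better, and hopefully it will be useful to others: define the expected namespace correctly in the schema: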

<xsd:schema jxb:version="2.0"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
  xmlns="http://vendor.com/foo"
  targetNamespace="http://vendor.com/foo"
  elementFormDefault="unqualified"
  attributeFormDefault="unqualified">

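With this, we can remove the sax.setNamespaceAware(false); line. (Update: actually, if we keep the unmarshal(SAXSource) call, then we need sax.setNamespaceAware(true). But the simpler way is to not bother with SAXSource and the code that creates it, and instead call unmarshal(InputStream), which is namespace-aware by default.) The output of marshal() has the proper namespace too.

Yeesh. It only took about 4 hours.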
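
As a rough check that a targetNamespace is what makes the document line up with the schema, the sketch below validates the question's sample XML with javax.xml.validation, once against a cut-down schema without a target namespace and once with one. The class name TargetNamespaceCheck, the helper methods, and the trimmed schema (no journal element) are assumptions made for illustration; JAXB itself is not involved here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
class TargetNamespaceCheck {

    static final String NS = "x-schema:TamsDataSchema.xml";

    // The sample document from the question, inlined as a string.
    static final String XML =
        "<catalogue title=\"some catalogue title\" "
        + "publisher=\"some publishing house\" "
        + "xmlns=\"" + NS + "\"/>";

    // Builds a cut-down catalogue schema; pass null for a schema without a target namespace.
    static String schemaFor(String targetNamespace) {
        String nsAttrs = targetNamespace == null
            ? ""
            : " xmlns=\"" + targetNamespace + "\" targetNamespace=\"" + targetNamespace + "\"";
        return "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"" + nsAttrs + ">"
            + "<xsd:element name=\"catalogue\" type=\"catalogueType\"/>"
            + "<xsd:complexType name=\"catalogueType\">"
            + "<xsd:attribute name=\"title\" type=\"xsd:string\"/>"
            + "<xsd:attribute name=\"publisher\" type=\"xsd:string\"/>"
            + "</xsd:complexType>"
            + "</xsd:schema>";
    }

    // Compiles the schema, then reports whether the document validates against it.
    static boolean isValid(String xml, String xsd) throws SAXException, IOException {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(xsd)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error, e.g. no declaration found for the root element
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("no targetNamespace:   " + isValid(XML, schemaFor(null)));
        System.out.println("with targetNamespace: " + isValid(XML, schemaFor(NS)));
    }
}
```

Classes generated by xjc from a schema with a targetNamespace carry that namespace in their annotations, which is why the namespace-aware unmarshal(InputStream) call then matches the vendor's root element.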

answered 2011-10-07T23:08:38.450

How to ignore the namespaces

You can use an XMLStreamReader that is not namespace-aware. It effectively strips all namespace information from the XML file that you are parsing:

// configure the stream reader factory
XMLInputFactory xif = XMLInputFactory.newFactory();
xif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); // this is the magic line

// create xml stream reader using our configured factory
StreamSource source = new StreamSource(someFile);
XMLStreamReader xsr = xif.createXMLStreamReader(source);

// unmarshal; note that it's better to reuse the JAXBContext, as newInstance()
// calls are pretty expensive
JAXBContext jc = JAXBContext.newInstance(your.ObjectFactory.class);
Unmarshaller unmarshaller = jc.createUnmarshaller();
Object unmarshal = unmarshaller.unmarshal(xsr);

Now the XML that actually gets fed into JAXB doesn't carry any namespace info.
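
To see what the IS_NAMESPACE_AWARE switch changes without involving JAXB, here is a small sketch that reads only the root element of the question's sample document with the JDK's StAX implementation. The class name StaxNamespaceDemo and the describeRoot helper are made up for this illustration, and the exact values reported in non-namespace-aware mode may differ between StAX implementations.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only.
class StaxNamespaceDemo {

    // The sample document from the question, inlined as a string.
    static final String XML =
        "<catalogue title=\"some catalogue title\" "
        + "publisher=\"some publishing house\" "
        + "xmlns=\"x-schema:TamsDataSchema.xml\"/>";

    // Describes the root element as "{namespace}localName", similar to JAXB's error messages.
    static String describeRoot(String xml, boolean namespaceAware) throws XMLStreamException {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        xif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.valueOf(namespaceAware));
        XMLStreamReader xsr = xif.createXMLStreamReader(new StringReader(xml));
        try {
            while (xsr.next() != XMLStreamConstants.START_ELEMENT) {
                // skip anything that precedes the root element
            }
            String uri = xsr.getNamespaceURI();
            return "{" + (uri == null ? "" : uri) + "}" + xsr.getLocalName();
        } finally {
            xsr.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println("namespace-aware:     " + describeRoot(XML, true));
        System.out.println("non-namespace-aware: " + describeRoot(XML, false));
    }
}
```

With namespace awareness switched off, the root element reaches the consumer with no namespace URI, which is the shape that classes with an empty namespace expect.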


Important note (xjc)

If you generated the Java classes from an XSD schema using xjc and the schema had a namespace defined, then the generated annotations will carry that namespace, so you need to remove it manually. Otherwise JAXB won't recognize the namespace-free data.

Places where the annotations should be changed:

  • ObjectFactory.java

     // change this line
     private final static QName _SomeType_QNAME = new QName("some-weird-namespace", "SomeType");
     // to something like
     private final static QName _SomeType_QNAME = new QName("", "SomeType", "");
    
     // and this annotation
     @XmlElementDecl(namespace = "some-weird-namespace", name = "SomeType")
     // to this
     @XmlElementDecl(namespace = "", name = "SomeType")
    
  • package-info.java

     // change this annotation
     @javax.xml.bind.annotation.XmlSchema(namespace = "some-weird-namespace", elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED)
     // to something like this
     @javax.xml.bind.annotation.XmlSchema(namespace = "", elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED)
    

Now your JAXB code will expect to see everything without any namespaces and the XMLStreamReader that we created supplies just that.

answered 2016-04-13T20:14:45.773

Here is my solution to this namespace-related problem. We can trick JAXB by implementing our own XMLFilter and Attributes.

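import java.io.InputStream;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.UnmarshallerHandler;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Reports each attribute's qualified name as its local name, so that
// attributes can still be matched when the parser is not namespace-aware.
class MyAttr extends AttributesImpl {

    MyAttr(Attributes atts) {
        super(atts);
    }

    @Override
    public String getLocalName(int index) {
        return super.getQName(index);
    }

}

// Wraps the attributes of every start tag in MyAttr before passing them on.
class MyFilter extends XMLFilterImpl {

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
        super.startElement(uri, localName, qName, new MyAttr(atts));
    }

}

// logger and ExceptionUtils are not defined in this snippet; they are expected
// to come from the surrounding class and its logging/utility libraries.
public SomeObject testFromXML(InputStream input) {

    try {
        // Create the JAXBContext
        JAXBContext jc = JAXBContext.newInstance(SomeObject.class);

        // Create the XMLFilter
        XMLFilter filter = new MyFilter();

        // Set the parent XMLReader on the XMLFilter.
        // SAXParserFactory is not namespace-aware by default, so the
        // explicit call below is left commented out.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        //spf.setNamespaceAware(false);

        SAXParser sp = spf.newSAXParser();
        XMLReader xr = sp.getXMLReader();
        filter.setParent(xr);

        // Set UnmarshallerHandler as ContentHandler on XMLFilter
        Unmarshaller unmarshaller = jc.createUnmarshaller();
        UnmarshallerHandler unmarshallerHandler = unmarshaller
                .getUnmarshallerHandler();
        filter.setContentHandler(unmarshallerHandler);

        // Parse the XML
        InputSource is = new InputSource(input);
        filter.parse(is);
        return (SomeObject) unmarshallerHandler.getResult();

    } catch (Exception e) {
        logger.debug(ExceptionUtils.getFullStackTrace(e));
    }

    // Reached only if parsing or unmarshalling failed
    return null;
}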
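
To check what the filter actually changes, the sketch below runs the question's sample document through a non-namespace-aware SAX parser with and without a filter like the one above, and records the attribute local names that a downstream handler (JAXB, in the real code) would receive. The wrapper class FilterDemo and the recording handler are assumptions made for illustration; what the unfiltered parser reports as local names depends on the parser implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical wrapper class, for illustration only.
class FilterDemo {

    // The sample document from the question, inlined as a string.
    static final String XML =
        "<catalogue title=\"some catalogue title\" "
        + "publisher=\"some publishing house\" "
        + "xmlns=\"x-schema:TamsDataSchema.xml\"/>";

    // Same idea as MyAttr above: answer local-name lookups with the qualified name.
    static class MyAttr extends AttributesImpl {
        MyAttr(Attributes atts) {
            super(atts);
        }

        @Override
        public String getLocalName(int index) {
            return super.getQName(index);
        }
    }

    // Same idea as MyFilter above: wrap the attributes of every start tag.
    static class MyFilter extends XMLFilterImpl {
        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            super.startElement(uri, localName, qName, new MyAttr(atts));
        }
    }

    // Collects the attribute local names seen by the content handler.
    static List<String> attributeLocalNames(String xml, boolean useFilter) throws Exception {
        final List<String> names = new ArrayList<String>();
        DefaultHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    names.add(atts.getLocalName(i));
                }
            }
        };
        // SAXParserFactory is not namespace-aware unless asked to be.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        if (useFilter) {
            MyFilter filter = new MyFilter();
            filter.setParent(reader);
            reader = filter;
        }
        reader.setContentHandler(recorder);
        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without filter: " + attributeLocalNames(XML, false));
        System.out.println("with filter:    " + attributeLocalNames(XML, true));
    }
}
```

With the filter in place, the handler always sees title and publisher as local names, which is what lets the attribute values be bound instead of coming back null.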

answered 2014-07-23T03:56:10.317

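A workaround for this problem is explained in this post: "JAXB: How to ignore namespace during unmarshalling XML document?". It explains how to add or remove xmlns entries from the XML on the fly using a SAX filter. It handles marshalling and unmarshalling in the same way.

answered 2010-01-27T19:28:06.310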