
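I am writing a RESTful web service in Java. The idea is to "cut down" an XML document by removing everything we do not need (~98%) and keeping only the tags we are interested in, while preserving the document's structure, which looks like this (I cannot provide the actual XML content for confidentiality reasons):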

    <sear:SEGMENTS xmlns="http://www.exlibrisgroup.com/xsd/primo/primo_nm_bib" xmlns:sear="http://www.exlibrisgroup.com/xsd/jaguar/search">
       <sear:JAGROOT>
          <sear:RESULT>
             <sear:DOCSET IS_LOCAL="true" TOTAL_TIME="176" LASTHIT="9" FIRSTHIT="0" TOTALHITS="262" HIT_TIME="11">
                <sear:DOC SEARCH_ENGINE_TYPE="Local Search Engine" SEARCH_ENGINE="Local Search Engine" NO="1" RANK="0.086826384" ID="2347460">
                   [
                   <PrimoNMBib>
                      <record>
                         <display>
                            <title></title>
                         </display>
                         <sort>
                            <author></author>
                         </sort>
                      </record>
                   </PrimoNMBib>
                   ]
                </sear:DOC>
             </sear:DOCSET>
          </sear:RESULT>
       </sear:JAGROOT>
    </sear:SEGMENTS>

Of course, this is only the structure of the tags we are interested in; there are hundreds of other tags, but they are irrelevant.

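The square brackets ([]) are not part of the XML; they indicate that the enclosed element is an item in a list of children and occurs multiple times, once for each match returned by the RESTful service's search.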

With that said, my Java code, which includes the XSLT stylesheet, is as follows:

    import java.io.StringReader;
    import java.io.StringWriter;

    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.TransformerFactoryConfigurationError;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public String cutXML() throws TransformerFactoryConfigurationError, TransformerException
    {

       String xmlSourceResource = this.xml; // where this.xml is the full XML string of structure as presented above

       String xsltResource =
       "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" xmlns:sear=\"http://www.exlibrisgroup.com/xsd/jaguar/search\">" +

       "    <xsl:output method=\"xml\" version=\"1.0\" omit-xml-declaration=\"no\" encoding=\"UTF-8\" indent=\"yes\"/>" +
       "    <xsl:strip-space elements=\"*\"/>" +

       "    <sear:WhiteList>" +
       "        <name>title</name>" +
       "        <name>author</name>" +                
       "    </sear:WhiteList>" +

       "    <xsl:template match=\"node()|@*\">" +
       "        <xsl:copy>" +
       "            <xsl:apply-templates select=\"node()|@*\"/>" +
       "        </xsl:copy>" +
       "    </xsl:template>" +

       "    <xsl:template match=\"*[not(descendant-or-self::*[name()=document('')/*/sear:WhiteList/*])]\"/>" +

       "</xsl:stylesheet>";

       StringWriter xmlResultResource = new StringWriter(); // where the transformed/stripped-down XML will be written

       Transformer xmlTransformer = TransformerFactory.newInstance().newTransformer(new StreamSource(new StringReader(xsltResource))); // create transformer object with XSLT given

       xmlTransformer.transform(new StreamSource(new StringReader(xmlSourceResource)), new StreamResult(xmlResultResource)); // transform XML with transformer and write into result StringWriter

       return xmlResultResource.getBuffer().toString(); // return transformed XML string

    }

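Unfortunately, when I run this on the server, all I get is a blank page with empty source, as if the result of the transformation were an empty string.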

At first, the server's log file showed the following:

    [#|2012-04-26T18:26:24.967+0000|INFO|glassfish3.1.2|com.sun.jersey.api.core.PackagesResourceConfig|_ThreadID=23;_ThreadName=Thread-2;|Scanning for root resource and provider classes in the packages: dk.kb.mobileservice|#]

    [#|2012-04-26T18:26:24.969+0000|INFO|glassfish3.1.2|com.sun.jersey.api.core.ScanningResourceConfig|_ThreadID=23;_ThreadName=Thread-2;|Root resource classes found: class dk.kb.mobileservice.Middle|#]

    [#|2012-04-26T18:26:24.970+0000|INFO|glassfish3.1.2|com.sun.jersey.api.core.ScanningResourceConfig|_ThreadID=23;_ThreadName=Thread-2;|No provider classes found.|#]

    [#|2012-04-26T18:26:24.978+0000|INFO|glassfish3.1.2|com.sun.jersey.server.impl.application.WebApplicationImpl|_ThreadID=23;_ThreadName=Thread-2;|Initiating Jersey application, version 'Jersey: 1.11 12/09/2011 10:27 AM'|#]

    [#|2012-04-26T18:26:25.192+0000|INFO|glassfish3.1.2|javax.enterprise.system.container.web.com.sun.enterprise.web|_ThreadID=23;_ThreadName=Thread-2;|WEB0671: Loading application [kb2] at [/kb2]|#]

    [#|2012-04-26T18:26:25.200+0000|INFO|glassfish3.1.2|javax.enterprise.system.tools.admin.org.glassfish.deployment.admin|_ThreadID=23;_ThreadName=Thread-2;|kb2 was successfully deployed in 2,293 milliseconds.|#]

    [#|2012-04-26T18:26:46.263+0000|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=20;_ThreadName=Thread-2;|SystemId Unknown; Line #0; Column #0; java.lang.NullPointerException |#]

    [#|2012-04-26T18:31:09.772+0000|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=21;_ThreadName=Thread-2;|SystemId Unknown; Line #0; Column #0; java.lang.NullPointerException |#]

Now it reports the following problem instead:

    [#|2012-04-27T00:05:07.731+0000|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=21;_ThreadName=Thread-2;|Error on line 1 column 1 of file:/root/webglassfish3/glassfish/domains/domain1/config/: SXXP0003: Error reported by XML parser: Content is not allowed in prolog.|#]

    [#|2012-04-27T00:05:07.732+0000|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=21;_ThreadName=Thread-2;|Recoverable error on line 1 SXXP0003: org.xml.sax.SAXParseException: Content is not allowed in prolog.|#]

I have tested the XML file and transformed it in a browser, where it works fine, so I do not think the XML or the XSLT stylesheet is at fault... this looks like a Java problem.

When I run the Java code above against the full XML outside GlassFish, I get the following error:

    Exception in thread "main" java.lang.VerifyError: (class: GregorSamsa$0, method: test signature: (IIIILcom/sun/org/apache/xalan/internal/xsltc/runtime/AbstractTranslet;Lcom/sun/org/apache/xml/internal/dtm/DTMAxisIterator;)Z) Incompatible type for getting or setting field
        at GregorSamsa.applyTemplates()
        at GregorSamsa.applyTemplates()
        at GregorSamsa.transform()
        at com.sun.org.apache.xalan.internal.xsltc.runtime.AbstractTranslet.transform(AbstractTranslet.java:609)
        at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:729)
        at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:340)
        at XML2JSON.cutXML(XML2JSON.java:105)
        at XML2JSON.main(XML2JSON.java:31)

1 Answer


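`Content is not allowed in prolog.` usually means that there is content before the start of the XML. The XML parser expects to see the XML declaration, `<?xml version="1.0"?>`, or, if that is omitted, just the start of the document element (i.e. `<sear:SEGMENTS>`).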

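Print or log the contents of `this.xml` and verify that there are no leading whitespace characters or other content before the XML declaration or the document element.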
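One way to carry out that check is sketched below. This is only an illustration under stated assumptions: the class `PrologCheck` and its methods `leadingContent`, `toCodePoints`, and `parses` are hypothetical names that do not appear in the question, and the sample string with a leading byte-order mark is an invented stand-in for whatever `this.xml` really holds. The sketch extracts whatever precedes the first `<`, prints it as code points so that invisible characters become visible in a log, and asks the JDK's own parser whether it accepts the string.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic helper; not part of the question's code.
final class PrologCheck {

    /** Returns everything that precedes the first '<' (empty if the input starts with markup). */
    static String leadingContent(String xml) {
        int firstTag = xml.indexOf('<');
        return firstTag < 0 ? xml : xml.substring(0, firstTag);
    }

    /** Renders each char as a U+XXXX code so invisible characters show up in a log. */
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    /** Returns true if the JDK's XML parser accepts the string as well-formed. */
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, e.g. content in the prolog
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser setup or I/O failure
        }
    }

    public static void main(String[] args) {
        // Assumed sample input: a byte-order mark that ended up inside the String.
        String xml = "\uFEFF<doc/>";
        String junk = leadingContent(xml);
        System.out.println("Chars before first '<': " + junk.length());
        System.out.println("Code points: " + toCodePoints(junk));
        System.out.println("Parses as-is: " + parses(xml));
        System.out.println("Parses after removing prefix: " + parses(xml.substring(junk.length())));
    }
}
```

In the question's code, a call such as `PrologCheck.toCodePoints(PrologCheck.leadingContent(this.xml))` could be logged just before the transformation. If the logged prefix turns out to be non-empty, the next step would be to find where that prefix is introduced when the XML is fetched or decoded, rather than silently trimming it.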

Answered 2012-04-27T00:50:58.503