I need to pretty-print XML strings using Java APIs, and I have found multiple solutions for this both on the web and on this site. However, despite several attempts to get this to work with javax.xml.transform.Transformer, I have not succeeded so far. The code below works only partially: it indents correctly only when the XML string passed in contains no newlines between XML elements. That won't do. I need to be able to pretty-print any well-formed, valid XML, including strings that have already been pretty-printed.

This is what I have, put together from code snippets that other people reported working for them:

import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;

public class XMLFormatter {

    public static String format(String xml, int indent, boolean omitXmlDeclaration)
            throws TransformerException {

        if (indent < 0) {
            throw new IllegalArgumentException();
        }
        String ret = null;
        StringReader reader = new StringReader(xml);
        StringWriter writer = new StringWriter();
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Integer.valueOf avoids the deprecated Integer constructor.
            factory.setAttribute("indent-number", Integer.valueOf(indent));
            Transformer transformer = factory.newTransformer();
            if (omitXmlDeclaration) {
                transformer.setOutputProperty(
                        OutputKeys.OMIT_XML_DECLARATION, "yes");
            }
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(
                    "{http://xml.apache.org/xslt}indent-amount",
                    String.valueOf(indent));
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.transform(
                    new StreamSource(reader),
                    new StreamResult(writer));
            ret = writer.toString();
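        } finally {
            // TransformerException simply propagates to the caller.
            // Closing string-backed streams releases nothing, but is harmless.
            reader.close();
            try {
                writer.close();
            } catch (IOException ex) {
                // StringWriter.close() has no effect and does not fail in practice.
            }
        }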

        return ret;
    }

    public static void main(String[] args) throws TransformerException {
        StringBuilder sb = new StringBuilder();
        sb.append("<rpc-reply><data><smth/></data></rpc-reply>");

        System.out.println(sb.toString());
        System.out.println();
        System.out.println(XMLFormatter.format(sb.toString(), 4, false));

        final String NEWLINE = System.getProperty("line.separator");
        sb.setLength(0);
        sb.append("<rpc-reply>");sb.append(NEWLINE);
        sb.append("<data>");sb.append(NEWLINE);
        sb.append("<smth/>");sb.append(NEWLINE);
        sb.append("</data>");sb.append(NEWLINE);
        sb.append("</rpc-reply>");

        System.out.println(sb.toString());
        System.out.println();
        System.out.println(XMLFormatter.format(sb.toString(), 4, false));
    }
}

This code should not be bothered by those newlines, should it? Is this a bug or am I missing something vital here? The output for the code snippet:

<rpc-reply><data><smth/></data></rpc-reply>

<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply>
    <data>
        <smth/>
    </data>
</rpc-reply>

<rpc-reply>
<data>
<smth/>
</data>
</rpc-reply>

<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply>
<data>
<smth/>
</data>
</rpc-reply>

As far as I can tell, my code differs from other examples only in that I use StringWriter and StringReader for the transform(in, out) method. I have already tried converting the XML to a ByteArrayOutputStream, and even parsing it with DOM and then feeding it to the transformer, but the result is the same. I would really like to know why this only works for single-line strings.

I'm using JDK 1.6 update 24 with NetBeans 6.9.1.

This question is related to, but not the same as, the following (and probably many others):

How to pretty print XML from Java?

indent XML text with Transformer

Indent XML made with Transformer

1 Answer

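I have come to the conclusion that this is normal behavior for Transformer. Its indenting feature is not meant to be used as a pretty printer, at least not on its own. Pretty-printing XML changes the document's structure, unless you know exactly what the document is supposed to look like (based on its XSD, DTD, or something similar). That is the only way to determine which newlines are ignorable whitespace and which are actual element values or parts of them. Transformer does not reformat existing whitespace, which is why my code produces the output shown in the question.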
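To make the point about existing whitespace concrete, here is a small sketch that parses the two test strings from the question with a default, non-validating DOM parser and counts the children of the root element. The class and method names (WhitespaceNodesDemo, childCount) are made up for this illustration. Without a DTD or schema the parser cannot tell that the newlines are ignorable, so it keeps them as text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class WhitespaceNodesDemo {

    // Hypothetical helper: number of direct child nodes of the root element.
    public static int childCount(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        // Only the <data> element.
        System.out.println(childCount(
                "<rpc-reply><data><smth/></data></rpc-reply>"));
        // A newline text node, the <data> element, another newline text node.
        System.out.println(childCount(
                "<rpc-reply>\n<data>\n<smth/>\n</data>\n</rpc-reply>"));
    }
}
```

The second document contains text nodes that the first does not, and the serializer has no basis for deciding that it may rewrite them.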

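So if you want to pretty-print an already pretty-printed XML string, with Transformer or any other class, you first have to get rid of the ignorable whitespace, and the only safe way to do that is to know what the structure of your XML document is supposed to be. I would like someone to confirm this, because at the moment it is only my assumption. If it is correct, how do third-party pretty printers do it? I know JTidy does not require an XSD, yet it pretty-prints anyway. Does it simply treat all whitespace as ignorable unless it is contained in a text XML node? Is there any other way to determine and eliminate ignorable whitespace?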
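One possible workaround, if you are willing to assume that every whitespace-only text node is ignorable, is to remove those nodes from a DOM tree before handing it to Transformer. The sketch below does that with an XPath query. The class name WhitespaceStrippingFormatter is made up for this illustration, and the assumption is not safe in general: it would damage mixed content and anything relying on xml:space="preserve". The indent-amount property is the same implementation-specific one used in the question, so other Transformer implementations may ignore it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WhitespaceStrippingFormatter {

    public static String format(String xml, int indent) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // ASSUMPTION: all whitespace-only text nodes are ignorable.
        NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space(.) = '']",
                        doc, XPathConstants.NODESET);
        for (int i = 0; i < blanks.getLength(); i++) {
            Node blank = blanks.item(i);
            blank.getParentNode().removeChild(blank);
        }

        Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // Implementation-specific property, as in the question's code.
        transformer.setOutputProperty(
                "{http://xml.apache.org/xslt}indent-amount",
                String.valueOf(indent));

        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }
}
```

With the whitespace-only nodes removed, the single-line and multi-line strings from the question produce the same DOM tree, so the transformer should format them identically.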

answered 2011-08-17T07:09:56.517