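I am running a web application in a Tomcat 8.0 servlet container. While handling a request, I try to convert the input data to XML using the code below. The first character of the input is the Unicode supplementary character U+16980, represented by the surrogate pair \ud81a\udd80; the second is another supplementary character, U+16990, represented by the surrogate pair \ud81a\udd90.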
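import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// U+16980 and U+16990 written as surrogate-pair escapes
String text = "\ud81a\udd80 \ud81a\udd90 � �";
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
// build <root><sofa>text</sofa></root> in memory
Document document = documentBuilder.newDocument();
Element root = document.createElement("root");
document.appendChild(root);
Element node = document.createElement("sofa");
node.appendChild(document.createTextNode(text));
root.appendChild(node);
Source xmlSource = new DOMSource(document);
// create TransformerFactory
TransformerFactory transformerFactory = TransformerFactory.newInstance();
// create Transformer for transformation
Transformer transformer = transformerFactory.newTransformer();
// the transformer does not close a caller-supplied stream, so close it here
try (OutputStream out = new FileOutputStream("text.xml")) {
    // create StreamResult for transformation result
    javax.xml.transform.Result result = new StreamResult(out);
    // transform and write the result to text.xml
    transformer.transform(xmlSource, result);
}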
I expect: <root><sofa>𖦀 𖦐 � �</sofa></root>
But instead I get: <root><sofa>�� �� � �</sofa>
</root>
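To rule out the input string itself, here is a minimal sketch of a check that the two surrogate pairs really decode to U+16980 and U+16990 before they reach the DOM. The class name SurrogateCheck and its helper methods are made up for illustration and are not part of my application; only standard java.lang calls are used.

```java
public class SurrogateCheck {

    // Combine a high/low surrogate pair into the code point it encodes.
    static int codePointOf(char high, char low) {
        return Character.toCodePoint(high, low);
    }

    // List every code point of the string as U+XXXX, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Only the two supplementary characters described in the question.
        String text = "\ud81a\udd80 \ud81a\udd90";

        System.out.println(describe(text));                         // U+16980 U+0020 U+16990
        System.out.println(text.length());                          // 5 UTF-16 code units
        System.out.println(text.codePointCount(0, text.length()));  // 3 code points

        // Decimal value of the whole code point versus the two surrogate halves.
        System.out.println(codePointOf('\ud81a', '\udd80'));        // 92544
        System.out.println((int) '\ud81a');                         // 55322
        System.out.println((int) '\udd80');                         // 56704
    }
}
```

If the string passes this check, the text node holds the right characters, and the difference between expected and actual output would have to come from how the transformer serializes surrogate pairs rather than from the input.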