6

I have the following code:

    MemoryStream ms = new MemoryStream();
    XmlWriter w = XmlWriter.Create(ms);

    w.WriteStartDocument(true);
    w.WriteStartElement("data");

    w.WriteElementString("child", "myvalue");

    w.WriteEndElement();//data
    w.Close();
    ms.Close();

    string test = UTF8Encoding.UTF8.GetString(ms.ToArray());

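The XML is generated correctly; however, my problem is that the first character of the string 'test' is ï (char #239), which makes it invalid for some XML parsers. Where does it come from, and what exactly am I doing wrong?

I know I could work around it by starting after the first character, but I would rather understand why it is there than simply patch over the problem.

Thanks!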

4 Answers

13

Found a solution here: https://timvw.be/2007/01/08/generating-utf-8-with-systemxmlxmlwriter/

I was missing this at the top:

XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
xmlWriterSettings.Encoding = new UTF8Encoding(false);
MemoryStream ms = new MemoryStream();
XmlWriter w = XmlWriter.Create(ms, xmlWriterSettings);

Thanks everyone for the help!
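
For readers wondering where the extra character comes from: a likely explanation is that the default `Encoding.UTF8` used by `XmlWriter.Create(Stream)` writes a three-byte byte order mark (EF BB BF, whose first byte is 239) ahead of the XML declaration, whereas `new UTF8Encoding(false)` writes none. The sketch below compares the two; the `BomDemo` class and `Serialize` helper are names made up for this illustration and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class BomDemo
{
    // Hypothetical helper: writes the question's document with the given
    // encoding and returns the raw bytes.
    static byte[] Serialize(Encoding encoding)
    {
        var settings = new XmlWriterSettings { Encoding = encoding };
        using (var ms = new MemoryStream())
        {
            using (XmlWriter w = XmlWriter.Create(ms, settings))
            {
                w.WriteStartDocument(true);
                w.WriteStartElement("data");
                w.WriteElementString("child", "myvalue");
                w.WriteEndElement(); // data
            }
            return ms.ToArray();
        }
    }

    static void Main()
    {
        byte[] withBom = Serialize(Encoding.UTF8);
        byte[] withoutBom = Serialize(new UTF8Encoding(false));

        // 239 (0xEF) is the first byte of a UTF-8 byte order mark.
        Console.WriteLine(withBom[0]);
        // 60 is '<', the start of the XML declaration.
        Console.WriteLine(withoutBom[0]);
        // The two outputs should differ only by the 3-byte preamble.
        Console.WriteLine(withBom.Length - withoutBom.Length);
    }
}
```

If the first line prints 239 and the second prints 60, the stray character is the byte order mark rather than anything in the XML itself.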

answered 2009-05-14T14:01:56.787
2

The problem is that the XML produced by the writer is UTF-16, while you are converting it to a string using UTF-8. Try this:

StringBuilder sb = new StringBuilder();
using (StringWriter writer = new StringWriter(sb))
using (XmlWriter w = XmlWriter.Create(writer))
{
    w.WriteStartDocument(true);
    w.WriteStartElement("data");

    w.WriteElementString("child", "myvalue");

    w.WriteEndElement();//data
}

string test = sb.ToString();
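
One thing to keep in mind with this approach: because the target is a `StringWriter`, the XML declaration in the resulting string will normally say `encoding="utf-16"`. If the string is later saved or sent as UTF-8, a common workaround is a small `StringWriter` subclass that reports a different encoding. The sketch below assumes such a subclass; `Utf8StringWriter` and `StringWriterDemo` are hypothetical names introduced here, not framework types.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding,
// so the XML declaration names utf-8 instead of utf-16.
sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => new UTF8Encoding(false);
}

static class StringWriterDemo
{
    // Writes the question's document to the given writer and returns the text.
    static string Write(StringWriter target)
    {
        using (XmlWriter w = XmlWriter.Create(target))
        {
            w.WriteStartDocument(true);
            w.WriteStartElement("data");
            w.WriteElementString("child", "myvalue");
            w.WriteEndElement(); // data
        }
        return target.ToString();
    }

    static void Main()
    {
        // Declaration is expected to name utf-16 here.
        Console.WriteLine(Write(new StringWriter()));
        // Declaration is expected to name utf-8 here.
        Console.WriteLine(Write(new Utf8StringWriter()));
    }
}
```

Either way the string itself contains no byte order mark, since a .NET string is a sequence of characters rather than encoded bytes.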

answered 2009-05-14T13:56:40.370
0

You can change the encoding like this:

w.Settings.Encoding = Encoding.UTF8;

answered 2009-05-14T13:54:39.113
0

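All of these are slightly off if you care about the byte order mark that editors rely on (for example, Visual Studio uses it to detect UTF-8 encoded XML and highlight the syntax correctly).

Here is a solution: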

string result; // receives the decoded XML text
MemoryStream stream = new MemoryStream();

XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
settings.Indent = true;
settings.IndentChars = "\t";

using (XmlWriter writer = XmlWriter.Create(stream, settings))
{
    // ... write

    // Make sure you flush or you only get half the text
    writer.Flush();

    // Use a StreamReader to get the byte order correct
    StreamReader reader = new StreamReader(stream,Encoding.UTF8,true);
    stream.Seek(0, SeekOrigin.Begin);
    result = reader.ReadToEnd();
}
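
To illustrate why reading back through a `StreamReader` helps, the sketch below decodes the same bytes twice: once with `Encoding.UTF8.GetString`, which keeps the byte order mark as a leading U+FEFF character, and once with a `StreamReader` that detects and consumes it. The byte array and the `ReaderDemo` class are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class ReaderDemo
{
    static void Main()
    {
        // UTF-8 byte order mark followed by the four bytes of "<a/>".
        byte[] bytes = { 0xEF, 0xBB, 0xBF, (byte)'<', (byte)'a', (byte)'/', (byte)'>' };

        // GetString decodes the mark as an ordinary character.
        string viaGetString = Encoding.UTF8.GetString(bytes);

        // StreamReader with BOM detection enabled skips the mark.
        string viaReader;
        using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8, true))
        {
            viaReader = reader.ReadToEnd();
        }

        Console.WriteLine(viaGetString.Length);         // 5: U+FEFF plus "<a/>"
        Console.WriteLine(viaReader.Length);            // 4: just "<a/>"
        Console.WriteLine(viaGetString[0] == '\uFEFF'); // True
    }
}
```

This keeps the byte order mark in the stream for editors that want it while giving parsers a string that starts with `<`.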

There are 2 complete snippets here

answered 2009-06-02T15:44:24.443