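I have an XML serializer that works fine for ASCII, but when it encounters non-ASCII characters they are replaced with question marks ("?"). I believe I have configured it correctly for UTF-8, so I'm not sure why it does this.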

XmlSerializer xmls = new XmlSerializer(typeof(T));
using (MemoryStream ms = new MemoryStream())
{
    XmlWriterSettings settings = new XmlWriterSettings();
    XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
    ns.Add("", "");

    settings.Encoding = Encoding.UTF8;
    settings.Indent = true;
    settings.NewLineChars = "\n";
    settings.NewLineHandling = NewLineHandling.None;
    settings.NewLineOnAttributes = false;
    settings.ConformanceLevel = ConformanceLevel.Document;
    settings.OmitXmlDeclaration = true;

    using (XmlWriter writer = XmlTextWriter.Create(ms, settings))
    {
        xmls.Serialize(writer, obj, ns);
    }

    string xml = Encoding.UTF8.GetString(ms.ToArray());

    // remove the BOM character at the beginning which screws up decoding
    if (xml.Length > 0 && xml[0] != '<')
    {
        xml = xml.Substring(1, xml.Length - 1);
    }

    return xml;
}
1 Answer

Everything looks fine here; with:

public class Foo
{
    public string Bar { get; set; }
}
...
string xml = Test(new Foo { Bar = "Jalapeño" });

the output is:

<Foo>
  <Bar>Jalapeño</Bar>
</Foo>

As a minor change, I removed the "remove the BOM character" code entirely and instead suppressed the BOM explicitly in the encoding:

settings.Encoding = new UTF8Encoding(false);
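For reference, here is a minimal sketch of how the whole method might look with that change applied. The question only shows the method body, so the `XmlDemo` class, the generic `Test<T>` signature, and the `Main` driver are assumptions added to make it compile on its own; the settings are the ones from the question, minus the BOM-stripping step.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public class Foo
{
    public string Bar { get; set; }
}

public static class XmlDemo
{
    // Hypothetical wrapper: the question does not show the method signature.
    public static string Test<T>(T obj)
    {
        var xmls = new XmlSerializer(typeof(T));

        // An empty prefix/namespace pair suppresses the default xmlns:xsi/xsd attributes.
        var ns = new XmlSerializerNamespaces();
        ns.Add("", "");

        var settings = new XmlWriterSettings
        {
            // UTF-8 with no byte order mark, so nothing has to be stripped afterwards.
            Encoding = new UTF8Encoding(false),
            Indent = true,
            NewLineChars = "\n",
            OmitXmlDeclaration = true
        };

        using (var ms = new MemoryStream())
        {
            // Dispose the writer before reading the stream so everything is flushed.
            using (XmlWriter writer = XmlWriter.Create(ms, settings))
            {
                xmls.Serialize(writer, obj, ns);
            }

            // Decode with the same encoding that was used for writing.
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    public static void Main()
    {
        string xml = Test(new Foo { Bar = "Jalapeño" });
        Console.WriteLine(xml);
    }
}
```

Because `UTF8Encoding(false)` never emits a BOM, the returned string starts directly with `<`, and the `xml[0] != '<'` check from the question is no longer needed.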

Also, if I include the XML declaration to check which encoding it thinks it is using:

<?xml version="1.0" encoding="utf-8"?>
<Foo>
  <Bar>Jalapeño</Bar>
</Foo>

So basically... cannot reproduce.

answered 2012-10-27T21:28:44.317