I am reading a directory of XML files into XmlDocument objects using a technique like this.

private static void StripAttributes(string filePath)
{
    Contract.Requires(filePath != null);
    var xmlDocument = new XmlDocument();
    var encode = Encoding.GetEncoding("ISO-8859-1");
    using (var sr = new StreamReader(filePath, encode))
    {
        xmlDocument.Load(sr);
    }
    // (the rest of the method is not shown in the question)
}

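This works, but when the output XML is viewed in a text editor, the single quotes around attribute values have become double quotes, and child nodes now sit on separate lines.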

Example before:

<xml>
  <xml2>
     <xmlField id='foo' string='bar'><xmlValue>foobar</xmlValue></xmlField>
  </xml2>
</xml>

Example after formatting:

<xml>
  <xml2>
     <xmlField id="foo">
        <xmlValue>foobar</xmlValue>
     </xmlField>
  </xml2>
</xml>

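I need the original formatting to stay unchanged so that I can compare the files.

Any ideas on how to preserve the original formatting of the XML?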

2 Answers

Whitespace

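Your first issue is whitespace. In XML this is usually not significant, so by default XmlDocument does not preserve insignificant whitespace, which is what you are seeing here.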

To change this behaviour, set PreserveWhitespace = true before loading the XML:

var xmlDocument = new XmlDocument
{
    PreserveWhitespace = true
};

Quotes

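Your second issue concerns the quote character. Either single or double quotes are valid, but the default in .NET is double. Both DOMs (XmlDocument and XDocument) rewrite your XML using an XmlWriter internally, and that writer uses this default. You can, of course, supply your own XmlWriter instance.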

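The guidance is to use the XmlWriter.Create factory method and specify any features via XmlWriterSettings, but that does not work in this case because XmlWriterSettings has no option for the quote character. You have to create an XmlTextWriter instance explicitly and change its QuoteChar: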

var writer = new XmlTextWriter(fileName, encoding)
{
    QuoteChar = '\''
};

using (writer)
{
    xmlDocument.WriteTo(writer);
}
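
Putting the two settings together, a minimal round-trip sketch might look like the following. The class and method names (XmlRoundTrip, RoundTrip) are made up for illustration, and it writes to a StringWriter rather than a file so the result is easy to compare with the input. It assumes the input uses single quotes throughout and has no XML declaration.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original answer.
static class XmlRoundTrip
{
    public static string RoundTrip(string xml)
    {
        // Keep whitespace-only nodes instead of dropping them on load.
        var xmlDocument = new XmlDocument { PreserveWhitespace = true };
        xmlDocument.LoadXml(xml);

        var output = new StringWriter();

        // XmlTextWriter defaults to Formatting.None, so it adds no indentation;
        // QuoteChar switches attribute delimiters to single quotes.
        var writer = new XmlTextWriter(output) { QuoteChar = '\'' };
        using (writer)
        {
            xmlDocument.WriteTo(writer);
        }

        return output.ToString();
    }
}
```

Comparing RoundTrip(input) with input is a quick way to check whether a given file survives unchanged. Details this sketch does not cover, such as mixed quote styles or character entities, may still be rewritten.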

LINQ to XML

As an aside, I strongly suggest looking at LINQ to XML rather than the old XmlDocument API. To get similar behaviour with XDocument, you can parse and write like this:

var doc = XDocument.Load(filePath, LoadOptions.PreserveWhitespace);
doc.WriteTo(writer);

If, as your code suggests, you want to remove attributes, then something as simple as this will remove all attributes named string from elements named xmlField:

doc.Descendants("xmlField")
    .SelectMany(e => e.Attributes("string"))
    .Remove();

answered 2015-07-09T16:35:44.823

Possibly you cannot! With Microsoft's .NET implementation of XML rendering, the renderer always reformats the output, whether you use XmlDocument or XDocument, and with any kind of setting. In one of my projects (Efatura in Turkey) the XML files are XAdES-signed and must not be changed. We found that merely loading and saving a file, without making any changes, alters something in the XML and renders the signature invalid. Also, if for example the input XML is a single line (without any whitespace), the parsers (all of them) fail to parse the document. The effect we observed is that the parser misses some elements, acting as if they were not there.

For your situation I suggest trying other XML implementations. In our case, since we don't need to change anything, we first keep the whole original string separately, then parse a copy of the document to extract information from it. When finished, we throw the parsed copy away.

For single-line XML we used XmlReader but altered the matching mechanism.
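
The "keep the original, parse a copy" approach could be sketched roughly as below. This is only an illustration under assumptions: the names ReadOnlyXml and LoadForInspection are invented, the original is held as raw bytes (rather than a string) so that no encoding conversion touches it, and XDocument is used for the throwaway copy.

```csharp
using System.IO;
using System.Xml.Linq;

// Hypothetical helper illustrating the idea; not the code used in the project.
static class ReadOnlyXml
{
    public static (byte[] Original, XDocument Parsed) LoadForInspection(string path)
    {
        // The untouched bytes are what gets stored, compared, or verified later.
        byte[] original = File.ReadAllBytes(path);

        // The parsed copy is used only to read values and is never saved back.
        using (var stream = new MemoryStream(original))
        {
            XDocument parsed = XDocument.Load(stream, LoadOptions.PreserveWhitespace);
            return (original, parsed);
        }
    }
}
```

Because nothing is ever written from the parsed copy, the serializer never gets a chance to reformat the file.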

answered 2015-07-09T16:51:14.697