c# - How to check whether a text node in an XML file contains child nodes and get all the data from it?

Question

I have an XML file. This is just part of it:

1    <mainTerm>
2      <title> Abandonment </title>
3      <see> Maltreatment </see>
4    </mainTerm>
5    <mainTerm>
6      <title> Abasia <nemod>(-astasia) (hysterical) </nemod></title>
7      <code>F44.4</code>
8    </mainTerm>

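I have many <mainTerm> elements and I iterate over all of them. I copy all the data from each element, but when I reach line 6 I run into a problem. How can I copy all of that content? I need to end up with a string that looks like "Abasia (-astasia) (hysterical)".

This is the part of my application that works with the file: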

List<string> nodes = new List<string>();

//Create XmlReaderSettings object
XmlReaderSettings settings = new XmlReaderSettings();
settings.IgnoreWhitespace = true;
settings.IgnoreComments = true;

//Create XmlReader object
XmlReader xmlIn = XmlReader.Create(path, settings);

Excel.Application xlApp;
Excel.Workbook xlWorkBook;
Excel.Worksheet xlWorkSheet;
object misValue = System.Reflection.Missing.Value;

xlApp = new Excel.Application();
xlWorkBook = xlApp.Workbooks.Add(misValue);

if (xmlIn.ReadToDescendant("mainTerm"))
{
    do
    {
        xmlIn.ReadStartElement("mainTerm");

        nodes.Add(xmlIn.ReadElementContentAsString());

        nodes.Add(xmlIn.ReadElementContentAsString());

    } while (xmlIn.ReadToNextSibling("mainTerm"));
}

score 1 · Accepted Answer

You can use LINQ to XML. Just wrap your XML structure in a root node and fetch all the title elements, like this:

var xmlSrc = @"<?xml version=""1.0"" encoding=""UTF-8""?><xml><mainTerm>
  <title> Abandonment </title>
  <see> Maltreatment </see>
</mainTerm>
<mainTerm>
  <title> Abasia <nemod>(-astasia) (hysterical) </nemod></title>
  <code>F44.4</code>
</mainTerm></xml>";

var xml = XDocument.Parse(xmlSrc);
var mainTerms = xml.Root.Elements("mainTerm").ToList();
var titles = mainTerms.Elements("title").ToList();
foreach (var title in titles)
{
    System.Console.WriteLine(title.Value);
}

The output is:

Abandonment 
Abasia (-astasia) (hysterical)

IMHO this is much easier than XPath and XmlReader.

With the Descendants function, your mainTerm elements do not need to be direct children of the root element:

var mainTerms = xml.Root.Descendants("mainTerm").ToList();

This line gives you every mainTerm element in the XML document, at any level.
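
As a minimal sketch of the approach above, the following wraps the lookup in a helper and trims each value, since `XElement.Value` concatenates the text of all descendant nodes (including `<nemod>`) and keeps the surrounding whitespace. The class name `TitleReader` and method name `ReadTitles` are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TitleReader
{
    // Hypothetical helper: returns the trimmed text of every <title>
    // that is a direct child of a <mainTerm>, at any depth in the document.
    public static List<string> ReadTitles(string xmlSrc)
    {
        var xml = XDocument.Parse(xmlSrc);
        return xml.Descendants("mainTerm")
                  .Elements("title")
                  // Value joins the text of the element and all its descendants.
                  .Select(t => t.Value.Trim())
                  .ToList();
    }

    public static void Main()
    {
        var xmlSrc = @"<xml><mainTerm>
  <title> Abandonment </title>
  <see> Maltreatment </see>
</mainTerm>
<mainTerm>
  <title> Abasia <nemod>(-astasia) (hysterical) </nemod></title>
  <code>F44.4</code>
</mainTerm></xml>";

        foreach (var title in ReadTitles(xmlSrc))
        {
            Console.WriteLine(title);
        }
        // Prints:
        // Abandonment
        // Abasia (-astasia) (hysterical)
    }
}
```

If the text inside `<nemod>` should be handled separately, `title.Element("nemod")` returns that child or null when it is absent.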

score 0 · Accepted Answer

You need to use XmlDocument. I find it much easier to work with than XmlReader. Using your example...

1    <mainTerm>
2      <title> Abandonment </title>
3      <see> Maltreatment </see>
4    </mainTerm>
5    <mainTerm>
6      <title> Abasia <nemod>(-astasia) (hysterical) </nemod></title>
7      <code>F44.4</code>
8    </mainTerm>

Do something like this (sorry, I don't have time to run it through a compiler, but this is the general idea):

XmlDocument example = new XmlDocument();

// LoadXml parses a string; use Load for a file path or stream.
// XmlDocument requires a single root element, so wrap the fragment if needed.
example.LoadXml(StringWhereYouHaveYourXML);

// "//mainTerm" selects every mainTerm element at any depth.
XmlNodeList mainTerms = example.SelectNodes("//mainTerm");

// Check for children in your nodes now...
foreach (XmlNode a in mainTerms)
{
    // Relative path: the title child of the current mainTerm.
    XmlNode title = a.SelectSingleNode("title");
    if (title != null && title.HasChildNodes)
    {
        // nemod is null when the title holds only text.
        XmlNode abasia = title.SelectSingleNode("nemod");
    }
}

From there you can do whatever you want with the nodes, and you now have each one pinned down more specifically.
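
To get the combined string the question asks for with XmlDocument, one option is `XmlNode.InnerText`, which returns the concatenated text of a node and all its children. The sketch below assumes the fragment has been wrapped in a single root element; the names `InnerTextReader` and `ReadTitles` are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class InnerTextReader
{
    // Hypothetical helper: collects the full text of every mainTerm/title,
    // including text nested in child elements such as <nemod>.
    public static List<string> ReadTitles(string xmlSrc)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xmlSrc); // the string must have exactly one root element

        var titles = new List<string>();
        foreach (XmlNode title in doc.SelectNodes("//mainTerm/title"))
        {
            // InnerText joins the text of the node and all of its descendants.
            titles.Add(title.InnerText.Trim());
        }
        return titles;
    }

    public static void Main()
    {
        var xmlSrc = "<xml><mainTerm><title> Abasia <nemod>(-astasia) (hysterical) </nemod></title><code>F44.4</code></mainTerm></xml>";
        foreach (var title in ReadTitles(xmlSrc))
        {
            Console.WriteLine(title); // prints "Abasia (-astasia) (hysterical)"
        }
    }
}
```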

score 0 · Accepted Answer

Read up on XPath.

xpath check if node has text

xpath get node text

XmlReader xpath
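
A rough sketch of what the XPath route could look like, using `XPathDocument` and `XPathNavigator` from `System.Xml.XPath`. The string value of an element in XPath is the concatenation of its descendant text nodes, which is what `XPathNavigator.Value` returns. The class and method names here are hypothetical, and the input is assumed to have a single root element.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class XPathTitleReader
{
    // Hypothetical helper: returns each title's text together with a flag
    // saying whether the title contains child elements.
    public static List<KeyValuePair<string, bool>> ReadTitles(string xmlSrc)
    {
        var doc = new XPathDocument(new StringReader(xmlSrc));
        XPathNavigator nav = doc.CreateNavigator();

        var result = new List<KeyValuePair<string, bool>>();
        XPathNodeIterator it = nav.Select("//mainTerm/title");
        while (it.MoveNext())
        {
            // Count the element children (e.g. <nemod>) of the current title.
            bool hasChildElements =
                it.Current.SelectChildren(XPathNodeType.Element).Count > 0;
            // Value is the XPath string value: all descendant text joined.
            result.Add(new KeyValuePair<string, bool>(
                it.Current.Value.Trim(), hasChildElements));
        }
        return result;
    }

    public static void Main()
    {
        var xmlSrc = "<xml><mainTerm><title> Abandonment </title></mainTerm>"
                   + "<mainTerm><title> Abasia <nemod>(-astasia) (hysterical) </nemod></title></mainTerm></xml>";
        foreach (var pair in ReadTitles(xmlSrc))
        {
            Console.WriteLine(pair.Key + " | has child elements: " + pair.Value);
        }
    }
}
```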
