c# - How do I search XML files with XPath in C# and get the line and column number of each node found?

Question

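I want to use C# to search about a hundred human-readable, indented XML files with an XPath expression. For every node that matches the XPath, it should return the file name and the line position (the column position would be nice too).

My goal: all of these files are deserialized into an object tree that has dozens of classes, and even more instances, in the program. Some of its classes are historically grown and need to be evolved, cleaned up, or refactored. Before changing anything, I want to check whether the part being changed is used anywhere, because the files are in use at different customers. Later I will change the XML files automatically with XSLT, alongside the code improvements.

The luxury idea is to build a VS2010 add-in for searching in files, like the normal "Find in Files" dialog but with XPath as the search pattern.

So far I have found nothing suitable here, only something like the reverse of my question: "How to find an XML node from a line and column number in C#?" In my case the input would be an XPath expression, not a node.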

score 4 · Accepted Answer

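I solved it this way (a summary of the code below):

- Read each file into an XPathDocument instance
- Call its CreateNavigator method
- Call the Select method with a regular XPath expression
- Process each node found:
1. Recursively determine its absolute position in the document's XML node tree by calling
  - XPathNavigator.MoveToPrevious()
  - XPathNavigator.MoveToParent()
  - until there is no previous sibling or parent left
  - The result is a list that can be written as an "absolute" XPath, e.g. /*[1]/*[10]/*[1]/*[3], which identifies the same node as the one found.
2. Read the whole XML file again, this time with an XML reader (using the sample code from http://msdn.microsoft.com/en-us/library/system.xml.ixmllineinfo%28v=vs.110%29)
3. Adapt that code so that it tracks the "absolute" position of the current node and compares it with the "absolute" position found earlier, until they match
4. Once they match, the XmlReader's current node holds the information that was wanted:
  - the line and column number of the searched node in the XML file :-)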
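
A note on the "absolute" XPath from step 1: a path with a single leading slash is anchored at the document root, while a path that starts with // can match at any depth. The following is a minimal sketch of that difference. The sample document and the names AbsolutePathDemo and CountMatches are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class AbsolutePathDemo
{
    // Made-up sample document: <b> is the second child of the root element,
    // <y> is the second child of the root's first child.
    public const string Xml = "<r><a><x/><y/></a><b/></r>";

    // Counts how many nodes the given XPath selects in the sample document.
    public static int CountMatches(string xpath)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(Xml)).CreateNavigator();
        return nav.Select(xpath).Count;
    }

    public static void Main()
    {
        // Anchored at the root: selects only <b>.
        Console.WriteLine(CountMatches("/*[1]/*[2]"));
        // Matches at any depth: selects <b> and <y>.
        Console.WriteLine(CountMatches("//*[1]/*[2]"));
    }
}
```

So when the positional path is meant to identify exactly one node, the single-slash form is the safer choice.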

Here is a quickly written code sample (the Search method is the entry point):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.XPath;
using System.Xml;
using System.IO;
using System.Diagnostics;

namespace XmlFileSearchMVP
{

    public class XmlFileSearchModel : IXmlFileSearchModel
    {
        public XmlFileSearchModel()
        {

        }

        /// <summary>
        /// Searches all files in the directory <paramref name="path"/> that match the wildcard pattern in
        /// <paramref name="filter"/> for the XPath and reports each found position as a string in the return
        /// value, containing:
        /// + absolute file name path
        /// + number of found occurrence
        /// + line number
        /// + column number
        /// </summary>
        /// <param name="path">file system path, containing the XML files, that have to be searched</param>
        /// <param name="filter">file name with wildcards, e. g. "*.xml" or "*.mscx" or "*.gpx"</param>
        /// <param name="xPath">XPath expression (currently only Xml Node resulting XPaths ok)</param>
        /// <returns></returns>
        public IList<string> Search(string path, string filter, string xPath)
        {

            string[] files = System.IO.Directory.GetFiles(path, filter);

            var result = new List<string>();

            for (int i = 0; i < files.Length; i++)
            {
                XPathDocument xpd = new XPathDocument(files[i]);
                var xpn = xpd.CreateNavigator();
                var xpni = xpn.Select(xPath);

                int foundCounter = 0;
                while (xpni.MoveNext())
                {
                    foundCounter++;
                    var xn = xpni.Current;

                    var xnc = xn.Clone();

                    List<int> positions = new List<int>();
                    GetPositions(xn, ref positions);
                    string absXPath = GetAbsoluteXPath(positions);

                    // ok if xPath is looking for an element
                    var xpn2 = xpn.SelectSingleNode(absXPath);
                    bool samePosition = xnc.IsSamePosition(xpn2);

                    int y = -1;
                    int x = -1;
                    bool gotIt = GotFilePosition(files[i], positions, ref y, ref x);

                    result.Add(string.Format("{0} No. {1}: {2} {3} line {4}, col {5}", files[i], foundCounter, absXPath, gotIt, y, x));

                }
                result.Add(files[i] + " " + foundCounter.ToString());
            }



            return result;
        }

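        /// <summary>
        /// Evaluates the absolute position of the current node.
        /// </summary>
        /// <param name="node">Navigator positioned on the found node; it is moved up to the root while evaluating.</param>
        /// <param name="positions">Receives the 1-based element index at each level, starting with the root element.</param>
        private static void GetPositions(XPathNavigator node, ref List<int> positions)
        {
            int pos = 1;

            // Count only preceding element siblings: the *[n] steps of the absolute XPath and the
            // XmlReader loop in GotFilePosition both ignore comments, text and processing instructions.
            while (node.MoveToPrevious())
            {
                if (node.NodeType == XPathNodeType.Element)
                {
                    pos++;
                }
            }

            if (node.MoveToParent())
            {
                positions.Insert(0, pos);
                GetPositions(node, ref positions);
            }
        }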

        private static string GetAbsoluteXPath(List<int> positions)
        {
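            // Start empty: each step adds its own "/", so the path begins with a single slash
            // and is anchored at the document root (a leading "//" would match at any depth).
            StringBuilder sb = new StringBuilder(positions.Count * 5 + 1); // at least required...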

            foreach (var pos in positions)
            {
                sb.AppendFormat("/*[{0}]", pos);
            }

            return sb.ToString();

        }


        /// <summary>
        /// base code from
        /// http://msdn.microsoft.com/en-us/library/system.xml.ixmllineinfo%28v=vs.110%29
        /// </summary>
        /// <param name="xmlFile"></param>
        /// <param name="positions"></param>
        /// <param name="line"></param>
        /// <param name="column"></param>
        /// <returns></returns>
        public static bool GotFilePosition(string xmlFile, List<int> positions, ref int line, ref int column)
        {

            // Create the XmlNamespaceManager.
            XmlNamespaceManager nsmgr = new XmlNamespaceManager(new NameTable());

            // Create the XmlParserContext.
            XmlParserContext context = new XmlParserContext(null, nsmgr, null, XmlSpace.None);

            // Create the reader.
            using (FileStream fs = new FileStream(xmlFile, FileMode.Open, FileAccess.Read))
            {
                List<int> currPos = new List<int>();
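                // Note: XmlValidatingReader has been marked obsolete since .NET 2.0; it is kept here from the original code.
                XmlValidatingReader reader = new XmlValidatingReader(fs, XmlNodeType.Element, context);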

                try
                {
                    IXmlLineInfo lineInfo = ((IXmlLineInfo)reader);
                    if (lineInfo.HasLineInfo())
                    {

                        // Parse the XML and display each node.
                        while (reader.Read())
                        {

                            switch (reader.NodeType)
                            {
                                case XmlNodeType.Document:
                                case XmlNodeType.Element:
                                    Trace.Write(string.Format("{0} {1},{2}  ", reader.Depth, lineInfo.LineNumber, lineInfo.LinePosition));

                                    if (currPos.Count <= reader.Depth)
                                    {
                                        currPos.Add(1);
                                    }
                                    else
                                    {
                                        currPos[reader.Depth]++;
                                    }
                                    Trace.WriteLine(string.Format("<{0}> {1}", reader.Name, GetAbsoluteXPath(currPos)));

                                    if (HasFound(currPos, positions))
                                    {
                                        line = lineInfo.LineNumber;
                                        column = lineInfo.LinePosition;
                                        return true;
                                    }
                                    break;

                                case XmlNodeType.Text:
                                    Trace.Write(string.Format("{0} {1},{2}  ", reader.Depth, lineInfo.LineNumber, lineInfo.LinePosition));
                                    Trace.WriteLine(string.Format("{0} {1}", reader.Value, GetAbsoluteXPath(currPos)));
                                    break;

                                case XmlNodeType.EndElement:
                                    Trace.Write(string.Format("{0} {1},{2}  ", reader.Depth, lineInfo.LineNumber, lineInfo.LinePosition));
                                    while (reader.Depth < currPos.Count - 1)
                                    {
                                        currPos.RemoveAt(reader.Depth + 1); // currPos.Count - 1 would work too.
                                    }
                                    Trace.WriteLine(string.Format("</{0}> {1}", reader.Name, GetAbsoluteXPath(currPos)));
                                    break;

                                case XmlNodeType.Whitespace:
                                case XmlNodeType.XmlDeclaration: // 1st read in XML document - hopefully
                                    break;
                                case XmlNodeType.Attribute:

                                case XmlNodeType.CDATA:

                                case XmlNodeType.Comment:

                                case XmlNodeType.DocumentFragment:

                                case XmlNodeType.DocumentType:

                                case XmlNodeType.EndEntity:

                                case XmlNodeType.Entity:

                                case XmlNodeType.EntityReference:

                                case XmlNodeType.None:

                                case XmlNodeType.Notation:

                                case XmlNodeType.ProcessingInstruction:

                                case XmlNodeType.SignificantWhitespace:
                                    break;

                            }


                        }

                    }

                }
                finally
                {
                    reader.Close();
                }
                // Close the reader.

            }
            return false;
        }

        private static bool HasFound(List<int> currPos, List<int> positions)
        {
            if (currPos.Count < positions.Count)
            {
                return false; // tree is not yet so deep traversed, like the target node
            }

            for (int i = 0; i < positions.Count; i++)
            {
                if (currPos[i] != positions[i])
                {
                    return false;
                }
            }
            return true;
        }


    }
}
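
The class above implements IXmlFileSearchModel, which the post does not show. To compile the sample, the interface has to exist somewhere. Below is a minimal sketch of what it might look like, assuming it declares nothing but the Search method; the real interface may well contain more.

```csharp
using System.Collections.Generic;

namespace XmlFileSearchMVP
{
    // Assumed shape of the interface; the original post does not include it.
    public interface IXmlFileSearchModel
    {
        // Same signature as XmlFileSearchModel.Search in the sample above.
        IList<string> Search(string path, string filter, string xPath);
    }
}
```

With that in place, a call could look like new XmlFileSearchModel().Search(@"C:\configs", "*.xml", "//Customer"), where the folder and the XPath are made-up values.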

score 0 · Accepted Answer

This example should help you get started:

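// Requires: using System; using System.IO; using System.Xml; using System.Xml.XPath;
// "xml" is a string that holds the XML text to search.
using (StringReader stream = new StringReader(xml))
{
    XPathDocument document = new XPathDocument(stream);
    XPathNavigator navigator = document.CreateNavigator();

    XPathExpression expression = navigator.Compile("/path/to/element");
    XPathNodeIterator iterator = navigator.Select(expression);

    // No empty catch block here: a malformed document or XPath should surface as an exception.
    while (iterator.MoveNext())
    {
        XPathNavigator current = iterator.Current;
        string elementValue = current.Value;

        // Line information is only usable if the navigator supports it and the document was loaded with it.
        IXmlLineInfo lineInfo = current as IXmlLineInfo;
        if (lineInfo != null && lineInfo.HasLineInfo())
        {
            int lineNumber = lineInfo.LineNumber;
            int linePosition = lineInfo.LinePosition;
            Console.WriteLine("{0}: line {1}, col {2}", current.Name, lineNumber, linePosition);
        }
    }
}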
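
For comparison, here is a minimal sketch of a different technique than either answer uses: LINQ to XML (System.Xml.Linq, available since .NET 3.5) can record line information while loading, and the XPathSelectElements extension method evaluates an XPath against the loaded tree. The class name XPathLineSearch and the sample XML are made up for illustration, and the sketch assumes the XPath selects elements only.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathLineSearch
{
    // Returns (line, column) for every element that the XPath selects in the XML text.
    // XPathSelectElements throws if the expression yields non-element results.
    public static List<Tuple<int, int>> Find(string xml, string xpath)
    {
        // SetLineInfo asks the loader to record the position of each node.
        XDocument doc = XDocument.Parse(xml, LoadOptions.SetLineInfo);
        var hits = new List<Tuple<int, int>>();

        foreach (XElement element in doc.XPathSelectElements(xpath))
        {
            // XObject implements IXmlLineInfo explicitly, so go through the interface.
            IXmlLineInfo info = element;
            if (info.HasLineInfo())
            {
                hits.Add(Tuple.Create(info.LineNumber, info.LinePosition));
            }
        }
        return hits;
    }

    public static void Main()
    {
        // Made-up sample: two <b> elements, on lines 2 and 3.
        string xml = "<a>\n  <b/>\n  <b/>\n</a>";
        foreach (var hit in Find(xml, "/a/b"))
        {
            Console.WriteLine("line {0}, col {1}", hit.Item1, hit.Item2);
        }
    }
}
```

For files on disk, XDocument.Load(fileName, LoadOptions.SetLineInfo) can take the place of XDocument.Parse, combined with Directory.GetFiles as in the first answer.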
