c# - Is something wrong with my XPath? Parsing XML from a DocX file using Package.GetPart

Question

I have a class named Reader:

public class Reader

This is the constructor:

public Reader(string fileName)
        {
            using (Package package = Package.Open(AppDomain.CurrentDomain.BaseDirectory + "\\" + fileName + ".docx"))
            {
                Document = new XmlDocument();
                Document.Load(package.GetPart(new Uri("/word/document.xml", UriKind.Relative)).GetStream());
                xmlNamespaceManager = new XmlNamespaceManager(Document.NameTable);
                xmlNamespaceManager.AddNamespace("w", @"http://schemas.microsoft.com/office/word/2006/wordml");
            }
        }

There is also a public method named ReadTextNodes, which I set up to test it:

public void ReadTextNodes()
        {
            var nodes = Document.SelectNodes("//w:t", xmlNamespaceManager);
            Console.WriteLine(nodes.Count);
            foreach (XmlNode node in nodes)
            {
                Console.WriteLine(node.InnerText);
            }
        }

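The XPath I am using is "//w:t". I have bound the prefix "w" to what I believed to be the XML namespace Word uses ("http://schemas.microsoft.com/office/word/2006/wordml"). However, this query returns zero nodes. When I replace it with "//*", the console quickly fills with text. So what is wrong with the first query?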

score 1 · Accepted Answer

I found that I was using the wrong schema URI. I saved the docx file as an XML file, opened it in Visual Studio, and found that "w" is actually mapped to "http://schemas.openxmlformats.org/wordprocessingml/2006/main" rather than "http://schemas.microsoft.com/office/word/2006/wordml".
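To illustrate why the URI matters, here is a minimal sketch that runs the same "//w:t" query against a small in-memory document, once with each URI. The class name NamespaceDemo, the helpers CountTextNodes and DeclaredNamespace, and the SampleXml string are hypothetical stand-ins made up for this example; they are not part of the original Reader class, and the sample XML only imitates the shape of /word/document.xml. XPath matches on the namespace URI, not the prefix, so a prefix bound to the wrong URI selects nothing.

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Namespace URI that the answer found "w" mapped to.
    public const string WordMl =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // URI used in the question's constructor.
    public const string WrongUri =
        "http://schemas.microsoft.com/office/word/2006/wordml";

    // Hypothetical stand-in for the contents of /word/document.xml.
    public static readonly string SampleXml =
        "<w:document xmlns:w=\"" + WordMl + "\">" +
        "<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body>" +
        "</w:document>";

    // Counts the elements matched by //w:t when "w" is bound to nsUri.
    public static int CountTextNodes(string xml, string nsUri)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("w", nsUri);
        return doc.SelectNodes("//w:t", ns).Count;
    }

    // Reads the URI that the document itself declares for a prefix,
    // as an alternative to inspecting the file by hand.
    public static string DeclaredNamespace(string xml, string prefix)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.GetNamespaceOfPrefix(prefix);
    }

    public static void Main()
    {
        Console.WriteLine(CountTextNodes(SampleXml, WordMl));   // prints 1
        Console.WriteLine(CountTextNodes(SampleXml, WrongUri)); // prints 0
        Console.WriteLine(DeclaredNamespace(SampleXml, "w"));   // prints the WordMl URI
    }
}
```

In the constructor from the question, the equivalent change would presumably be to pass the openxmlformats URI to AddNamespace; the prefix "w" itself can stay as it is.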

