c++ - PugiXML C++: getting the content of an element (or a tag)

Question

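OK, I'm using PugiXML in C++ with Visual Studio 2010 to get the content of an element, but it stops reading the value as soon as it sees a "<". So I don't get the whole value; I only get the content up to the first "<" character, even when that "<" does not close the element. I want it to read up to the element's closing tag. It's fine if the inner tags themselves are ignored, but I want at least the text inside the inner tags.

I would also like to know how to get the outer XML. For example, if I fetch the element

pugi::xpath_node_set tools = doc.select_nodes("/mesh/bounds/b"); what do I do to get the whole content, which is " Link Till here"?

The content is the same as the one given in this code: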

#include "pugixml.hpp"

#include <iostream>
#include <string>
#include <conio.h>
#include <stdio.h>

using namespace std;

int main() {
    string source = "<mesh name='sphere'><bounds><b id='hey'> <a DeriveCaptionFrom='lastparam' name='testx' href='http://www.google.com'>Link Till here<b>it will stop here and ignore the rest</b> text</a></b> 0 1 1</bounds></mesh>";

    int from_string;
    from_string = 1;

    pugi::xml_document doc;
    pugi::xml_parse_result result;
    string filename = "xgconsole.xml";
    result = doc.load_buffer(source.c_str(), source.size());
    /* result = doc.load_file(filename.c_str());
    if(!result){
        cout << "File " << filename.c_str() << " couldn't be found" << endl;
        _getch();
        return 0;
    } */

    pugi::xpath_node_set tools = doc.select_nodes("/mesh/bounds/b/a[@href='http://www.google.com' and @DeriveCaptionFrom='lastparam']");

    for (pugi::xpath_node_set::const_iterator it = tools.begin(); it != tools.end(); ++it) {
        pugi::xpath_node node = *it;
        std::cout << "Attribute Href: " << node.node().attribute("href").value() << endl;
        std::cout << "Value: " << node.node().child_value() << endl;
        std::cout << "Name: " << node.node().name() << endl;
    }

    _getch();
    return 0;
}

This is the output:

Attribute Href: http://www.google.com
Value: Link Till here
Name: a

I hope I was clear enough. Thanks in advance.

score 7 · Accepted Answer

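My psychic powers tell me that you want to know how to get the concatenated text of all of the node's children (aka inner text).

The easiest way to do that is to use XPath like this:

pugi::xml_node node = doc.child("mesh").child("bounds").child("b");
string text = pugi::xpath_query(".").evaluate_string(node);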

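Obviously, you can write your own recursive function that concatenates the PCDATA/CDATA values from the subtree. Alternatively, a built-in recursive traversal facility such as find_node would also work (using C++11 lambda syntax):

string text;
node.find_node([&](pugi::xml_node n) -> bool { if (n.type() == pugi::node_pcdata) text += n.value(); return false; });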

Now, if you want to get the entire contents of the tag (aka outer xml), you can output the node to a string stream, i.e.:

ostringstream oss;
node.print(oss);
string xml = oss.str();

Getting the inner xml will require iterating through the node's children and appending their outer xml to the result, i.e.

ostringstream oss;
for (pugi::xml_node_iterator it = node.begin(); it != node.end(); ++it)
    it->print(oss);
string xml = oss.str();

score 2 · Accepted Answer

That's how XML works. You can't embed < or > directly in your values. Escape them (e.g. using entities such as &lt; and &gt;) or define a CDATA section.
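
To illustrate the escaping option, here is a small sketch in plain C++ with no pugixml dependency. The function name escape_xml_text is hypothetical. In XML text, & and < must be escaped; > is escaped here as well by convention. This only matters when you assemble XML strings by hand, since pugixml escapes text itself when it writes a document (unless that is switched off with the format_no_escapes flag).

```cpp
#include <string>

// Hypothetical helper: replaces the characters that are not safe to place
// literally inside XML text with their predefined entities.
std::string escape_xml_text(const std::string& raw)
{
    std::string out;
    out.reserve(raw.size());
    for (char c : raw)
    {
        switch (c)
        {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;";  break;
        case '>': out += "&gt;";  break;
        default:  out += c;       break;
        }
    }
    return out;
}
```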

score 1 · Accepted Answer

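I struggled a lot with getting a subtree including all of its elements and sub-nodes. The easiest way is almost what is shown here:

You should use the following code: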

ostringstream oss;
oNode.print(oss, "", format_raw);
sResponse = oss.str();

If needed, put pugi:: in front of each function, and use the node you want instead of oNode.
