c++ - Getting wrong values with the libxml2 SAX parser

Question

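I am trying to parse an XML file using libxml2's SAX interface. It worked fine for a while, but then I swapped the order of two lines in the XML (it is still valid, of course), and some values became invalid after parsing. I use startElementNsSAX2Func for startElement; it has a parameter const xmlChar ** attributes that holds the attributes of the current element.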

At the start of my startElement method I create a simple object to handle the attributes. Here is the code for that class:

class XMLElementAttributes {
public:
  static const int AttributeArrayWidth = 5;
  static const int LocalNameIndex = 0;
  static const int PrefixIndex = 1;
  static const int URIIndex = 2;
  static const int ValueIndex = 3;
  static const int EndIndex = 4;

  XMLElementAttributes( int nb_attributes, const xmlChar **attributes) :
  nb_attributes(nb_attributes),
  attributes(attributes){
  }

  xmlChar* getLocalName( int index ) const {
    return (xmlChar*)attributes[ AttributeArrayWidth * index + LocalNameIndex];
  }

  xmlChar* getValue( int index ) const{
      return (xmlChar*)std::string(attributes[ AttributeArrayWidth * index + ValueIndex],attributes[ AttributeArrayWidth * index + EndIndex]).c_str(); 
  }

  int getLength() const{
    return nb_attributes;
  }

private:
  int nb_attributes;
  const xmlChar ** attributes;
};

(xmlChar is defined as typedef unsigned char xmlChar)

Then, whenever I need to store an attribute's value, I clone it with this static method (I also tried libxml2's xmlStrdup, with the same result):

xmlChar* cloneXMLString(const xmlChar* const source) {
    xmlChar* result;
    int len=0;
    std::cout<<"source"<<std::endl;
    while (source[len] != '\0'){
        std::cout<<(void*)&source[len] << ": " << source[len] <<std::endl;
        len++;
    }
    std::cout<<std::endl;
    std::cout<<"result, "<<std::endl;
    result = new xmlChar[len+1];
    for (int i=0; i<len; i++){
        result[i] = source[i];
        std::cout<<(void *)&source[i] << ": "<< source[i] << std::endl;
    }
    std::cout<<std::endl;
    result[len] = '\0';
    return result;
}

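It works 99% of the time, but sometimes the result ends up bearing no resemblance to the source. Here is a sample output (the input is abcdef, terminated with \0):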

source
0x7fdb7402cde8: a
0x7fdb7402cde9: b
0x7fdb7402cdea: c
0x7fdb7402cdeb: d
0x7fdb7402cdec: e
0x7fdb7402cded: f


result, 
0x7fdb7402cde8: !
0x7fdb7402cde9: 
0x7fdb7402cdea: 
0x7fdb7402cdeb: 
0x7fdb7402cdec: x
0x7fdb7402cded:

I call it like this:

xmlChar* value = cloneXMLString(attributes.getValue(index));

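So although the source address did not change, its contents did. Parsing of the XML file continues without problems, and the next value cloned afterwards is valid again.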

As long as the XML file is unchanged, the error always occurs at the same element and attribute. If I change something in the XML, for example from:

<somenodes a="arg1" b="arg2">
  <node c="abc" d="def" />
  <node c="ghi" d="jkl" />
</somenodes>

to

<somenodes a="arg1" b="arg2">
  <node c="ghi" d="jkl" />
  <node c="abc" d="def" />
</somenodes>

the error shows up somewhere else, or it disappears and parsing works fine. What could cause this?

Edit:

My startElement method:

void MyParser::startElement( void * ctx,
        const xmlChar * localName,
        const xmlChar * prefix,
        const xmlChar * URI,
        int nb_namespaces,
        const xmlChar ** namespaces,
        int nb_attributes,
        int nb_defaulted,
        const xmlChar ** attrs ){

    XMLElementAttributes attributes ( nb_attributes, attrs );

    switch ( state ) {
    case Somestate:
       if ( xmlStrcmp( localName, StrN("SomeName").xmlCharForm() ) == 0) {
         someVar = new SomeObject(attributes);
       } 
    break;

    ...

    }
}

StrN creates an xmlChar string from a char*. someVar is a static field of the MyParser class (startElement is static as well). In SomeObject's constructor I try to read the attribute values like this:

class SomeObject {
    public:
    SomeObject( XMLElementAttributes &attributes){
        for (int i=0; i< attributes.getLength(); i++) {
            xmlChar* name = attributes.getLocalName(i);
            if ( xmlStrcmp( name, StrN("somename").xmlCharForm()) == 0 ) {
                somename = cloneXMLString(attributes.getValue(i));
            }
            ...
        }
    }
};

score 0 · Accepted Answer

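Clearly, source does not point to valid memory. Either the memory has already been freed, or it points to stack memory that was declared in a function that has since returned.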

Memory like that can be overwritten in unpredictable ways, which is what you are seeing here.

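For a more detailed answer I would need to see more context, in particular how you call cloneXMLString and where the memory passed to that function comes from.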
