
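I am working on a subgraph matching problem (matching chemical functional groups within molecules). The original code was written by another student (under Visual C++, with no MS-specific libraries), and it runs fine on Windows. I then added new functions to the program without changing the subgraph matching algorithm, and the new program compiles fine under gcc 4.2 on Mac OS X. However, I am running into strange problems at runtime!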

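The relevant objects and their members:

  1. Atom: contains an ID, an Element, a list of Bonds (a vector of pointers to Bond objects), and search_mark (bool), plus functions to get these variables and to set search_mark to true or false.

  2. Bond: contains an array of 2 pointers to atoms A and B, and a function that returns Atom* A when called with Atom* B as its argument, and vice versa.

  3. Molecule: contains a vector of pointers to Atoms, and functions to get an Atom* either by atom ID or by position within the vector.

  4. A subclass of Atom: HammettAtom. Its one additional member is an Atom pointer to the related molecule atom.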
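For readers who want something concrete, the object model described above might look roughly like the following. This is only a sketch under assumptions: the real class definitions are not shown in this post, so the constructors, `addBond`, `getBonds`, `getOther`, `getRelated`, `getAtomAt`, `getAtomByID` and the free function `attachedAtoms` are hypothetical names, and the code is written in C++98 style because the target compiler is gcc 4.2.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Bond;  // forward declaration so Atom can hold Bond pointers

// Hypothetical stand-in for the Atom class described above.
class Atom {
public:
    Atom(const std::string& id, const std::string& ele)
        : id_(id), ele_(ele), search_mark_(false) {}
    virtual ~Atom() {}

    const std::string& getID() const { return id_; }
    const std::string& getEle() const { return ele_; }
    bool isMarked() const { return search_mark_; }
    void mark() { search_mark_ = true; }
    void demark() { search_mark_ = false; }

    void addBond(Bond* b) { bonds_.push_back(b); }
    const std::vector<Bond*>& getBonds() const { return bonds_; }

protected:
    std::string id_;
    std::string ele_;
    std::vector<Bond*> bonds_;  // non-owning pointers
    bool search_mark_;
};

// A bond joins exactly two atoms; given one end it returns the other.
class Bond {
public:
    Bond(Atom* a, Atom* b) { ends_[0] = a; ends_[1] = b; }
    Atom* getOther(Atom* from) const {
        if (from == ends_[0]) return ends_[1];
        if (from == ends_[1]) return ends_[0];
        return NULL;  // 'from' is not part of this bond
    }
private:
    Atom* ends_[2];
};

// Hammett-group atom: remembers which molecule atom it was matched to.
class HammettAtom : public Atom {
public:
    HammettAtom(const std::string& id, const std::string& ele)
        : Atom(id, ele), related_mol_atom_(NULL) {}
    using Atom::mark;  // keep the no-argument mark() visible
    void mark(Atom* assigned) {
        search_mark_ = true;
        related_mol_atom_ = assigned;
    }
    Atom* getRelated() const { return related_mol_atom_; }
private:
    Atom* related_mol_atom_;
};

// Molecule: looks atoms up by position or by ID (non-owning in this sketch).
class Molecule {
public:
    void addAtom(Atom* a) { atoms_.push_back(a); }
    Atom* getAtomAt(std::size_t pos) const {
        return pos < atoms_.size() ? atoms_[pos] : NULL;
    }
    Atom* getAtomByID(const std::string& id) const {
        for (std::size_t i = 0; i < atoms_.size(); ++i)
            if (atoms_[i]->getID() == id) return atoms_[i];
        return NULL;  // no atom with that ID
    }
private:
    std::vector<Atom*> atoms_;
};

// Collects the atoms bonded to 'a' by walking its bond list.
std::vector<Atom*> attachedAtoms(Atom* a) {
    std::vector<Atom*> out;
    const std::vector<Bond*>& bonds = a->getBonds();
    for (std::size_t i = 0; i < bonds.size(); ++i)
        out.push_back(bonds[i]->getOther(a));
    return out;
}
```

In this sketch none of the classes own the objects they point to, so whoever creates the atoms and bonds is also responsible for destroying them.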

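This is the algorithm of the recursive function: each atom in array A is compared with an atom in array B (the Hammett group, usually about 10-20 atoms in size). If the elements are the same, get the list of connected atoms for each one and repeat. Tested atoms are marked along the way, so at some point there will be no unmarked connected atoms left.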

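Here is the code (unchanged; I only added the cout bits for testing). When the function is first called, the first vector is a single atom from the test molecule and the second vector is the second atom of the Hammett group molecule. (The first atom in the Hammett group has the ID "X" and can be anything.)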

bool HammettCheck::checkSubproblem(vector<Atom*> bonded_atoms, vector<Atom*> my_list) 
{
unsigned int truth=0;
vector<Atom*> unmarked_bonded;
vector<Atom*> unmarked_list;
cout << "\n size of Hammett array: " <<my_list.size()<< " size of mol array: "<< bonded_atoms.size() << endl; //for testing
//If number of connected atoms is different, return false.
if( bonded_atoms.size() != my_list.size() ){
    return false;
}

//Create new lists.
for(unsigned int i=0; i < bonded_atoms.size() ; i++){

    //Create list of unmarked connected atoms in molecule.
    if( !bonded_atoms[i]->isMarked() ){
        unmarked_bonded.push_back(bonded_atoms[i]);
    }

    //Create list of unmarked connected atoms in hammett.
    if( !my_list[i]->isMarked() ){
        unmarked_list.push_back( my_list[i] );
    }
}
cout << "size of unmarked Hammett array: " << unmarked_list.size() << " size of unmarked mol array: "<< unmarked_bonded.size() <<endl; //for testing
//If number of unmarked connected atoms is different, return false.
if( unmarked_bonded.size() != unmarked_list.size() ){
    return false;
}


//Check each unmarked atom connected in the molecule against possible atoms it could be in the hammett group.
for(unsigned int i=0; i < unmarked_bonded.size(); i++){
  cout<< "atom in um_mol array considered ID: " << unmarked_bonded[i]->getID() << " Ele: " << unmarked_bonded[i]->getEle()<< endl;
    /*Unmarked hammett assigned in reverse order so that the undefined "X" atom is only 
      assigned if a connected atom can not possibly be any other atom.*/
    for(int j=(unmarked_list.size()-1); j > -1; j--){
      cout << "atom in um_h_array considered ID: " << unmarked_list[j]->getID() << endl;
        //If hammett atom has already been assigned to a connected atom, it cannot be assigned to another
        if(!unmarked_list[j]->isMarked()){
          cout << unmarked_list[j]->getID() << "is unmarked" <<endl;
            /*If connected atom could only be hammett group's connection 
              to the rest of the molecule, assign it as such.*/
            if( !strcmp(unmarked_list[j]->getEle().c_str(), "X") ){
                unmarked_bonded[i]->mark();
                unmarked_list[j]->mark(unmarked_bonded[i]);
                truth++;
                cout<< "mol atom ID "<< unmarked_bonded[i]->getID() <<" marked as X, current truth: "<< truth << endl;
                cout << unmarked_list[j]->getID() << "is now marked(1)/unmarked(0) " << unmarked_list[j]->isMarked() << " and break loop "<<endl;
                break;
            }

            /*If connected atom is the same element as a possible hammett atom,
              check that atoms connections by running them through the subproblem.*/
            if( !strcmp(unmarked_bonded[i]->getEle().c_str(), unmarked_list[j]->getEle().c_str()) ){
                unmarked_bonded[i]->mark();
                unmarked_list[j]->mark(unmarked_bonded[i]);
                cout<<"found same ele between mol_id "<< unmarked_bonded[i]->getID() <<" and ham_id " << unmarked_list[j]->getID() <<endl;
                vector<Atom*> new_bonded = getAttachedAtoms( unmarked_bonded[i] );
                vector<Atom*> new_list = getAttachedAtoms( unmarked_list[j] );
                if( checkSubproblem( new_bonded, new_list ) ){
                  cout<<"found same atom"<<endl;
                    truth++;
                    break;

                /*If only the elements are the same do not assign 
                  the hammett atom to this connected atom.*/
                }else{
                    unmarked_bonded[i]->demark();
                    unmarked_list[j]->demark();
                }
            }
        }
    }
}

//Return true if all connected atoms can be assigned atoms of the hammett group.
if( truth == unmarked_bonded.size() ){
    return true;
}else{
    return false;
}
}

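I ran the compiled program on a 29-atom test molecule and compared it against two Hammett groups. The molecule should contain group 2 but not group 1. However, the function returned true whenever I started from 2 atoms with the same element. Below is a sample of the output (the molecule does not actually contain a Hammett group at that atom):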

 currently at molecule atom ID 1

 size of Hammett array: 1 size of mol array: 1
 size of unmarked H array: 1 size of unmarked mol array: 1
 atom in um_mol array considered ID: 1 Ele: N
 atom in um_h_array considered ID: N1
 N1is unmarked
 found same ele between mol_id 1 and ham_id N1

 size of Hammett array: 3 size of mol array: 3
 size of unmarked H array: 3 size of unmarked mol array: 3
 atom in um_mol array considered ID: 2 Ele: H
 atom in um_h_array considered ID: O2
 O2is unmarked
 atom in um_h_array considered ID: O1
 O1is unmarked
 atom in um_h_array considered ID: X
 X is unmarked
 mol atom ID 2 marked as X, current truth: 1
 X is now marked(1)/unmarked(0) 128 and break loop 
 atom in um_mol array considered ID: 8 Ele: C
 atom in um_h_array considered ID: O2
 O2is unmarked
 atom in um_h_array considered ID: O1
 O1is unmarked
 atom in um_h_array considered ID: X
 X is unmarked
 mol atom ID 8 marked as X, current truth: 2
 X is now marked(1)/unmarked(0) 160 and break loop 
 atom in um_mol array considered ID: 17 Ele: C
 atom in um_h_array considered ID: O2
 O2is unmarked
 atom in um_h_array considered ID: O1
 O1is unmarked
 atom in um_h_array considered ID: X
 X is unmarked
 mol atom ID 17 marked as X, current truth: 3
 X is now marked(1)/unmarked(0) 128 and break loop 
 found same atom
 Hammet group 2 checkSubproblem true
 Hammett added to atom 1

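Sorry if that was long. The problem is that after I mark the "X" atom (the first atom in the Hammett molecule) and then read its search_mark boolean, the value is greater than 1. As a result, "X" is wrongly "marked" several times, and the 'truth' counter keeps going up until the condition truth == unmarked_bonded.size() is met.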

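I am not sure what the actual problem is. The value 128 suggests some kind of memory/pointer mix-up, but I am not sure how to track it down. I am not even sure whether it has anything to do with the recursive function!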
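A bool that prints as 128 or 160 holds a byte other than 0 or 1, which typically means the flag was never initialized or the object is being read through an invalid pointer. The constructors are not shown in this post, so whether that applies here is only a guess. As a minimal sketch of the first case, assuming a hypothetical class `FlaggedAtom` and helper `flagAsText`, explicitly initializing the flag guarantees that `operator<<` only ever prints 0 or 1:

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for Atom; only the search flag is modelled here.
class FlaggedAtom {
public:
    // Without this initializer the flag's value would be indeterminate.
    FlaggedAtom() : search_mark_(false) {}
    void mark() { search_mark_ = true; }
    void demark() { search_mark_ = false; }
    bool isMarked() const { return search_mark_; }
private:
    bool search_mark_;
};

// Formats the flag the same way the test output does (operator<< on bool).
std::string flagAsText(const FlaggedAtom& a) {
    std::ostringstream out;
    out << a.isMarked();
    return out.str();
}
```

If the flag is already initialized like this in the real code and the odd values persist, the second case (an atom that has been freed or was never validly constructed) becomes the more likely suspect, and a memory checker is the natural next step.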

If anyone could suggest something I could try, I would be grateful. Thanks in advance!

P.S. Here is the code for the Atom class functions.

 string Atom::getID()
{
return id;
}

string Atom::getEle()
{
return ele;
}
void Atom::mark()
{
search_mark = true;
}

void Atom::demark()
{
search_mark = false;
}
void HammettAtom::mark(Atom* assigned)
{
search_mark = true;
related_mol_atom = assigned;
}
bool Atom::isMarked()
{
return search_mark;
}

1 Answer


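Thanks for all the suggestions. After running the program under Valgrind, it became clear that the problem was related to memory leaks. I moved the allocation flagged in one of the "definitely lost" reports, and that seems to have removed the search_mark problem. The program now runs as expected, but it still leaks memory. I have not managed to fix the leaks yet, and I have posted a new question about them here.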
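For the remaining leaks, one common C++98-compatible approach is to give a single object clear ownership of the heap-allocated atoms and delete them in its destructor. The sketch below is not the actual program: `CountedAtom`, `OwningMolecule` and `liveAtomsAfterScope` are hypothetical names, and the instance counter exists only to make a leak visible without a tool.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical atom type that counts live instances so leaks are visible.
struct CountedAtom {
    static int live;
    CountedAtom() { ++live; }
    ~CountedAtom() { --live; }
};
int CountedAtom::live = 0;

// Hypothetical owner: every atom added here is deleted exactly once.
class OwningMolecule {
public:
    OwningMolecule() {}
    ~OwningMolecule() {
        for (std::size_t i = 0; i < atoms_.size(); ++i)
            delete atoms_[i];
    }
    // Takes ownership of 'a' and returns it for convenience.
    CountedAtom* add(CountedAtom* a) {
        atoms_.push_back(a);
        return a;
    }
    std::size_t size() const { return atoms_.size(); }
private:
    // Declared but not defined: copying would cause double deletes.
    OwningMolecule(const OwningMolecule&);
    OwningMolecule& operator=(const OwningMolecule&);
    std::vector<CountedAtom*> atoms_;
};

// Builds and destroys a molecule; returns the number of atoms still alive.
int liveAtomsAfterScope() {
    {
        OwningMolecule mol;
        mol.add(new CountedAtom());
        mol.add(new CountedAtom());
        assert(CountedAtom::live == 2);
    }  // mol's destructor runs here and frees both atoms
    return CountedAtom::live;
}
```

With ownership concentrated in one place, any pointer held elsewhere (for example in a bond list or a temporary vector passed to the recursive check) is non-owning and must not outlive the molecule.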

answered 2012-05-11T14:30:18.293