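I am working on a subgraph matching problem (matching chemical functional groups inside molecules). The original code was written by another student (under Visual C++, with no MS-specific libraries), and it runs fine on Windows. I then added new functions to the program without changing the subgraph matching algorithm, and the new program compiles fine under gcc 4.2 on Mac OS X. However, I am getting strange problems at runtime!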
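The relevant objects and their members:
Atom: contains an ID, an Element, a list of Bonds (a vector of pointers to Bond objects), and a search_mark (bool), plus functions to get these variables and to set search_mark to true or false.
Bond: contains an array of 2 pointers to atoms A and B, and a function that returns Atom* A when called with Atom* B as its argument, and vice versa.
Molecule: contains a vector of pointers to Atoms, and functions to get an Atom* either by atom ID or by position in the vector.
HammettAtom: a subclass of Atom. Its one extra member is an Atom pointer to the related atom in the molecule.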
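To make the description above concrete, here is a minimal sketch of how these four classes might fit together. It is reconstructed from the description only, not copied from the real headers: the names `other`, `addBond`, `getBonds`, `addAtom`, `atomAt`, `atomByID` and `related` are assumptions, as is the choice to have `HammettAtom::mark(Atom*)` delegate to `Atom::mark()`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Bond;  // forward declaration: Atom only stores pointers to Bond

class Atom {
public:
    // search_mark is explicitly initialised so isMarked() starts as false.
    Atom(const std::string& id_in, const std::string& ele_in)
        : id(id_in), ele(ele_in), search_mark(false) {}
    virtual ~Atom() {}

    const std::string& getID() const { return id; }
    const std::string& getEle() const { return ele; }
    void addBond(Bond* b) { bonds.push_back(b); }          // assumed name
    const std::vector<Bond*>& getBonds() const { return bonds; }

    void mark() { search_mark = true; }
    void demark() { search_mark = false; }
    bool isMarked() const { return search_mark; }

private:
    std::string id;
    std::string ele;
    std::vector<Bond*> bonds;
    bool search_mark;
};

class Bond {
public:
    Bond(Atom* a, Atom* b) { ends[0] = a; ends[1] = b; }

    // Given one end of the bond, return the atom at the other end.
    Atom* other(const Atom* from) const {                  // assumed name
        assert(from == ends[0] || from == ends[1]);
        return from == ends[0] ? ends[1] : ends[0];
    }

private:
    Atom* ends[2];
};

class HammettAtom : public Atom {
public:
    HammettAtom(const std::string& id_in, const std::string& ele_in)
        : Atom(id_in, ele_in), related_mol_atom(NULL) {}

    using Atom::mark;  // keep the no-argument mark() visible next to the overload
    void mark(Atom* assigned) {
        Atom::mark();                // sets the one flag that isMarked() reads
        related_mol_atom = assigned;
    }
    Atom* related() const { return related_mol_atom; }     // assumed name

private:
    Atom* related_mol_atom;
};

class Molecule {
public:
    void addAtom(Atom* a) { atoms.push_back(a); }          // assumed name
    Atom* atomAt(std::size_t pos) const { return atoms.at(pos); }
    // Linear search by ID; returns NULL when no atom has that ID.
    Atom* atomByID(const std::string& wanted) const {
        for (std::size_t i = 0; i < atoms.size(); ++i) {
            if (atoms[i]->getID() == wanted) return atoms[i];
        }
        return NULL;
    }

private:
    std::vector<Atom*> atoms;  // non-owning pointers in this sketch
};
```

In this sketch the flag lives in exactly one place (`Atom::search_mark`), so whichever `mark` overload is called, `isMarked()` reads the same member.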
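This is the algorithm of the recursive function: each atom in array A is compared with an atom in array B (the Hammett group, usually about 10-20 atoms in size). If the elements are the same, the list of atoms connected to each of them is fetched and the process repeats. Tested atoms are marked along the way, so at some point there will be no unmarked connected atoms left.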
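Here is the code (unchanged; I only added the cout bits for testing). When the function is first called, the first vector is a single atom from the test molecule, and the second vector is the second atom of the Hammett group molecule. (The first atom in the Hammett group has the ID "X" and can be anything.)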
bool HammettCheck::checkSubproblem(vector<Atom*> bonded_atoms, vector<Atom*> my_list)
{
    unsigned int truth=0;
    vector<Atom*> unmarked_bonded;
    vector<Atom*> unmarked_list;
    cout << "\n size of Hammett array: " <<my_list.size()<< " size of mol array: "<< bonded_atoms.size() << endl; //for testing
    //If number of connected atoms is different, return false.
    if( bonded_atoms.size() != my_list.size() ){
        return false;
    }
    //Create new lists.
    for(unsigned int i=0; i < bonded_atoms.size() ; i++){
        //Create list of unmarked connected atoms in molecule.
        if( !bonded_atoms[i]->isMarked() ){
            unmarked_bonded.push_back(bonded_atoms[i]);
        }
        //Create list of unmarked connected atoms in hammett.
        if( !my_list[i]->isMarked() ){
            unmarked_list.push_back( my_list[i] );
        }
    }
    cout << "size of unmarked Hammett array: " << unmarked_list.size() << " size of unmarked mol array: "<< unmarked_bonded.size() <<endl; //for testing
    //If number of unmarked connected atoms is different, return false.
    if( unmarked_bonded.size() != unmarked_list.size() ){
        return false;
    }
    //Check each unmarked atom connected in the molecule against possible atoms it could be in the hammett group.
    for(unsigned int i=0; i < unmarked_bonded.size(); i++){
        cout<< "atom in um_mol array considered ID: " << unmarked_bonded[i]->getID() << " Ele: " << unmarked_bonded[i]->getEle()<< endl;
        /*Unmarked hammett assigned in reverse order so that the undefined "X" atom is only
        assigned if a connected atom can not possibly be any other atom.*/
        for(int j=(unmarked_list.size()-1); j > -1; j--){
            cout << "atom in um_h_array considered ID: " << unmarked_list[j]->getID() << endl;
            //If hammett atom has already been assigned to a connected atom, it cannot be assigned to another
            if(!unmarked_list[j]->isMarked()){
                cout << unmarked_list[j]->getID() << "is unmarked" <<endl;
                /*If connected atom could only be hammett group's connection
                to the rest of the molecule, assign it as such.*/
                if( !strcmp(unmarked_list[j]->getEle().c_str(), "X") ){
                    unmarked_bonded[i]->mark();
                    unmarked_list[j]->mark(unmarked_bonded[i]);
                    truth++;
                    cout<< "mol atom ID "<< unmarked_bonded[i]->getID() <<" marked as X, current truth: "<< truth << endl;
                    cout << unmarked_list[j]->getID() << "is now marked(1)/unmarked(0) " << unmarked_list[j]->isMarked() << " and break loop "<<endl;
                    break;
                }
                /*If connected atom is the same element as a possible hammett atom,
                check that atoms connections by running them through the subproblem.*/
                if( !strcmp(unmarked_bonded[i]->getEle().c_str(), unmarked_list[j]->getEle().c_str()) ){
                    unmarked_bonded[i]->mark();
                    unmarked_list[j]->mark(unmarked_bonded[i]);
                    cout<<"found same ele between mol_id "<< unmarked_bonded[i]->getID() <<" and ham_id " << unmarked_list[j]->getID() <<endl;
                    vector<Atom*> new_bonded = getAttachedAtoms( unmarked_bonded[i] );
                    vector<Atom*> new_list = getAttachedAtoms( unmarked_list[j] );
                    if( checkSubproblem( new_bonded, new_list ) ){
                        cout<<"found same atom"<<endl;
                        truth++;
                        break;
                    /*If only the elements are the same do not assign
                    the hammett atom to this connected atom.*/
                    }else{
                        unmarked_bonded[i]->demark();
                        unmarked_list[j]->demark();
                    }
                }
            }
        }
    }
    //Return true if all connected atoms can be assigned atoms of the hammett group.
    if( truth == unmarked_bonded.size() ){
        return true;
    }else{
        return false;
    }
}
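I ran the compiled program on a 29-atom test molecule and compared it against two Hammett groups. The molecule should contain group 2 but not group 1. However, the function returned true whenever I started from 2 atoms with the same element. Below is a sample of the output (the molecule does not actually contain a Hammett group at that atom):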
currently at molecule atom ID 1
size of Hammett array: 1 size of mol array: 1
size of unmarked H array: 1 size of unmarked mol array: 1
atom in um_mol array considered ID: 1 Ele: N
atom in um_h_array considered ID: N1
N1is unmarked
found same ele between mol_id 1 and ham_id N1
size of Hammett array: 3 size of mol array: 3
size of unmarked H array: 3 size of unmarked mol array: 3
atom in um_mol array considered ID: 2 Ele: H
atom in um_h_array considered ID: O2
O2is unmarked
atom in um_h_array considered ID: O1
O1is unmarked
atom in um_h_array considered ID: X
X is unmarked
mol atom ID 2 marked as X, current truth: 1
X is now marked(1)/unmarked(0) 128 and break loop
atom in um_mol array considered ID: 8 Ele: C
atom in um_h_array considered ID: O2
O2is unmarked
atom in um_h_array considered ID: O1
O1is unmarked
atom in um_h_array considered ID: X
X is unmarked
mol atom ID 8 marked as X, current truth: 2
X is now marked(1)/unmarked(0) 160 and break loop
atom in um_mol array considered ID: 17 Ele: C
atom in um_h_array considered ID: O2
O2is unmarked
atom in um_h_array considered ID: O1
O1is unmarked
atom in um_h_array considered ID: X
X is unmarked
mol atom ID 17 marked as X, current truth: 3
X is now marked(1)/unmarked(0) 128 and break loop
found same atom
Hammet group 2 checkSubproblem true
Hammett added to atom 1
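Sorry if that was long. The problem is that after I mark the "X" atom (the first atom in the Hammett molecule) and then read back its search_mark boolean, the value is greater than 1. As a result, X is wrongly "marked" several times over, and the 'truth' counter keeps going up until the condition truth == unmarked_bonded.size() is met.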
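I am not sure what the actual problem is. The value 128 suggests some kind of mixed-up memory/pointer problem, but I am not sure how to track it down. I am not even sure whether it has anything to do with the recursive function!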
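A bool holding a valid value can only print as 0 or 1, so a reading of 128 suggests that the flag being read was never written by mark(), for example because it was never initialised or because a different object or member is being read. One check that could be tried is sketched below, under assumptions: `Flagged` is a hypothetical stand-in for Atom (not the real class), and `describe` and `markChecked` are hypothetical helpers. The idea is to initialise the flag in the constructor and to log the object's address next to the flag, so the output shows whether marking and reading go through the same object.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for Atom; only the flag handling is shown.
class Flagged {
public:
    // Initialising the flag here guarantees isMarked() starts out false.
    explicit Flagged(const std::string& id_in) : id(id_in), search_mark(false) {}

    void mark() { search_mark = true; }
    void demark() { search_mark = false; }
    bool isMarked() const { return search_mark; }
    const std::string& getID() const { return id; }

private:
    std::string id;
    bool search_mark;
};

// Hypothetical helper: formats the ID, the object's address and the flag.
// std::boolalpha prints the flag as "true"/"false" instead of a number.
std::string describe(const Flagged& a) {
    std::ostringstream out;
    out << a.getID() << " @" << static_cast<const void*>(&a)
        << " marked=" << std::boolalpha << a.isMarked();
    return out.str();
}

// Hypothetical helper: marks the atom, checks that the flag reads back
// as set on the same object, and logs the result to stderr.
void markChecked(Flagged& a) {
    a.mark();
    assert(a.isMarked());
    std::cerr << describe(a) << std::endl;
}
```

If the address logged when marking differs from the address logged when reading, the two calls are touching different objects, which would point away from the recursion itself and towards how the atoms are stored or copied.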
If anyone can suggest something I could try, I would be very grateful. Thanks in advance!
PS: Here is the code for the Atom class functions.
string Atom::getID()
{
    return id;
}
string Atom::getEle()
{
    return ele;
}
void Atom::mark()
{
    search_mark = true;
}
void Atom::demark()
{
    search_mark = false;
}
void HammettAtom::mark(Atom* assigned)
{
    search_mark = true;
    related_mol_atom = assigned;
}
bool Atom::isMarked()
{
    return search_mark;
}