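I'm writing an autocomplete program that, given a dictionary file and an input file, finds all possible matches for a letter or set of characters. I just finished a version that uses binary search in place of the iterative (linear) search, expecting it to improve the program's overall performance.
The problem is that the binary search version is almost 9 times slower than the iterative one. What gives? I thought switching from iteration to binary search would make it faster.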
[Figure: run times, binary search on the left]
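Here are the important parts of each version; the full code is on my github and can be built and run with cmake.
Binary search function (called once per input while looping over the given inputs):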
bool search(std::vector<std::string>& dict, std::string in,
            std::queue<std::string>& out)
{
    //tick makes sure the loop found at least one thing. if not then break the function
    bool tick = false;
    bool running = true;
    while(running) {
        //for each element in the input vector
        //find all possible word matches and push onto the queue
        int first=0, last= dict.size() -1;
        while(first <= last)
        {
            tick = false;
            int middle = (first+last)/2;
            std::string sub = (dict.at(middle)).substr(0,in.length());
            int comp = in.compare(sub);
            //if comp returns 0(found word matching case)
            if(comp == 0) {
                tick = true;
                out.push(dict.at(middle));
                dict.erase(dict.begin() + middle);
            }
            //if not, take top half
            else if (comp > 0)
                first = middle + 1;
            //else go with the lower half
            else
                last = middle - 1;
        }
        if(tick==false)
            running = false;
    }
    return true;
}
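One likely contributor to the slowdown is that `search` calls `dict.erase` on a `std::vector` for every match, which shifts every later element, and then repeats the binary search from the start. For comparison, here is a minimal sketch of a prefix lookup that leaves the dictionary untouched. It assumes the dictionary is already sorted in the default `std::string` (byte-wise, case-sensitive) order; `prefix_search` is a hypothetical name and not part of the original program.

```cpp
#include <algorithm>
#include <queue>
#include <string>
#include <vector>

// Hypothetical helper (not from the original code): pushes every word in the
// sorted `dict` that starts with `prefix` onto `out`, without modifying `dict`.
// Returns true if at least one match was found.
bool prefix_search(const std::vector<std::string>& dict,
                   const std::string& prefix,
                   std::queue<std::string>& out)
{
    // Binary search for the first word that is not less than the prefix.
    auto it = std::lower_bound(dict.begin(), dict.end(), prefix);

    bool found = false;
    // In a sorted vector all words sharing the prefix are adjacent, so walk
    // forward until the first word that no longer starts with it.
    // compare(0, n, prefix) checks the first n characters without
    // allocating a temporary substring.
    for (; it != dict.end() &&
           it->compare(0, prefix.size(), prefix) == 0; ++it) {
        out.push(*it);
        found = true;
    }
    return found;
}
```

With this shape the cost per input is one binary search plus one step per match, and the same dictionary can be reused for the next input because nothing is erased.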
Iterative search (contained in the main loop):
for(int k = 0; k < input.size(); ++k) {
    int len = (input.at(k)).length();
    // truth false variable to end out while loop
    bool found = false;
    // create an iterator pointing to the first element of the dictionary
    vecIter i = dictionary.begin();
    // this while loop is not complete, a condition needs to be made
    while(!found && i != dictionary.end()) {
        // take a substring the dictionary word(the length is dependent on
        // the input value) and compare
        if( (*i).substr(0,len) == input.at(k) ) {
            // so a word is found! push onto the queue
            matchingCase.push(*i);
        }
        // move iterator to next element of data
        ++i;
    }
}
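Note that `found` is never set to `true` in the loop above, so it always scans the whole dictionary, and each test builds a temporary string through `substr`. As a rough sketch of the same linear scan without those temporaries, assuming nothing about the dictionary's order (`has_prefix` and `linear_matches` are hypothetical names introduced only for illustration):

```cpp
#include <queue>
#include <string>
#include <vector>

// Hypothetical helper: true if `word` begins with `prefix`.
// compare(0, n, prefix) looks at the first n characters in place,
// so no temporary string is allocated.
bool has_prefix(const std::string& word, const std::string& prefix)
{
    return word.size() >= prefix.size() &&
           word.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: collects every dictionary word that starts with
// `prefix`, visiting each word exactly once.
std::queue<std::string> linear_matches(const std::vector<std::string>& dictionary,
                                       const std::string& prefix)
{
    std::queue<std::string> matches;
    for (const std::string& word : dictionary) {
        if (has_prefix(word, prefix)) {
            matches.push(word);
        }
    }
    return matches;
}
```

This keeps the behaviour of the original loop (one pass, all matches pushed in dictionary order) and may serve as a fairer baseline when timing it against a binary search.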
Sample input file:
z
be
int
nor
tes
terr
on