c++ - 搜索多个向量

Question

我现在有 4 个向量，每个向量大约有 45,000 条记录。寻找一种有效的方法来遍历这 4 个向量并输出与用户输入匹配的次数。数据需要在每个向量的相同索引上匹配。

多个for循环？矢量找到？

谢谢！

score 1 · Accepted Answer

如果元素需要在同一位置匹配，似乎 a std::find()orstd::find_if()结合检查该位置的其他向量是一种合理的方法：

std::vector<A> a(...);
std::vector<B> b(...);
std::vector<C> c(...);
std::vector<D> d(...);

std::size_t match(0);
for (auto it = a.begin(), end = a.end(); it != end; ) {
    it = std::find_if(it, end, conditionA));
    if (it != end) {
        if (conditionB[it - a.begin()]
         && conditionC[it - a.begin()]
         && conditionD[it - a.begin()]) {
            ++match;
        }
        ++it;
    }
}

score 1 · Accepted Answer

我从描述中得到的是，你有 4 个向量和大量用户数据，你需要找出它与同一索引处的向量匹配的次数，所以这里是代码（我正在编写一个 c++4.3.2代码）

#include<iostream>
#include<vector>
#include<algorithm>
using namespace std;

int main(){
    vector<typeT>a;  
    vector<typeT>b;
    vector<typeT>c;
    vector<typeT>d;
    vector<typeT>matched;
    /*i am assuming you have initialized a,b,c and d;
    now we are going to do pre-calculation for matching user data and store
    that in vector matched */
    int minsize=min(a.size(),b.size(),c.size(),d.size());
    for(int i=0;i<minsize;i++)
    {
        if(a[i]==b[i]&&b[i]==c[i]&&c[i]==d[i])matched.push_back(a[i]);
    }
    return 0;
}

这是预先计算的部分。现在接下来取决于您使用的数据类型，使用带有一点额外计数的二进制搜索或使用更好的数据结构来存储一对（值，重复），然后应用二进制搜索。时间复杂度为 O(n+n*log(n)+m*log(n)) 其中 nminsize在代码中，m 是用户输入的数量

score 0 · Accepted Answer

std::unordered_multimap您可以使用in为向量创建查找表O(n)。然后，您可以使用unordered_multimap::count()来获取项目在向量中出现的次数，并unordered_multimap::equal_range()获取向量内项目的索引。

std::vector<std::string> a = {"ab", "ba", "ca", "ab", "bc", "ba"};
std::vector<std::string> b = {"fg", "fg", "ba", "eg", "gf", "ge"};
std::vector<std::string> c = {"pq", "qa", "ba", "fg", "de", "gf"};

std::unordered_multimap<std::string,int> lookup_table;

for (int i = 0; i < a.size(); i++) {
    lookup_table.insert(std::make_pair(a[i], i));
    lookup_table.insert(std::make_pair(b[i], i));
    lookup_table.insert(std::make_pair(c[i], i));
}

// count
std::string userinput;
std::cin >> userinput;
int count = lookup_table.count(userinput);
std::cout << userinput << " shows up " << count << " times" << std::endl;

// print all the places where the key shows up
auto range = lookup_table.equal_range(userinput);
for (auto it = range.first; it != range.second; it++) {
    int ind = it->second;
    std::cout << " " << it->second << " " 
        << a[ind] << " " 
        << b[ind] << " " 
        << c[ind] << std::endl;
}

如果您要在查找表中搜索许多项目，这将是最有效的。如果您只需要搜索一次，那么 Dietmar Kühl 的方法将是最有效的。

score 0 · Accepted Answer

老实说，我将有几种方法来维护您的数据库（向量）。

本质上，首先做一个快速排序。

然后经常一致地运行插入排序（对于部分排序的列表，比 QuickSort 更快）

然后只需对这些向量运行二进制搜索。

编辑：

我认为存储它的更好方法是每个条目使用多个向量。拥有一个存储所有值的类向量。（您当前的向量）

class entry { 

public: 
      variable data1;
      variable data2;
      variable data3;
      variable data4;
}

把它变成一个单一的向量。然后使用我上面描述的方法对这些向量进行排序。

您必须首先按什么类型的数据进行排序。然后在对该数据调用二进制搜索之后。

c++ - 搜索多个向量

4 回答 4

编辑：

Related

Reference