c++ - Fastest way to find whether an identical struct already exists in a vector

Question

Suppose I have a Person struct:

struct Person {
    char name[100];
    char surname[100];
    unsigned int age;
};

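I want to find the fastest way to search a vector and determine whether it already contains another struct with the same values (same name, same surname, same age).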

Keep in mind that I have millions of these in a single vector.

Thanks

score 2 · Accepted Answer

Here is one possibility:

#include <iostream>
#include <string>
#include <vector>
#include <set>
#include <tuple>

struct Person {
    std::string name;
    std::string surname;
    unsigned int age;

    bool operator<(const Person &x) const
    {
        return std::tie(name, surname, age) < std::tie(x.name, x.surname, x.age);
    }
};


int main()
{
    std::vector<Person> v;

    // ...

    std::set<Person> s;
    for (const auto &x : v)
    {
        auto i = s.insert(x);
        if (!i.second)
        {
            // x is duplicated
        }
    }
}

Regarding your comment, you can sort the vector like this:

std::sort(v.begin(), v.end()); // Operator < is overloaded
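Once the vector is sorted, equal elements sit next to each other, so a single linear pass can detect duplicates. The following is a minimal sketch of that idea, not part of the original answer: it assumes an operator== is added to Person, and the helper name hasDuplicates is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

struct Person {
    std::string name;
    std::string surname;
    unsigned int age;

    bool operator<(const Person &x) const
    {
        return std::tie(name, surname, age) < std::tie(x.name, x.surname, x.age);
    }

    // Assumption: equality is added here so std::adjacent_find can compare neighbours.
    bool operator==(const Person &x) const
    {
        return std::tie(name, surname, age) == std::tie(x.name, x.surname, x.age);
    }
};

// Hypothetical helper: returns true if v holds at least two equal Persons.
// v is taken by value, so the caller's element order is left untouched.
bool hasDuplicates(std::vector<Person> v)
{
    std::sort(v.begin(), v.end());                              // O(n log n)
    return std::adjacent_find(v.begin(), v.end()) != v.end();   // O(n)
}
```

Compared with the std::set loop, this needs no extra node allocations, at the cost of copying (or reordering) the vector.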

score 0 · Accepted Answer

Based on the comments on the question, in particular

No, I mean a set describing that 10 is a duplicate of 2, 12, 54, etc., or that 2 is a duplicate of 10, 12, 54

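it sounds like the data structure you actually want is std::multimap (or std::unordered_multimap if you have C++11 and don't care about ordering). With M. M.'s solution you have to do the bookkeeping yourself: it is fine overall, except that you must maintain an additional container describing the duplicates. std::multimap does that extra bookkeeping for you.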

#include <map>     // or <unordered_map>
#include <string>
#include <tuple>   // std::tie()
#include <utility> // std::make_pair()

struct Person {
  std::string name;
  std::string surname;
  unsigned int age;

  bool operator<(const Person &x) const
  {
    return std::tie(name, surname, age) < std::tie(x.name, x.surname, x.age);
  }
};

extern bool tryReadNextPersonFromFile(Person &, size_t & record_id);

int main()
{
  std::multimap<Person, size_t> persons;
  Person p;
  size_t rid;

  while(tryReadNextPersonFromFile(p, rid)) {
    persons.insert(std::make_pair(p, rid));
  }

  // ...

  p = ...
  size_t howMany = persons.count(p);
  if(0 == howMany) { /* not found ... */ }
  else {
    auto eq_range = persons.equal_range(p);
    for(auto it=eq_range.first; it != eq_range.second; ++it) {
      size_t pRecordID = it->second;
      // ...
    }
  }
}

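For brevity I used a fair amount of C++11 syntax (such as auto), but the idea applies equally to C++03. Since you may not have heard of multimaps before (or at least may not be familiar with the STL interface), be sure to look at some documentation on what you can do with them and how to use them.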
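The std::unordered_multimap variant mentioned above needs a hash function and operator== for Person instead of operator<. A rough sketch under those assumptions follows; PersonHash, PersonIndex and countRecords are hypothetical names, and the hash-mixing scheme is purely illustrative, not tuned.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <tuple>
#include <unordered_map>

struct Person {
    std::string name;
    std::string surname;
    unsigned int age;

    // Unordered containers compare keys with ==, not <.
    bool operator==(const Person &x) const
    {
        return std::tie(name, surname, age) == std::tie(x.name, x.surname, x.age);
    }
};

// Hypothetical hash functor: combines the three field hashes.
// The multiply-and-add mixing is an illustrative assumption.
struct PersonHash {
    std::size_t operator()(const Person &p) const
    {
        std::size_t h = std::hash<std::string>()(p.name);
        h = h * 31 + std::hash<std::string>()(p.surname);
        h = h * 31 + std::hash<unsigned int>()(p.age);
        return h;
    }
};

// Maps each Person to the record IDs under which it was seen.
typedef std::unordered_multimap<Person, std::size_t, PersonHash> PersonIndex;

// Hypothetical helper: how many stored records are equal to p.
std::size_t countRecords(const PersonIndex &index, const Person &p)
{
    return index.count(p);
}
```

Lookups are then average-case constant time rather than logarithmic, though iteration order is unspecified.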
