c++ - Sorting a class in C++ faster than qsort

Question

I have a class

class Zaposlenik { 
private:
    string prezime; 
    string funkcija; 
    double placa; 
public:
    bool operator==(const string& prezime) const; 
    bool operator<(const string &prezime) const; 
    bool operator<(const Zaposlenik &other) const;

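I use the operators that take a string for the binary search, and the operator< that takes a Zaposlenik for sorting.

I can't change the class header; I can only write code in the .cpp.

I also have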

class Firma { 
private: 
vector<Zaposlenik> zaposlenici; 
public: 
void sort();

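I can't change that class either; I have to write the .cpp for it. I upload the two .cpp files to an automatic grading server, which inserts 500 000 Zaposlenik objects into the vector zaposlenici and then performs 2 000 000 searches.

I used qsort and bsearch, and it was too slow. The run must not take more than 3 s once uploaded.

I have written the overloaded operators and I believe they are fine, but apparently the sorting can be done faster than with qsort.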

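The vector is sorted by the string prezime, and the names range from "aaaa" to "ZZZZ", so they are 4-letter combinations of uppercase and lowercase letters.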

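string funkcija; and double placa; mean nothing for the sorting.

Can anyone tell me which sort would be faster than qsort? Keep in mind that I have no control over main, and I can't count the members as they are created.

PS: There are other functions in the class, but they are irrelevant to this part. There is also a function for bsearch, but I believe it is fast enough.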

score 9 · Accepted Answer

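Three things:

Use std::sort instead of std::qsort. It is faster because it can inline the calls to the comparison operator (if you define the operator in the header or enable link-time optimization).
Overload swap for your class so that objects can be swapped efficiently instead of being copied through a temporary variable. This, however, requires changing the header (because you need access to the private variables).
Since your sort strings have a fixed length of 4, a different sorting algorithm would be beneficial. The classic choice, and fairly easy to implement, is radix sort. Judging from some of your comments, your professor seems to want you to implement exactly that.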
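As a rough illustration of the radix sort idea, here is a minimal sketch of an LSD radix sort on fixed-length 4-character keys. The Employee struct and the function name radixSortBySurname are hypothetical stand-ins (the real Zaposlenik keeps prezime private, so the key would have to be reached some other way), and the code assumes every key has at least 4 characters.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Zaposlenik; the key is public only for this sketch.
struct Employee {
    std::string surname;  // assumed to be at least 4 characters long
    double salary;
};

// LSD radix sort on the first 4 bytes of surname: one stable counting pass
// per character position, from the last position to the first.
void radixSortBySurname(std::vector<Employee>& v) {
    std::vector<Employee> buffer(v.size());
    for (std::size_t pos = 4; pos-- > 0;) {
        std::array<std::size_t, 257> start{};  // start[c + 1] counts byte value c
        for (const Employee& e : v)
            ++start[static_cast<unsigned char>(e.surname[pos]) + 1];
        for (std::size_t c = 1; c < start.size(); ++c)
            start[c] += start[c - 1];          // prefix sums: first slot per byte value
        for (Employee& e : v) {
            const std::size_t slot = start[static_cast<unsigned char>(e.surname[pos])]++;
            buffer[slot] = std::move(e);       // moving avoids deep string copies
        }
        v.swap(buffer);                        // the sorted pass becomes the new input
    }
}
```

Each pass is linear in the number of elements, so the whole sort does four linear passes regardless of the input order. Bytes are compared by their raw value, which matches how std::string::operator< orders ASCII letters (uppercase before lowercase).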

score 3 · Accepted Answer

The automatic grading server inserts 500 000 Zaposlenik objects into the vector zaposlenici and then performs 2 000 000 searches. I used qsort and bsearch, and it was too slow.

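Just to clarify: you aren't calling qsort before every call to bsearch, are you? Binary search is only fast when the list is already sorted. If you sort the list before every search, you will get terrible performance.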
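To make the "sort once, search many times" pattern concrete, here is a small sketch. Employee, sortOnce and findBySurname are hypothetical names used only for illustration; in the real code the comparisons would go through the operators that Zaposlenik already declares.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Employee {           // hypothetical stand-in for Zaposlenik
    std::string surname;
    double salary;
};

// Call once, after all records have been loaded: O(n log n).
void sortOnce(std::vector<Employee>& v) {
    std::sort(v.begin(), v.end(),
              [](const Employee& a, const Employee& b) { return a.surname < b.surname; });
}

// Call per query: O(log n). Requires v to be already sorted by surname.
const Employee* findBySurname(const std::vector<Employee>& v, const std::string& key) {
    auto it = std::lower_bound(v.begin(), v.end(), key,
                               [](const Employee& e, const std::string& k) { return e.surname < k; });
    return (it != v.end() && it->surname == key) ? &*it : nullptr;
}
```

With 500 000 elements, each lookup needs about 19 comparisons, so 2 000 000 searches stay cheap as long as the sort is not repeated.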

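Given the constraints you outlined (all strings are four characters long), I just tested std::sort against a custom bucket sort, and on one million elements the bucket sort was 8 times faster. Hint: each letter fits in 6 bits, so a four-character string can be encoded in 4 * 6 = 24 bits, which means you need 2^24 = 16,777,216 buckets for counting.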
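The hint can be sketched roughly as follows. This is not the code that was benchmarked above, only one plausible way to do it: letterCode, encodeKey and bucketSortBySurname are hypothetical names, Employee is a stand-in for Zaposlenik, and the keys are assumed to be exactly four letters from A-Z and a-z. The counter array takes about 64 MiB.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct Employee {           // hypothetical stand-in for Zaposlenik
    std::string surname;    // assumed: exactly 4 letters from A-Z / a-z
    double salary;
};

// Map a letter to 6 bits, keeping ASCII order: 'A'..'Z' -> 0..25, 'a'..'z' -> 26..51.
inline std::uint32_t letterCode(char c) {
    return (c >= 'a') ? static_cast<std::uint32_t>(c - 'a') + 26u
                      : static_cast<std::uint32_t>(c - 'A');
}

// Pack four letters into a 24-bit key; the first letter is the most significant.
inline std::uint32_t encodeKey(const std::string& s) {
    std::uint32_t key = 0;
    for (std::size_t i = 0; i < 4; ++i)
        key = (key << 6) | letterCode(s[i]);
    return key;
}

// Counting sort: count keys, prefix-sum into start offsets, then scatter.
void bucketSortBySurname(std::vector<Employee>& v) {
    constexpr std::size_t kBuckets = std::size_t{1} << 24;  // 16,777,216
    std::vector<std::uint32_t> start(kBuckets + 1, 0);
    for (const Employee& e : v)
        ++start[encodeKey(e.surname) + 1];
    for (std::size_t k = 1; k <= kBuckets; ++k)
        start[k] += start[k - 1];              // start[k] = number of keys below k
    std::vector<Employee> out(v.size());
    for (Employee& e : v) {
        const std::uint32_t key = encodeKey(e.surname);  // read the key before moving
        out[start[key]++] = std::move(e);
    }
    v.swap(out);
}
```

Because the letter codes preserve ASCII order, the resulting order agrees with std::string comparison, and equal keys keep their original relative order.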

score 2 · Accepted Answer

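There are several things to consider. First, you cannot use qsort on a std::vector<Zaposlenik>. qsort uses memcpy to copy objects when swapping, and memcpy only works for objects that are trivially copyable. Using qsort in this case may well be faster, precisely because it does not copy the objects correctly. You must use std::sort; nothing else will work.

Having done that: the speed of std::sort (or at least the part of it you can influence) depends on two things: the speed of the comparison and the speed of swapping. You haven't shown what bool Zaposlenik::operator<( Zaposlenik const& other ) const; does, so we can only guess. If it does more than just return prezime < other.prezime, then you should write a separate comparison function and call std::sort with it. The other aspect is swapping: std::sort ultimately uses std::swap, whose default implementation will be something like: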

template <typename T>
void
std::swap( T& lhs, T& rhs )
{
    T tmp( lhs );   // first copy
    lhs = rhs;      // second copy
    rhs = tmp;      // third copy
}

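For many classes, this involves a lot of extra copying; for std::string, for example, it makes three deep copies of the string, which may involve dynamically allocating and freeing memory. std::string does, however, have a member function swap; in many cases it can implement the swap just by exchanging pointers rather than by making deep copies. This can result in a significant speed-up. Your class Zaposlenik does nothing to optimize std::swap, however, so you get the deep copies. You should provide a member function swap: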

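void Zaposlenik::swap( Zaposlenik& other )
{
    // Inside the class, an unqualified swap would name this member function,
    // so call the std::string member swap and std::swap explicitly.
    prezime.swap( other.prezime );
    funkcija.swap( other.funkcija );
    std::swap( placa, other.placa );
}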

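This function will use the system-optimized swap of std::string. To ensure that std::sort uses it, you should also provide an overloaded free function swap (in the same namespace as Zaposlenik) that calls your member function: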

void
swap( Zaposlenik& lhs, Zaposlenik& rhs )
{
    lhs.swap( rhs );
}

The reason for this: std::sort calls a free function swap.
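How a free swap in the class's namespace gets picked up can be shown with a small self-contained sketch. The namespace demo, the class Record, its swapCalls counter and the helper genericSwap are all hypothetical and exist only to make the dispatch visible; whether a given std::sort implementation swaps or moves elements is up to the library.

```cpp
#include <string>
#include <utility>

namespace demo {            // hypothetical namespace and class, for illustration only
class Record {
public:
    Record(std::string name, double pay) : name_(std::move(name)), pay_(pay) {}
    const std::string& name() const { return name_; }
    void swap(Record& other) noexcept {
        name_.swap(other.name_);       // exchanges string internals, no deep copy
        std::swap(pay_, other.pay_);
        ++swapCalls;
    }
    static int swapCalls;              // counts calls, only to make the dispatch visible
private:
    std::string name_;
    double pay_;
};
int Record::swapCalls = 0;

// Free function in the same namespace as the class, found by argument-dependent lookup.
void swap(Record& lhs, Record& rhs) noexcept { lhs.swap(rhs); }
}  // namespace demo

// Generic code typically swaps like this: std::swap is the fallback,
// and argument-dependent lookup prefers demo::swap for demo::Record.
template <typename T>
void genericSwap(T& a, T& b) {
    using std::swap;
    swap(a, b);
}
```

Calling genericSwap on two demo::Record objects goes through the member swap, so the strings exchange their buffers instead of being copied.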

score 1 · Accepted Answer

The issue here really is your restriction of the class header. I suspect the bottleneck is either the swapping operation while sorting or the lexical string comparison (or possibly both). If you cannot change the class definition at all, it's going to be tricky to improve that, since you would have to add a lot of helper code in your implementation and everything gets more complicated than it has to be.

Anyhow, here is the approach I would suggest: since you are sorting based on strings, implement a specialised version of a Trie yourself. You cannot beat the performance of a Trie when sorting sequences lexicographically. You can implement this data structure entirely in your .cpp file and instantiate it in your Firma::sort method.

As you seem to be focussing on speed, you are probably willing to make a trade-off with regard to memory consumption. So, you implement each node in your Trie as either a std::vector<std::shared_ptr<Trie>> which you initialise to a length of 256 (with all slots initialised to nullptr) or a std::array<std::shared_ptr<Trie>,256>. You now basically insert each of your strings into the data structure and then read them all out again. This approach is linear in the total size of all strings combined (and thus optimal).

Side note: the 256-slot table at each node incurs a constant factor of 256 when traversing the Trie (i.e. when writing the Firma::zaposlenici member). If you are dealing with ASCII, you can reduce the table size to 128 or split individual bytes into half-bytes, thereby incurring an overhead of 2*16 instead of 256.

Edit: If you know that you will only encounter characters from a..z and A..Z then you have a base alphabet size of 2*26 = 52 instead of 256. So your lookup table in each node of the Trie only has to be of size 52 (that is, each node can have at most 52 child nodes).
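A minimal sketch of such a Trie-based sort with 52 slots per node might look like the following. The names TrieNode, slotOf, collect and trieSortOrder are hypothetical, the sketch uses std::unique_ptr rather than std::shared_ptr for simplicity, and it assumes the keys contain only ASCII letters. It returns the sorted order as indices, which could then be used to rearrange the actual objects.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// One trie node with 52 child slots, ordered 'A'..'Z' then 'a'..'z',
// so that an in-order walk matches std::string's ASCII comparison.
struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, 52> child;
    std::vector<std::size_t> items;  // indices of the inputs that end at this node
};

inline std::size_t slotOf(char c) {  // assumes c is an ASCII letter
    return (c >= 'a') ? static_cast<std::size_t>(c - 'a') + 26
                      : static_cast<std::size_t>(c - 'A');
}

// Depth-first walk: keys ending here come first, then the children in slot order.
void collect(const TrieNode& node, std::vector<std::size_t>& order) {
    order.insert(order.end(), node.items.begin(), node.items.end());
    for (const auto& c : node.child)
        if (c) collect(*c, order);
}

// Returns the indices of keys in ascending lexicographic order.
std::vector<std::size_t> trieSortOrder(const std::vector<std::string>& keys) {
    TrieNode root;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        TrieNode* node = &root;
        for (char c : keys[i]) {
            std::unique_ptr<TrieNode>& next = node->child[slotOf(c)];
            if (!next) next = std::make_unique<TrieNode>();
            node = next.get();
        }
        node->items.push_back(i);
    }
    std::vector<std::size_t> order;
    order.reserve(keys.size());
    collect(root, order);
    return order;
}
```

Insertion touches one node per character, so building the Trie is linear in the total key length; the walk pays the 52-slot scan per node mentioned in the side note.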
