c++ - 高性能堆排序

Question

我有一个大小超过 500 万的向量，每次我想从向量中取出一个键最小的元素，并对这个元素做一些处理。然而，随着这个特定元素的处理，向量中的所有剩余元素也将受到影响，以便它们的键更新。所以下次如果我想从向量中取出键最小的元素，我必须再次对向量进行排序。问题是从向量中拾取最小元素的次数会高达50万，所以程序运行的很慢。为了您更清楚地理解，我可以编写以下代码来说明：

void function(vector<MyObj*>& A)
{ //A.size() is near 5 million, maybe even more such as 50 million.
    make_heap(A.begin(), A.end(), compare); // compare function is self-defined.
    for (int i=0; i<500000; i++)
    {
        MyObj* smallest_elem = A.front();
        pop_heap(A.begin(), A.end());
        A.pop_back();
        Process_MyObj(smallest_elem); // here all of the elements 
                                      // in A will be affect, causing 
                                      // their keys changed.

        make_heap(A.begin(), A.end()); // Since all elements' keys in A changed,
                                       // so heap sorting A once again is 
                                       // necessary in my viewpoint.
    }
}

有没有办法让代码尽可能高效地运行？欢迎任何想法，而不是算法的有限改进，例如并行或其他任何东西。太感谢了！

score 2 · Accepted Answer

If Process_MyObj is indeed affecting the keys of all the elements in A, I don't think there is much you can do. If it only modified some of the keys, you could write code to update individual elements in the heap.

As your code is now I don't see what you gain from building a heap. I would just do a linear scan to find the minimal element, swap it with the last one, and then pop the last element.

score 0 · Accepted Answer

有多少时间在Process_MyObj，有多少在堆操作中——50 / 50%，80 / 20%？
这很重要，因为您想平衡两者。考虑以下一般设置：

Make a Todo list
Loop:
    work on items ...
    update the Todo list

太多时间更新列表意味着没有足够的时间做实际工作。所以首先测量比率 Process / Heap time。
一种便宜的方法是进行第二次运行 Process_MyObj并compare完成两次，例如

 P + H = 1.0 sec
2P + H = 1.7 sec
=> P = .7, H = .3: P / H = 70 % / 30 %.

make_heap以线性时间运行——查看如何在制作最多 3n 次比较的同时实现stdmake-heap-be-implemented-因此加速将会很困难。如果值是常量，则 64 位 <32 value, 32 index> 的堆比指针的缓存效率更高。

cstheory.stack 上的 whats-new-in-purely-functional-data-structures-since- okasaki 列出了数十篇论文，主要是理论性的，但一两篇可能与您的问题相关。

真正的加速几乎总是针对特定问题的，而不是普遍的。你能告诉我们更多关于真正的问题吗？

补充：如果大多数 pop 都很小，并且推送很大，请尝试在大排序列表前面放置一个小的缓存堆。伪代码：

push:
    push( cacheheap )
pop:
    return min( cacheheap, bigsortedlist )

如果 cacheheap停留在真正的 cpu 缓存中，这可能会很有效；ymmv。
（您也许可以作弊，留下bigsortedlist不准确的信息，而不是每次都对其进行排序。）

score 0 · Accepted Answer

您可以尝试对向量进行排序并按顺序选择元素，而不是使用堆。

它不会提高 big-o 复杂度，但可能会提高常数因子。

c++ - 高性能堆排序

3 回答 3

Related

Reference