algorithm - 这种优化对 Heapsort 有多大价值？

Question

传统Heapsort算法在每个之后将堆的最后一个元素与当前堆的根交换heapification，然后再次继续该过程。但是，我注意到这是不必要的。

在heapification子数组的 a 之后，虽然节点包含最高值（如果它是 a max-heap），但数组中接下来的 2 个元素必须跟随排序数组中的根，或者按照与现在相同的顺序，或者交换它们如果它们是反向排序的。因此，与其仅将根与最后一个元素交换，不如将前 3 个元素（包括节点以及在必要时交换第 2 和第 3 个元素之后）与最后 3 个元素交换，这样 2随后heapifications的（对于第 2 和第 3 个元素）被省去了？

这种方法有什么缺点吗（除了如果需要交换第二个和第三个元素，这应该是微不足道的）？如果不是，如果它确实更好，它会带来多少性能提升？这是伪代码：

function heapify(a,i) {
  #assumes node i has two nodes, both heaps. However node i itself might not be a heap, i.e one of its children may be greater than its value.
  #if so, find the greater of its two children, then swp the parent with that value.
  #this might render that child as no longer a heap, so recurse
 }

function create_heap(a) {
   #all elements following index |_n/2_| are leaf nodes, thus heapify() should be applied to all elements within index 1 to |_n/2_|
   }

function heapsort(a) {
  create_heap(a);  #a is now a max-heap
  #root of the heap, that is a[1] is the maximum, so swap it with a[n].
  #The root now contains an element smaller than its children, both of which are themselves heaps.
  #so apply heapify(a,1). Note: heap length is now n-1, as a[n] is the largest element already found
  #now again the array is a heap. The highest value is in a[1]. Swap it with a[n-1].
  #continue
  }

假设数组是[4,1,3,2,16,9,10,14,8,7]。运行一个之后heapify，就会变成[16,14,10,8,7,9,3,2,4]。现在的heapsort第一次迭代将交换 16 和 4，导致[4,14,10,8,7,9,3,2,16]. 由于这现在已经将新堆的根渲染[4,14,10,8,7,9,3,2]为，嗯，未堆，（14 和 10 都大于 4），运行另一个heapify来生成[14,8,10,4,7,9,3,2]. 现在 14 是根，将其与 2 交换为 yield [2,8,10,4,7,9,3,14]，从而使数组 current [2,8,10,4,7,9,3,14,16]。我们再次发现 2 未堆，因此再次执行 aheapify使堆为[10,8,9,4,7,2,3]。然后将 10 与 3 交换，使数组为[3,8,9,4,7,2,3,10,14,16]. 我的观点是heapification，我们可以从第一个开始，而不是在 16 之前存储 10 和 14，而不是执行 2nd 和 3rd sheapification因为 10 和 14 在 16 之后，所以它们是第二和第三大元素（反之亦然）。因此，在它们之间进行比较之后（如果它们已经排序，则 14 在 10 之前），我将那里的所有内容(16,14,10)与交换(3,2,4)，使数组[3,2,4,8,7,9,16,14,10]. 这将我们减少到与另外两个heapifications 之后的情况类似的情况——[3,8,9,4,7,2,3,10,14,16]最初，与[3,2,4,8,7,9,16,14,10]现在相比。两者现在都需要进一步heapification，但第二种方法让我们直接通过两个元素（14 和 10）之间的比较来到达这个关口。

score 10 · Accepted Answer

堆的第二大元素出现在第二或第三位置，但第三大元素可以出现在更下方，深度 2。（参见http://en.wikipedia.org/wiki/Heap_(data_structure ））。此外，在将前三个元素与后三个元素交换之后，heapify 方法将首先堆化根的第一个子树，然后是根的第二个子树，然后是整棵树。因此，此操作的总成本接近于将顶部元素与最后一个元素交换并调用 heapify 的成本的三倍。所以你这样做不会有任何收获。

algorithm - 这种优化对 Heapsort 有多大价值？

1 回答 1

Related

Reference