python - 为什么 `heapq.heapify` 实现有效？

Question

我试图了解如何将列表转换为最小堆。我做了我自己的简单递归实现，这似乎可行，但我很想了解它是如何heapq.heapify工作的。将数组表示为堆，策略是确保在所有不是叶子的索引处都满足堆不变量。这证明了实施的第一部分：

def heapify(x):
    for i in reversed(range(n//2)):
        _siftup(x, i)

所以我们希望这_siftup(x, i)应该确保在 index 处满足堆不变量i。

def _siftup(heap, pos):
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown(heap, startpos, pos)

def _siftdown(heap, startpos, pos):
    newitem = heap[pos]
    # Follow the path to the root, moving parents down until finding a place
    # newitem fits.
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if newitem < parent:
            heap[pos] = parent
            pos = parentpos
            continue
        break
    heap[pos] = newitem

这似乎首先向上移动我们开始的任何节点的最小子节点，直到节点的值是叶子（_siftup）。然后它向下移动这个叶子的父节点以找到叶子在堆中的值的正确位置（_siftdown）。为什么这比下面的天真的递归实现更“有效”？图书馆heapq提到了这种情况，但没有解释原因。

def build_min_heap(arr: List[int]):

    n = len(arr)

    for i in reversed(range(n//2)):
        _make_heap_invariant(arr, i)

def _make_heap_invariant(arr: List[int], i: int):

    n = len(arr)
    l_child_idx = 2*i + 1 if 2*i + 1 < n else None
    r_child_idx = 2*i + 2 if 2*i + 2 < n else None
    if l_child_idx and arr[i] >= arr[l_child_idx]:
        arr[i], arr[l_child_idx] = arr[l_child_idx], arr[i]
        _make_heap_invariant(arr, l_child_idx)
    if r_child_idx and arr[i] >= arr[r_child_idx]:
        arr[i], arr[r_child_idx] = arr[r_child_idx], arr[i]
        _make_heap_invariant(arr, r_child_idx)

score 0 · Accepted Answer

大多数计算机科学教科书都应该有一篇关于堆的文章。基本不变量是对于i堆中的任何索引，

heap[i] ≤ heap[2 * i + 1]if2 * i + 1是一个合法的索引，并且
heap[i] ≤ heap[2 * i + 2]if2 * i + 2是一个合法的合法索引。

这两个条件保证heap[0]是堆中最小的元素。

从堆中删除最小元素并向堆中添加元素都可以在 O(log n) 时间内完成。

python - 为什么 `heapq.heapify` 实现有效？

1 回答 1

Related

Reference