python - 与其他算法相比，堆排序算法表现不佳

Question

我有两种heapsort算法。第一个是我写的，第二个是从某个网站上摘下来的。据我说，两者都有相同的逻辑，但第二个比第一个表现得更好。为什么会发生这种情况？我能看到的唯一区别是我的使用递归，而另一个使用迭代。仅此一项就可以成为差异化因素吗？

我的代码：

def heapify(arr,i,n):
    pivot = arr[i]   #the value of the root node
    left,right = (i<<1)+1,(i<<1)+2  #indices of the left and right subtree root nodes
    if right <= n-1:  #if right is within the array, so is left
        if arr[left] <= pivot and arr[right] <= pivot:
            return  #if both are less than the root node, it's already heapified
        maximum = left if arr[left] >= arr[right] else right #else find which child has a higher value
        arr[maximum],arr[i] = arr[i],arr[maximum]  #swap the root node with that child
        return heapify(arr,maximum,n)  #make the changed child the new root and recurse
    else:
      if left <= n-1:  #right is outside the array, so check for left only
        if arr[left] <= pivot:
            return
        arr[i],arr[left] = arr[left], arr[i]  #same logic as above
        return heapify(arr,left,n)
      else:
          return

def heapit(array,n):
    for i in range((len(array)-1)/2,-1,-1):  #all elements after (len(array)-1)/2 are the leaf nodes, so we have to heapify earlier nodes
        heapify(array,i,n)

def heapsort(array):
    n = len(array)
    for i in range(n,0,-1):
        heapit(array,i)  #make the array a heap
        array[0],array[i-1] = array[i-1],array[0]  #swap the root node with the last element

另一个代码：

def HeapSort(A):
     def heapify(A):
        start = (len(A) - 2) / 2
        while start >= 0:
            siftDown(A, start, len(A) - 1)
            start -= 1

     def siftDown(A, start, end):
        root = start
        while root * 2 + 1 <= end:
            child = root * 2 + 1
            if child + 1 <= end and A[child] < A[child + 1]:
                child += 1
            if child <= end and A[root] < A[child]:
                A[root], A[child] = A[child], A[root]
                root = child
            else:
                return

     heapify(A)
     end = len(A) - 1
     while end > 0:
        A[end], A[0] = A[0], A[end]
        siftDown(A, 0, end - 1)
        end -= 1

即使对于大小约为 100,000 的小型阵列，差异也变得很大。我通过将要排序的数组传递给函数来调用任一代码：HeapSort(list)或heapsort(list).

编辑：

我已经用这个heapsort函数替换了这个函数：

def heapsort(array):
     n = len(array)
     heapit(array,n)
     array[n-1],array[0] = array[0],array[n-1]
     for i in range(n-1):
       heapify(array,0,n-1-i)
       array[n-i-2],array[0] = array[0],array[n-i-2]

这提供了相当的性能，但它仍然较慢。对于 100 万美元的阵列，结果几乎是 20 秒：4 秒。还有什么可以做的？

score 5 · Accepted Answer

编辑：我在下面的评论可能解释了一个主要的减速，但最重要的是你的算法不是 heapsort。

在函数内部heapsort，您执行一个循环for i in range(n,0,-1)。那是您输入的大小的n迭代。n在那个循环中，你调用heapit, which loops for i in range((len(array)-1)/2,-1,-1); 这大致是n//2迭代。

n * (n // 2)= Θ( n²)。换句话说，你有一个算法至少需要二次时间，而第二种算法实现了真正的堆排序，它在 O( nlg n) 时间内运行。

/编辑

结合在模块级别定义的函数调用，很可能是递归导致性能下降。Python（至少是 CPython）没有针对递归程序进行优化，而是针对迭代程序进行了优化。对于中的每个递归调用heapify，CPython 必须执行以下七个字节码指令：

  9         158 LOAD_GLOBAL              0 (heapify)
            161 LOAD_FAST                0 (arr)
            164 LOAD_FAST                6 (maximum)
            167 LOAD_FAST                2 (n)
            170 CALL_FUNCTION            3
            173 RETURN_VALUE        
        >>  174 POP_TOP

（使用确定dis）。最后两条指令在递归调用完成后执行，因为Python 不执行尾调用优化。

虽然这看起来并不昂贵，但 aLOAD_GLOBAL必须至少进行一次哈希表查找才能找到，并且、和heapify的引用计数必须递增。当递归调用完成时，引用计数必须再次递减。Python 中的函数调用非常昂贵。heapifyarrmaximumi

正如import this所说，“平面优于嵌套”：尽可能选择迭代而不是递归。

python - 与其他算法相比，堆排序算法表现不佳

1 回答 1

Related

Reference