我一直在实现自己的堆模块来帮助我理解堆数据结构。我了解它们是如何工作和管理的,但是在执行堆排序时,我的实现比标准 python heapq 模块慢得多(对于大小为 100,000 的列表,heapq 需要 0.6 秒,而我的代码需要 2 秒(最初是 2.6 秒,删掉它)通过从 percDown 中取出 len() 语句并传递长度,减少到 2s,因此它不必每次方法调用自身时都计算 len)。这是我的实现:
def percDown(lst, start, end, node):
#Moves given node down the heap, starting at index start, until the heap property is
#satisfied (all children must be larger than their parent)
iChild = 2 * start + 1
i = start
# if the node has reached the end of the heap (i.e. no children left),
# return its index (we are done)
if iChild > end - 1:
return start
#if the second child exists and is smaller than the first child, use that child index
#for comparing later
if iChild + 1 < end and lst[iChild + 1] < lst[iChild]:
iChild += 1
#if the smallest child is less than the node, it is the new parent
if lst[iChild] < node:
#move the child to the parent position
lst[start] = lst[iChild]
#continue recursively going through the child nodes of the
# new parent node to find where node is meant to go
i = percDown(lst, iChild, end, node)
return i
popMin:弹出最小值(lst[0])并重新排序堆
def popMin(lst):
length = len(lst)
if (length > 1):
min = lst[0]
ele = lst.pop()
i = percDown(lst, 0, length - 1, ele)
lst[i] = ele
return min
else:
return lst.pop()
heapify:将列表原地变成堆
def heapify(lst):
iLastParent = math.floor((len(lst) - 1) / 2)
length = len(lst)
while iLastParent >= 0:
ele = lst[iLastParent]
i = percDown(lst, iLastParent, length, lst[iLastParent])
lst[i] = ele
iLastParent -= 1
排序:使用上述方法对给定列表进行堆排序(非就地)
def sort(lst):
result = []
heap.heapify(lst)
length = len(lst)
for z in range(0, length):
result.append(heap.popMin(lst))
return result
我是否错误地增加了算法/堆创建的复杂性,或者仅仅是 python heapq 模块被高度优化?我有一种感觉是前者,因为 0.6s 与 2s 是一个巨大的差异。