I am working on a genome compression algorithm that uses a variant of Huffman coding. I have the following code in Python 2:

import heapq

def makeHuffTree(trees):
   heapq.heapify(trees)
   while len(trees) > 1:
      childR, childL = heapq.heappop(trees), heapq.heappop(trees)
      parent = (childL[0] + childR[0], childL, childR)
      heapq.heappush(trees, parent)
   return trees[0]

I am trying to run it in Python 3, but I get the following TypeError:

TypeError: '<' not supported between instances of 'tuple' and 'str'

The input to the function is a list of tuples, for example:

[(1, '35'), (1, '38'), (1, '50'), (1, 'K'), (4, 'N')]

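I am confused about why this works in Python 2 but not in Python 3. I debugged the program and narrowed it down to the line where the error occurs. The relevant functions in the heapq module are as follows: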

def heappop(heap):
    """Pop the smallest item off the heap, maintaining the heap invariant."""
    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        _siftup(heap, 0)
        return returnitem
    return lastelt

def _siftup(heap, pos):
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown(heap, startpos, pos)

def _siftdown(heap, startpos, pos):
    newitem = heap[pos]
    # Follow the path to the root, moving parents down until finding a place
    # newitem fits.
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if newitem < parent:
            heap[pos] = parent
            pos = parentpos
            continue
        break
    heap[pos] = newitem

The error occurs after a few iterations, at this line in the _siftup function:

        if rightpos < endpos and not heap[childpos] < heap[rightpos]:

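I tried printing the types of rightpos and endpos, but I never saw an instance of str; every time it was only tuples. With Python 2, the function runs successfully and, after a few steps, produces the following result.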

input (trees)= [(1, '35'), (1, '38'), (1, '50'), (1, 'K'), (4, 'N')]

[(1, '35'), (1, '38'), (1, '50'), (1, 'K'), (4, 'N')]
Step 1:
[(1, '50'), (1, 'K'), (4, 'N'), (2, (1, '38'), (1, '35'))]
Step 2:
[(2, (1, '38'), (1, '35')), (4, 'N'), (2, (1, 'K'), (1, '50'))]
Step 3:
[(4, 'N'), (4, (2, (1, 'K'), (1, '50')), (2, (1, '38'), (1, '35')))]
Step 4:
(8, (4, (2, (1, 'K'), (1, '50')), (2, (1, '38'), (1, '35'))), (4, 'N'))
Step 5:

But in Python 3, I hit the error after these steps:

input (trees)= [(1, '35'), (1, '38'), (1, '50'), (1, 'K'), (4, 'N')]

[(1, '35'), (1, '38'), (1, '50'), (1, 'K'), (4, 'N')]
Step 1:
[(1, '50'), (1, 'K'), (4, 'N'), (2, (1, '38'), (1, '35'))]
Step 2:
[(2, (1, '38'), (1, '35')), (4, 'N'), (2, (1, 'K'), (1, '50'))]

After that, the error occurs: TypeError: '<' not supported between instances of 'tuple' and 'str'

I am looking for help making this code work in Python 3.

1 Answer

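So, I think I have found a solution to the problem. The cause is that Python 2 allows comparisons between strings and tuples, but Python 3 does not. When two heap entries have equal weights, tuple comparison moves on to their second elements, which can be a string (a leaf symbol) in one entry and a tuple (a subtree) in the other.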
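The failing comparison can be reproduced in isolation. The following is a minimal sketch, assuming Python 3; the two entries are taken from the question's trace, and the names `leaf`, `merged` and `outcome` are only illustrative.

```python
# Entries as they exist when the question's trace fails.
leaf = (4, 'N')
merged = (4, (2, (1, 'K'), (1, '50')), (2, (1, '38'), (1, '35')))

# Different weights: the first element alone decides the comparison,
# so the second elements are never looked at.
different_weights_ok = (2, (1, 'K'), (1, '50')) < leaf

# Equal weights: tuple comparison falls through to the second elements,
# a tuple and a str, which Python 3 refuses to order.
try:
    merged < leaf
    outcome = "comparison succeeded"
except TypeError as exc:
    outcome = str(exc)

print(different_weights_ok)
print(outcome)
```

This would also explain why printing the types of the heap entries only ever shows tuples: the str appears one level down, inside the entries being compared.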

Changing line 267 from:

if rightpos < endpos and not heap[childpos] < heap[rightpos]:

to:

if rightpos < endpos and not heap[childpos][0] < heap[rightpos][0]:

and line 212 from:

if newitem < parent:

to:

if newitem[0] < parent[0]:

in the library source (https://github.com/python/cpython/blob/3.8/Lib/heapq.py) solved my problem.

I overrode these functions locally rather than editing the installed library.
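An alternative that leaves heapq untouched is to store a unique tie-breaking integer in each heap entry, so that the nodes themselves are never compared. This is only a sketch of that approach, not the method used above; `make_huff_tree` and the `order` counter are hypothetical names introduced for illustration.

```python
import heapq
from itertools import count


def make_huff_tree(leaves):
    """Build a Huffman tree from a non-empty list of (weight, symbol) pairs.

    Heap entries are (weight, order, node). The unique `order` integer
    settles ties on weight, so the comparison never reaches `node`.
    """
    order = count()  # hypothetical tie-breaker, not part of the original code
    heap = [(weight, next(order), (weight, symbol)) for weight, symbol in leaves]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Same pop order as the original: right child first, then left.
        w_right, _, child_r = heapq.heappop(heap)
        w_left, _, child_l = heapq.heappop(heap)
        weight = w_left + w_right
        parent = (weight, child_l, child_r)
        heapq.heappush(heap, (weight, next(order), parent))
    return heap[0][2]


tree = make_huff_tree([(1, '35'), (1, '38'), (1, '50'), (1, 'K'), (4, 'N')])
print(tree)
```

For the sample input from the question this yields a tree with total weight 8. Ties are resolved by insertion order here, so for other inputs the tree shape is not guaranteed to match what Python 2 produced.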

Answered 2020-08-25T04:59:09.687