python - How does heapq.merge() work with infinite generators?

Question

I want to understand how heapq.merge() works with infinite generators. Consider this example:

>>> from heapq import merge
>>> from itertools import count
>>> m = merge(count(0, 2), count(1, 2))
>>> for _ in range(10):
...     print(next(m))
...
0
1
2
3
4
5
6
7
8
9

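The documentation states that it does not pull the data into memory all at once. But how does it consume each of the infinite generators?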

score 1 · Accepted Answer

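A very simple implementation of such a function could look like the following. Note, however, that for the sake of simplicity it does not handle any special (and not so special) cases, such as empty or exhausted iterables.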

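import heapq

def merge(*iterables):
    # Turn every input into an iterator so next() also works on lists
    iterators = [iter(it) for it in iterables]
    # Seed the heap with (first element, input index) pairs
    heap = [(next(it), i) for i, it in enumerate(iterators)]
    heapq.heapify(heap)
    while heap:
        val, i = heapq.heappop(heap)
        yield val
        # Refill from the input the yielded element came from
        heapq.heappush(heap, (next(iterators[i]), i))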

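It works like this:

Get the first element from each sorted iterable, together with that iterable's index in the list, and build a heap from these pairs
Yield the next smallest element from that heap
Push onto the heap the next element from the iterable with the same index as the element just yielded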
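
To make the three steps concrete, here is a minimal sketch that follows them and additionally stops cleanly when an input is empty or runs out. The name lazy_merge and the try/except handling are my own additions for illustration, not the code from heapq itself.

```python
import heapq
from itertools import count, islice


def lazy_merge(*iterables):
    """Merge sorted inputs lazily (hypothetical helper, not heapq's code)."""
    iterators = [iter(it) for it in iterables]
    heap = []
    # Step 1: take the first element of each non-empty input,
    # paired with the index of the input it came from.
    for i, it in enumerate(iterators):
        try:
            heap.append((next(it), i))
        except StopIteration:
            pass  # an empty input contributes nothing
    heapq.heapify(heap)
    while heap:
        # Step 2: yield the smallest element currently on the heap.
        val, i = heapq.heappop(heap)
        yield val
        # Step 3: refill from the same input, unless it is exhausted.
        try:
            heapq.heappush(heap, (next(iterators[i]), i))
        except StopIteration:
            pass


# Infinite inputs: only as many elements are pulled as are requested.
print(list(islice(lazy_merge(count(0, 2), count(1, 2)), 10)))
# Finite inputs, one of them empty.
print(list(lazy_merge([1, 4, 7], [], [2, 3, 9])))
```

The heap never holds more than one pending element per input, which is why infinite generators are not a problem: nothing is read from an input until the element previously taken from it has been yielded.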

The actual implementation is a bit more involved, but seems to work roughly along the same lines. You can get the location of your local source with heapq.__file__, which on my system is /usr/lib/python3.6/heapq.py, and check for yourself.
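
If you would rather check from the interpreter, something like the following should work. It prints where the module lives and then records what the real heapq.merge() requests from its inputs. The traced helper and the pulled list are hypothetical names introduced just for this sketch, and inspect.getsource is guarded because the source file may not be available on every installation.

```python
import heapq
import inspect
from itertools import count

# Location of the local copy of the module (the path differs per system).
print(heapq.__file__)

# Print the start of merge()'s source if the source file is available.
try:
    print(inspect.getsource(heapq.merge)[:300])
except (OSError, TypeError):
    print("source not available on this installation")

pulled = []  # every element that merge() requests from an input


def traced(name, iterable):
    """Hypothetical helper: log each element as it is consumed."""
    for x in iterable:
        pulled.append((name, x))
        yield x


m = heapq.merge(traced("even", count(0, 2)), traced("odd", count(1, 2)))
first_three = [next(m) for _ in range(3)]
print(first_three)
# Only a handful of elements have been consumed, although both inputs are infinite.
print(pulled)
```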
