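python - Algorithm to get every possible subset of a list, in order of their product, without building and sorting the entire list (i.e. generators)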

Question

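Practically, I have a set of objects with probabilities, and I want to look at each possible group of them in order of how likely it is that they are all true, assuming they are independent. That is, in descending order of the product of the elements of each subset, or in order of length if the probabilities are the same (so that (1, 0.5) comes after (0.5)).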

Example: if I have [ 1, 0.5, 0.1 ] I want [ (), (1), (0.5), (1, 0.5), (0.1), (1, 0.1), (0.5, 0.1), (1, 0.5, 0.1) ]

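In essence, this means I want to iterate over the powerset of a set of elements in order. I could generate it fairly easily, sort it, and be done. However, powersets get big very quickly, and I expect I will usually want one of the first subsets. I would rather not generate a list of thousands of subsets, sort them, and then never look past the third. This is where Python generators will hopefully save the day!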

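As a more formal specification of the problem, I need to work out a way to do sorted(powerset(input), key = lambda l : reduce (lambda (p, n), e: (p * e, n-1), l, (1, 0)), reverse=True), as a generator, or in some other way that lets me avoid building and sorting the entire list.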
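
For reference, the brute-force baseline described above can be spelled out in Python 3, where the tuple-unpacking lambda in that expression is no longer valid syntax. This is only a sketch: the names `powerset` and `sorted_powerset` are chosen here for illustration, and the key sorts by descending product, then by ascending length.

```python
from itertools import chain, combinations
from math import prod  # available in Python 3.8+


def powerset(items):
    """Return an iterator over every subset of items, as tuples."""
    items = list(items)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


def sorted_powerset(items):
    """Build the whole powerset, then sort it.

    Sort key: larger product first, shorter subset first on ties.
    """
    return sorted(powerset(items), key=lambda s: (-prod(s), len(s)))


print(sorted_powerset([1, 0.5, 0.1]))
# [(), (1,), (0.5,), (1, 0.5), (0.1,), (1, 0.1), (0.5, 0.1), (1, 0.5, 0.1)]
```

This is the version that materialises all 2^n subsets before returning anything, which is exactly the cost the question wants to avoid.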

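I am reasonably sure this is related to the knapsack problem, as well as the subset product problem, but I am really struggling to find a nice algorithm for it that works, and help would be very much appreciated :-). It is not an issue if it is slower than building and sorting the whole thing in the worst case (iterating all the way to the end); it just needs much better best-case performance (within the first 10%, say).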

score 5 · Accepted Answer

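Nice question, and quite tricky to solve. I can't think of a way to generate the combinations in order either, but I wield the mighty heapq (aka a priority queue) to keep the candidates sorted.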

from heapq import heappush, heappop
from functools import reduce  # reduce is no longer a builtin in Python 3
import operator

def prob(ps):
    """ returns the probability that *not* all ps are True """
    return 1-reduce(operator.mul, ps)

def gen(ps):
    # turn each to a tuple
    items = ((x,) for x in sorted(ps, reverse=True))

    # create a priority queue, sorted by probability
    pq = [(prob(x),x) for x in items]

    # because you wanted this
    yield ()

    # as long as there are valid combinations
    while pq:
        # get the best un-yielded combination, the pq makes sure of that
        p, x = heappop(pq)
        yield x

        # generate all the combinations from this item
        for other in ps:

            # keep each tuple in descending order -> unique combinations
            # (this relies on the values in ps being distinct)
            if other < x[-1]:

                # create a new combination
                new = x+(other,)
                item = prob(new), new

                # add it to the queue
                heappush(pq,item)


a = [1, 0.1, 0.5]
print(list(gen(a)))
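
The same heap idea can be sketched in Python 3 with two adjustments that are not part of the answer's code: subsets are tracked by index, so repeated probabilities are still treated as separate elements, and the subset length is part of the heap key, so equal products come out shortest first as the question asks. The name `subsets_by_product` is hypothetical, and the sketch assumes every value lies in [0, 1], so extending a subset never increases its product.

```python
from heapq import heapify, heappush, heappop
from itertools import islice


def subsets_by_product(ps):
    """Lazily yield subsets of ps by descending product, shortest first on ties.

    Assumption: every value is in [0, 1], so a child subset never has a
    larger product than the subset it was extended from.
    """
    vals = sorted(ps, reverse=True)
    yield ()  # the empty subset has product 1 and length 0

    # Heap entries are (-product, length, indices); heapq pops the smallest,
    # so negating the product gives the largest product first.
    heap = [(-v, 1, (i,)) for i, v in enumerate(vals)]
    heapify(heap)

    while heap:
        neg_prod, length, idx = heappop(heap)
        yield tuple(vals[i] for i in idx)

        # Extend only with later indices so each subset is pushed exactly once.
        for j in range(idx[-1] + 1, len(vals)):
            heappush(heap, (neg_prod * vals[j], length + 1, idx + (j,)))


# Take only the first three subsets without enumerating the rest.
print(list(islice(subsets_by_product([1, 0.1, 0.5]), 3)))
# [(), (1,), (0.5,)]
```

Only the subsets popped so far, plus their direct extensions, are ever placed on the heap, so looking at the first few results stays cheap even when the full powerset is large.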
