5

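How can I skip tuples that contain duplicate elements when iterating with itertools.product? Or, better, how can I avoid generating them in the first place? Filtering them out afterwards could be time-consuming if the number of lists is large.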

Example,
lis1 = [1,2]
lis2 = [2,4]
lis3 = [5,6]

[i for i in product(lis1,lis2,lis3)] should be [(1,2,5), (1,2,6), (1,4,5), (1,4,6), (2,4,5), (2,4,6)]

It should not contain (2,2,5) or (2,2,6), because 2 is repeated in them. How can I do this?


3 Answers

11

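itertools generally works on unique positions within its inputs, not on unique values. So when you want to weed out duplicate values, you generally have to either post-process the itertools result sequence or "roll your own". Because post-processing can be very inefficient in this case, roll your own: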

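def uprod(*seqs):
    """Yield tuples from the product of seqs that contain no repeated value."""
    def inner(i):
        # All n positions are filled: emit the current tuple.
        if i == n:
            yield tuple(result)
            return
        # Only try values not already used in an earlier position.
        for elt in sets[i] - seen:
            seen.add(elt)
            result[i] = elt
            for t in inner(i+1):
                yield t
            seen.remove(elt)  # backtrack before trying the next value

    sets = [set(seq) for seq in seqs]  # drop duplicates within each input
    n = len(sets)
    seen = set()  # values used by the tuple under construction
    result = [None] * n
    for t in inner(0):
        yield t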

Then, for example,

>>> print(list(uprod([1, 2, 1], [2, 4, 4], [5, 6, 5])))
[(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (2, 4, 5), (2, 4, 6)]
>>> print(list(uprod([1], [1, 2], [1, 2, 4], [1, 5, 6])))
[(1, 2, 4, 5), (1, 2, 4, 6)]
>>> print(list(uprod([1], [1, 2, 4], [1, 5, 6], [1])))
[]
>>> print(list(uprod([1, 2], [3, 4])))
[(1, 3), (1, 4), (2, 3), (2, 4)]

This can be much more efficient, because duplicate values are never even considered (neither within an input iterable nor across them).
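To give a rough feel for how much work post-filtering can waste, here is a minimal sketch that counts how many tuples itertools.product generates versus how many contain only distinct values. The helper name count_candidates and the choice of n identical inputs range(n) are assumptions made for illustration; they are not part of the answer above.

```python
from itertools import product

def count_candidates(n):
    """Return (tuples generated by product, tuples with all-distinct values).

    Hypothetical helper: uses n copies of range(n) as the inputs.
    """
    seqs = [range(n)] * n
    total = 0
    distinct = 0
    for t in product(*seqs):
        total += 1
        # A tuple of length n has no repeats exactly when its set has n items.
        if len(set(t)) == n:
            distinct += 1
    return total, distinct

print(count_candidates(6))  # (46656, 720)
```

With six inputs of six values each, product generates 6**6 = 46656 tuples, of which only 6! = 720 survive the filter. A generator that never descends into a branch with a repeated value avoids producing most of the rest.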

answered 2013-11-02T17:49:47.700
5
lis1 = [1,2]
lis2 = [2,4]
lis3 = [5,6]
from itertools import product
print([i for i in product(lis1, lis2, lis3) if len(set(i)) == 3])

Output

[(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (2, 4, 5), (2, 4, 6)]
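The literal 3 above ties the filter to exactly three lists. A minimal sketch of the same filtering idea for any number of inputs could compare against the tuple's own length instead. The name unique_product is hypothetical, and the sketch assumes all elements are hashable.

```python
from itertools import product

def unique_product(*seqs):
    """Yield tuples from product(*seqs) whose values are all distinct."""
    for t in product(*seqs):
        # A tuple contains a repeated value exactly when its set is smaller.
        if len(set(t)) == len(t):
            yield t

print(list(unique_product([1, 2], [2, 4], [5, 6])))
# [(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (2, 4, 5), (2, 4, 6)]
```

This still generates every tuple of the full product before discarding the ones with repeats, so it is simple but does not avoid the cost the question is worried about.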
answered 2013-11-02T17:15:52.093
4

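With itertools.combinations there will be no repeated elements, and the tuples come out in sorted order (given a sorted input list without duplicates):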

>>> lis = [1, 2, 4, 5, 6]
>>> list(itertools.combinations(lis, 3))
[(1, 2, 4), (1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (1, 5, 6), (2, 4, 5), 
(2, 4, 6), (2, 5, 6), (4, 5, 6)]
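For comparison with the expected output in the question, here is a quick check, assuming the merged list is built from the question's lis1, lis2 and lis3. The variable names wanted, merged and combos are introduced only for this sketch.

```python
from itertools import combinations, product

lis1, lis2, lis3 = [1, 2], [2, 4], [5, 6]

# Expected result from the question: one value per list, no repeated value.
wanted = {t for t in product(lis1, lis2, lis3) if len(set(t)) == len(t)}

# Merge the lists, dropping duplicates, then take all 3-element combinations.
merged = sorted(set(lis1 + lis2 + lis3))
combos = set(combinations(merged, 3))

print(wanted <= combos)         # True
print(sorted(combos - wanted))  # [(1, 2, 4), (1, 5, 6), (2, 5, 6), (4, 5, 6)]
```

For these particular lists the combinations include every wanted tuple, plus four extra tuples that take more than one value from the same input list, so the result would still need filtering if exactly one element per list is required.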
answered 2018-02-25T19:01:25.120