2

I have a long list of lists in Python that looks like this:

myList=[
    ('a',[1,2,3,4,5]),
    ('b',[6,7,8,9,10]),
    ('c',[1,3,5,7,9]),
    ('d',[2,4,6,8,10]),
    ('e',[4,5,6,7,8])
]

I want to exhaustively enumerate the common values:

('a:b', []),
('a:c', [1,3,5]),
('a:d', [2,4]),
('a:e', [4,5]),
('b:c', [7,9]),
('b:d', [6,8,10]),

('a:c:e', [5]),
('b:c:e', [7]),
('b:d:e', [6,8]),

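and likewise for groups of four, five, six, and so on, until all common values have been identified (assuming the lists are longer).

Is this possible using the itertools library, sets, or a combination of the two?

I've been trying to write a function that loops over the original list for every new list I generate, but it isn't going well!

Here's what I have: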

from itertools import combinations

def findCommonElements(MyList):

    def sets(items):
        # convert each (name, values) pair into (name, set of values)
        for name, values in items:
            yield name, set(values)

    def matches(sets):
        # intersect every pair of sets; this only handles groups of two
        for a, b in combinations(sets, 2):
            yield ':'.join([a[0], b[0]]), a[1] & b[1]

    combinationsSet=list(matches(sets(MyList)))

    combinationsList=[]
    for pair,tup in combinationsSet:
        setList=list(tup)
        combinationsList.append((pair, len(setList), setList))
    combinationsList=sorted(combinationsList,key=lambda x: x[1], reverse=True) #this just sorts the list by the number of common elements

    return combinationsList

4 Answers

2

I think you can try using itertools.combinations with itertools.chain.

It's not a very nice example, but it should work. I use itertools and generators here:

import itertools

lengths = range(2, len(myList)+1)
combinations_list = (itertools.combinations(myList, i) for i in lengths)
combinations = itertools.chain.from_iterable(combinations_list)
def find_intersection(lists):
    # intersect all of the lists, starting from the first one
    res = set(lists[0])
    for data in lists:
        res &= set(data)
    return res
result = [(':'.join(i), list(find_intersection(v))) for i, v in (zip(*x) for x in combinations)]
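
As a rough, self-contained illustration of the same idea for Python 3, the sketch below chains the combinations of every size and keeps only the groups that share at least one value. The helper name `common_elements`, the `my_list` variable, and the decision to drop empty intersections are assumptions made for this example, not part of the answer above.

```python
from itertools import chain, combinations

# Sample data copied from the question.
my_list = [
    ('a', [1, 2, 3, 4, 5]),
    ('b', [6, 7, 8, 9, 10]),
    ('c', [1, 3, 5, 7, 9]),
    ('d', [2, 4, 6, 8, 10]),
    ('e', [4, 5, 6, 7, 8]),
]


def common_elements(pairs):
    """Hypothetical helper: map 'a:b', 'a:b:c', ... to the sorted shared values."""
    # Lazily chain the combinations of every size from 2 up to len(pairs).
    all_groups = chain.from_iterable(
        combinations(pairs, size) for size in range(2, len(pairs) + 1)
    )
    result = {}
    for group in all_groups:
        names, value_lists = zip(*group)  # split the names from their lists
        shared = set(value_lists[0])
        for values in value_lists[1:]:
            shared &= set(values)
        if shared:  # assumption: empty intersections are not interesting
            result[':'.join(names)] = sorted(shared)
    return result


common = common_elements(my_list)
print(common['a:c'])    # [1, 3, 5]
print(common['b:d:e'])  # [6, 8]
```

Because the chained iterator is lazy, no more than one combination is held in memory at a time, although the number of combinations still grows exponentially with the number of lists.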

Or just use itertools.combinations:

import itertools
from functools import reduce

def findCommonElements(MyList):

    combinationsList=[]

    for seq_len in range(2, len(MyList)+1):
        for combination in itertools.combinations(MyList, seq_len):
            # split the (name, values) pairs into a tuple of names and a tuple of lists
            indexes, values = zip(*combination)
            intersection = reduce(lambda x, y: x & set(y),
                                  values, set(values[0]))
            if intersection:
                combinationsList.append((':'.join(indexes), intersection))
    return combinationsList
answered 2013-11-06T11:43:48.857
1

Here is a solution using dictionaries that I just put together:

def iter_recursive_common_elements(lists, max_depth=None):
    data = [{k:set(v) for k,v in lists.iteritems()}] # guarantee unique
    depth = 0

    def get_common_elements(lists, base):
        d = {}
        for k, v in lists.iteritems():
            merged = k.split(':')
            potential = set(base).difference(merged)
            for target in potential:
                d[':'.join(sorted(merged+[target]))] = v.intersection(base[target])
        return d if d else None

    while True:
        ret = get_common_elements(data[depth], data[0])
        if not ret:
            break
        data.append(ret)
        depth += 1
        yield data[depth]
        if max_depth and depth > max_depth:
            break

Using it is simple:

lists = {'a':[1,2,3,4,5],
        'b':[6,7,8,9,10],
        'c':[1,3,5,7,9],
        'd':[2,4,6,8,10],
        'e':[4,5,6,7,8]}

for x in iter_recursive_common_elements(lists):
    print x

>>> 
{'d:e': set([8, 4, 6]), 'a:b': set([]), 'a:c': set([1, 3, 5]), 'a:d': set([2, 4]), 'a:e': set([4, 5]), 'b:e': set([8, 6, 7]), 'b:d': set([8, 10, 6]), 'b:c': set([9, 7]), 'c:d': set([]), 'c:e': set([5, 7])}
{'a:b:d': set([]), 'a:b:e': set([]), 'a:b:c': set([]), 'a:c:e': set([5]), 'c:d:e': set([]), 'a:c:d': set([]), 'b:c:d': set([]), 'b:c:e': set([7]), 'b:d:e': set([8, 6]), 'a:d:e': set([4])}
{'b:c:d:e': set([]), 'a:b:c:e': set([]), 'a:b:c:d': set([]), 'a:c:d:e': set([]), 'a:b:d:e': set([])}
{'a:b:c:d:e': set([])}

You can also clean up the output to better match what you want:

for x in iter_recursive_common_elements(lists):
    for k, v in sorted(x.items()):
        if v:
            print '(%s) : %s' % (k.replace(':', ', '), list(v))

>>> 
(a, c) : [1, 3, 5]
(a, d) : [2, 4]
(a, e) : [4, 5]
(b, c) : [9, 7]
(b, d) : [8, 10, 6]
(b, e) : [8, 6, 7]
(c, e) : [5, 7]
(d, e) : [8, 4, 6]
(a, c, e) : [5]
(a, d, e) : [4]
(b, c, e) : [7]
(b, d, e) : [8, 6]
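
One possible refinement of this level-by-level idea, sketched here for Python 3: once a group has no common values, every larger group that contains it is empty as well, so empty groups do not need to be extended. The function name `common_by_level` and the use of tuples of names as keys are assumptions made for this illustration.

```python
def common_by_level(lists):
    """Hypothetical sketch: yield one dict per group size, skipping empty groups."""
    base = {name: set(values) for name, values in lists.items()}
    names = sorted(base)
    # Level 1: each name on its own, keyed by a one-element tuple.
    level = {(name,): values for name, values in base.items()}
    while True:
        next_level = {}
        for group, shared in level.items():
            # Only extend with names that sort after the last one,
            # so each group is generated exactly once.
            for name in names:
                if name > group[-1]:
                    extended = shared & base[name]
                    if extended:  # prune: supersets of an empty group are empty too
                        next_level[group + (name,)] = extended
        if not next_level:
            return
        yield next_level
        level = next_level


lists = {'a': [1, 2, 3, 4, 5],
         'b': [6, 7, 8, 9, 10],
         'c': [1, 3, 5, 7, 9],
         'd': [2, 4, 6, 8, 10],
         'e': [4, 5, 6, 7, 8]}

for found in common_by_level(lists):
    for group, shared in sorted(found.items()):
        print(':'.join(group), sorted(shared))
```

With the sample data this stops after the groups of three, because none of them can be extended to a non-empty group of four.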
answered 2013-11-06T13:17:41.187
0

How about something like this?

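from functools import reduce  # built in on Python 2, needs this import on Python 3
from itertools import combinations

myList = [
    ('a', [1, 2, 3, 4, 5]),
    ('b', [6, 7, 8, 9, 10]),
    ('c', [1, 3, 5, 7, 9]),
    ('d', [2, 4, 6, 8, 10]),
    ('e', [4, 5, 6, 7, 8]),
]


def print_commons(mList):

    letters = [name for name, _ in mList]  # a real list, so len() also works on Python 3
    mdict = dict(mList)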

    for i in range(2, len(letters) + 1):
        for comb in combinations(letters, i):  # generate all possible combinations
            sequence = [mdict[letter] for letter in comb]  # get the corresponding lists 
            uniques = reduce(lambda x, y: set(x).intersection(y), sequence)  # reduce the lists until only the common elements remain
            print('{} : {}'.format(comb, list(uniques)))

print_commons(myList)

('a', 'b') : []
('a', 'c') : [1, 3, 5]
('a', 'd') : [2, 4]
('a', 'e') : [4, 5]
('b', 'c') : [9, 7]
('b', 'd') : [8, 10, 6]
('b', 'e') : [8, 6, 7]
('c', 'd') : []
('c', 'e') : [5, 7]
('d', 'e') : [8, 4, 6]
('a', 'b', 'c') : []
('a', 'b', 'd') : []
('a', 'b', 'e') : []
('a', 'c', 'd') : []
('a', 'c', 'e') : [5]
('a', 'd', 'e') : [4]
('b', 'c', 'd') : []
('b', 'c', 'e') : [7]
('b', 'd', 'e') : [8, 6]
('c', 'd', 'e') : []
('a', 'b', 'c', 'd') : []
('a', 'b', 'c', 'e') : []
('a', 'b', 'd', 'e') : []
('a', 'c', 'd', 'e') : []
('b', 'c', 'd', 'e') : []
('a', 'b', 'c', 'd', 'e') : []
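
If the results are needed for further processing rather than printing, a variant along the following lines might be more convenient. It is only a sketch for Python 3: the function name `commons` and the returned dictionary layout are assumptions, and it swaps `reduce` for `set.intersection`, which accepts several sets in one call.

```python
from itertools import combinations

my_list = [
    ('a', [1, 2, 3, 4, 5]),
    ('b', [6, 7, 8, 9, 10]),
    ('c', [1, 3, 5, 7, 9]),
    ('d', [2, 4, 6, 8, 10]),
    ('e', [4, 5, 6, 7, 8]),
]


def commons(pairs):
    """Hypothetical variant of print_commons that returns a dict instead of printing."""
    sets_by_name = {name: set(values) for name, values in pairs}
    found = {}
    for size in range(2, len(sets_by_name) + 1):
        for comb in combinations(sets_by_name, size):
            # set.intersection takes any number of sets at once.
            found[comb] = set.intersection(*(sets_by_name[n] for n in comb))
    return found


result = commons(my_list)
print(sorted(result[('a', 'c')]))  # [1, 3, 5]
print(len(result))                 # 26
```

The keys are tuples of names, so a group can be looked up directly, for example `result[('b', 'd', 'e')]`.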
answered 2013-11-06T11:57:18.177
0

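I'd split this into three parts. First, your list is awkward; it is better stored as a dictionary, with the letter as the key and a set of values as the value: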

def make_dict(myList):
    return dict((letter, set(values)) for letter, values in myList)

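Second, all possible combinations. I assume combinations of length 1 (the letters themselves) aren't interesting, so you want combinations from length 2 up to the maximum length: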

from itertools import combinations
def all_combinations(letters):
   for length in range(2, len(letters) + 1):
       for combination in combinations(letters, length):
           yield combination

Third, a function that, given the dictionary created above and a combination, yields the numbers common to all of the letters:

def common_values(the_dict, combination):
    # This can be made shorter with reduce(), but I hate it
    values_so_far = the_dict[combination[0]]
    for letter in combination[1:]:
        values_so_far = values_so_far & the_dict[letter]
    return values_so_far

Then these three can be easily combined:

the_dict = make_dict(myList)
for combination in all_combinations(the_dict):
    print combination, common_values(the_dict, combination)
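
The comment in common_values mentions that reduce() could make it shorter. For readers who prefer that style, a minimal sketch for Python 3 (where reduce lives in functools) might look like this; the sample dictionary is only there to make the example runnable.

```python
from functools import reduce
from operator import and_


def common_values(the_dict, combination):
    # Fold the & operator over the sets that belong to the combination.
    return reduce(and_, (the_dict[letter] for letter in combination))


the_dict = {
    'a': {1, 2, 3, 4, 5},
    'c': {1, 3, 5, 7, 9},
    'e': {4, 5, 6, 7, 8},
}
print(common_values(the_dict, ('a', 'c', 'e')))  # {5}
```

This relies on every combination containing at least one letter, since reduce() without an initial value raises a TypeError on an empty sequence.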
answered 2013-11-06T13:30:01.300