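Suppose I have a Python dictionary of lists. I want to find all groups of keys that have items in common and, for each such group, the items they share.
For example, assuming the items are simple integers: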
dct = dict()
dct['a'] = [0, 5, 7]
dct['b'] = [1, 2, 5]
dct['c'] = [3, 2]
dct['d'] = [3]
dct['e'] = [0, 5]
The groups would be:
groups = dict()
groups[0] = ['a', 'e']
groups[1] = ['b', 'c']
groups[2] = ['c', 'd']
groups[3] = ['a', 'b', 'e']
And the items common to each group would be:
common = dict()
common[0] = [0, 5]
common[1] = [2]
common[2] = [3]
common[3] = [5]
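To solve this, I believe it would be useful to build a matrix like the one below, but I don't know how to proceed from there. Are there any Python libraries that help with this kind of problem?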
| a b c d e |
|a| x x |
|b| x x x |
|c| x x x |
|d| x x |
|e| x x x |
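A pairwise overlap table like the matrix above could be sketched with the standard library alone. The following is a minimal Python 3 sketch, not a definitive solution; `shared_items` is a hypothetical helper name introduced here for illustration, and it records only pairs of distinct keys (no diagonal).

```python
from itertools import combinations

dct = {'a': [0, 5, 7], 'b': [1, 2, 5], 'c': [3, 2], 'd': [3], 'e': [0, 5]}

def shared_items(dct):
    """Map each pair of keys to the set of items their lists share.

    Hypothetical helper: pairs with no overlap are omitted.
    """
    pairs = {}
    for k1, k2 in combinations(sorted(dct), 2):
        overlap = set(dct[k1]) & set(dct[k2])
        if overlap:
            pairs[(k1, k2)] = overlap
    return pairs

pairs = shared_items(dct)
for pair in sorted(pairs):
    print(pair, sorted(pairs[pair]))
```

Each entry in `pairs` corresponds to one filled cell above the diagonal of such a matrix. Larger groups (such as three keys sharing one item) still have to be assembled from these pairs or found another way.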
Update
I tried to wrap the solution provided by @NickBurns in a function, but I am having trouble reproducing it:
dct = { 'a' : [0, 5, 7], 'b' : [1, 2, 5], 'c' : [3, 2], 'd' : [3], 'e' : [0, 5]}
groups, common_items = get_groups(dct)
print 'Groups:', groups
print 'Common items:', common_items
I get:
Groups: defaultdict(<type 'list'>, {0: ['a', 'e'], 2: ['c', 'b'], 3: ['c', 'd'], 5: ['a', 'b', 'e']})
Common items: {0: None, 2: None, 3: None, 5: None}
Here are the functions:
from collections import defaultdict

def common(query_group, dct):
    """ Recursively find the common elements within groups """
    if len(query_group) <= 1:
        return
    # Extract the elements from groups,
    # Pull their original values from dct
    # Get the intersection of these
    first, second = set(dct[query_group[0]]), set(dct[query_group[1]])
    # print(first.intersection(second))
    return common(query_group[2:], dct)

def get_groups(dct):
    groups = defaultdict(list)
    for key, values in dct.items():
        for value in values:
            groups[value].append(key)
    # Clean up the groups:
    for key in groups.keys():
        # i.e. the value is common to more than 1 group
        if len(groups[key]) <= 1:
            del groups[key]
    # Identify common elements:
    common_items = dict()
    for k, v in groups.iteritems():
        if len(v) > 1:
            common_items[k] = common(v, dct)
    return groups, common_items
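The `None` values appear to come from `common` itself. Both of its return paths, the bare `return` and the recursive call, eventually yield `None`, and the intersection on the commented-out line is never returned. Below is a minimal Python 3 sketch of one way around this, assuming it is acceptable to drop the recursion and intersect all lists of a group at once. It keeps the names `common` and `get_groups` from the code above, but the bodies are rewritten for illustration and are not @NickBurns's original solution.

```python
from collections import defaultdict

def common(query_group, dct):
    """Return the sorted items present in the list of every key in query_group."""
    return sorted(set.intersection(*(set(dct[k]) for k in query_group)))

def get_groups(dct):
    # Invert the dictionary: item -> keys whose lists contain that item.
    index = defaultdict(list)
    for key in sorted(dct):
        for value in dct[key]:
            index[value].append(key)
    # Keep only items shared by more than one key.
    groups = {v: keys for v, keys in index.items() if len(keys) > 1}
    # For each group, intersect the original lists of its keys.
    common_items = {v: common(keys, dct) for v, keys in groups.items()}
    return groups, common_items

dct = {'a': [0, 5, 7], 'b': [1, 2, 5], 'c': [3, 2], 'd': [3], 'e': [0, 5]}
groups, common_items = get_groups(dct)
print('Groups:', groups)
print('Common items:', common_items)
```

Note that in this sketch the groups are keyed by the shared item (0, 2, 3, 5) rather than by a running index, as in the output shown earlier.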