python - Python - Finding the intersection of values in a dictionary

Question

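I'm writing a function to handle multiple queries in a boolean AND search. I have a dictionary, query_dict, that maps each query to the documents in which it occurs.

I want the intersection of all the values in query_dict.values():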

query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

intersect(query_dict)

>> doc_two.txt

I've been reading about intersections, but I find it hard to apply them to a dictionary.

Thanks for your help!

score 13 · Accepted Answer

In [36]: query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

In [37]: reduce(set.intersection, (set(val) for val in query_dict.values()))
Out[37]: set(['doc_two.txt'])
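
In Python 3, `reduce` is no longer a builtin and has to be imported from `functools`. A minimal sketch of the same approach for Python 3 follows. The function name `intersect` is taken from the question; returning an empty set for an empty dictionary is an assumption, since the question does not say what should happen in that case.

```python
from functools import reduce


def intersect(query_dict):
    """Return the set of documents common to every query's document list."""
    doc_sets = [set(docs) for docs in query_dict.values()]
    if not doc_sets:
        # Assumption: with no queries there is nothing to intersect,
        # so an empty set is returned instead of letting reduce() raise.
        return set()
    return reduce(set.intersection, doc_sets)


query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

result = intersect(query_dict)
print(result)  # {'doc_two.txt'}
```

The result is a set, so call `sorted(result)` or `list(result)` if a list of file names is needed.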

In [41]: query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'], 'bar': ['doc_one.txt', 'doc_two.txt'], 'foobar': ['doc_two.txt']}

set.intersection(*(set(val) for val in query_dict.values())) is also a valid solution, though it was slightly slower in the timings below:

In [42]: %timeit reduce(set.intersection, (set(val) for val in query_dict.values()))
100000 loops, best of 3: 2.78 us per loop

In [43]: %timeit set.intersection(*(set(val) for val in query_dict.values()))
100000 loops, best of 3: 3.28 us per loop
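
`%timeit` is an IPython magic and is not available in a plain Python script. A rough sketch of the same comparison using the standard `timeit` module is shown here. The helper names `with_reduce` and `with_unpacking` are made up for illustration, and the absolute numbers depend on the machine and Python version, so no expected output is given.

```python
import timeit
from functools import reduce

query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}


def with_reduce():
    # Pairwise intersection, folding over the sets one at a time.
    return reduce(set.intersection, (set(val) for val in query_dict.values()))


def with_unpacking():
    # Single call to set.intersection with every set passed as an argument.
    return set.intersection(*(set(val) for val in query_dict.values()))


runs = 100000
reduce_time = timeit.timeit(with_reduce, number=runs)
unpack_time = timeit.timeit(with_unpacking, number=runs)

print("reduce:    %.3f s for %d calls" % (reduce_time, runs))
print("unpacking: %.3f s for %d calls" % (unpack_time, runs))
```

On an input this small the difference is likely to be within measurement noise, so it is worth timing with realistic data before choosing one form over the other.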

score 0 · Answer

Another way

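# list() makes the values indexable in Python 3, where values() returns a view
values = list(query_dict.values())
first, rest = values[0], values[1:]
# keep each distinct document from the first list that appears in every other list
print([t for t in set(first) if all(t in q for q in rest)])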
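
The comprehension above tests `t in q` against lists, which is a linear scan on every check, and iterating over `set(first)` loses the original order. A possible variation, assuming Python 3.7+ (where dictionaries keep insertion order), converts the remaining lists to sets and keeps the order of the first list. The name `intersect_ordered` is hypothetical and not part of the answer.

```python
def intersect_ordered(query_dict):
    """Return documents present in every value list, in first-list order."""
    values = list(query_dict.values())
    if not values:
        # Assumption: an empty dictionary yields an empty result.
        return []
    first = values[0]
    rest = [set(docs) for docs in values[1:]]  # sets give fast membership tests

    seen = set()
    common = []
    for doc in first:
        # Skip duplicates in the first list, keep docs found in all other sets.
        if doc not in seen and all(doc in docs for docs in rest):
            seen.add(doc)
            common.append(doc)
    return common


query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

print(intersect_ordered(query_dict))  # ['doc_two.txt']
```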
