python - How do I update a set?

Question

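It seems like using update should be straightforward, and I think I'm using it correctly, so the problem must be with how I'm handling types or something similar.

Anyway, here's the situation:

I'm doing coursework for a Coursera course (needless to say, answers that minimize or obscure code are the most helpful!) and I'm stuck on the last problem. The task is to return a set of all documents that contain every word in a query. The function takes an inverseIndex, a dictionary with words as keys and the documents containing those words as values, e.g. {'a':[0,1],'be':[0,1,4].....}

The approach I tried to implement is pretty simple: build a set of sets, where each set holds a list of document IDs, then call .intersections(sets) to merge them into a single set containing only the IDs of documents that contain all of the words in the query.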

def andSearch(inverseIndex, query):

    sets = set()
    s = set()
    for word in query:
        s.update(inverseIndex[word])
        print(inverseIndex[word])
    print s
    s.intersection(*sets)

    return s

Unfortunately, this returns every document in inverseIndex, when it should return only index "3".

Terminal output:

[0, 1, 2, 3, 4]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3]
[0, 1, 3, 4]
[2, 3, 4]
set([0, 1, 2, 3, 4])

What's wrong?

Thanks a lot!

sets = []
s = set()
for word in query:
    sets.append(inverseIndex[word])
print sets
s.intersection(*sets)

return s

Output:

[[0, 1, 2, 3, 4], [0, 1, 2, 3], [0, 1, 2, 3, 4], [0, 1, 2, 3], [0, 1, 3, 4], [2, 3, 4]]
set([])
logout

score 2 · Accepted Answer

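You are using update inside the loop, so on each iteration you add new pages to s. But you need to intersect these pages, because you want only the pages that contain all of the words (not "at least one word"). So on each iteration you need to intersect instead of update.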
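To see the difference on a small case, here is a minimal sketch in Python 3. The index, the words, and the names `inverse_index`, `union`, and `common` are made up for illustration and are not the asker's data. `update` accumulates a union, while `intersection_update` narrows the set on each pass.

```python
# Hypothetical inverse index: word -> list of document IDs.
inverse_index = {"a": [0, 1, 3], "be": [0, 1, 3, 4], "cat": [2, 3, 4]}
query = ["a", "be", "cat"]

# update() keeps adding IDs, so the result is the union:
# documents that contain at least one query word.
union = set()
for word in query:
    union.update(inverse_index[word])

# intersection_update() keeps only the IDs seen for every word so far.
# Start from the first word's documents, not from an empty set,
# because intersecting with an empty set always stays empty.
common = set(inverse_index[query[0]])
for word in query[1:]:
    common.intersection_update(inverse_index[word])

print(sorted(union))
print(sorted(common))
```

Only document 3 appears under all three words, so it is the only ID left in `common`, while `union` ends up holding every document.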

Also, I don't see why you need sets at all.

This should work:

def andSearch(inverseIndex, query):
    return set.intersection(*(set(inverseIndex[word]) for word in query))

This just produces a list of sets:

>>> [set(ii[word]) for word in query]
[set([0, 1]), set([0, 1, 4])]

Then I just call set.intersection to intersect them all.
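As a rough end-to-end check, the sketch below (Python 3) runs the same idea on the six document lists from the question's terminal output. The words `w1` to `w6` are placeholders, since the question does not show the actual query, and the handling of an empty query and of unknown words is an assumption added here, not part of the one-liner above.

```python
def and_search(inverse_index, query):
    """Return the IDs of documents that contain every word in query."""
    # Assumption: a word missing from the index matches no documents.
    doc_sets = [set(inverse_index.get(word, ())) for word in query]
    if not doc_sets:
        # Assumption: an empty query matches nothing. Without this guard,
        # set.intersection() with no arguments raises TypeError.
        return set()
    return set.intersection(*doc_sets)


# Document lists copied from the question's output; the keys are placeholders.
inverse_index = {
    "w1": [0, 1, 2, 3, 4],
    "w2": [0, 1, 2, 3],
    "w3": [0, 1, 2, 3, 4],
    "w4": [0, 1, 2, 3],
    "w5": [0, 1, 3, 4],
    "w6": [2, 3, 4],
}
result = and_search(inverse_index, ["w1", "w2", "w3", "w4", "w5", "w6"])
print(result)
```

With these lists the only document common to all six is 3, which matches what the asker expected.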

Regarding the update to your question:

It happens because s is empty.

Consider this example:

>>> s = set()
>>> s.intersection([1,2,3],[2,3,4])
set([])

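To intersect the sets, just use set.intersection. Called on the class like this, its first argument must be a set (the remaining arguments may be any iterable). So you should either convert the page lists to sets, or store the pages as sets in the dictionary.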
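A short Python 3 sketch of these points, using the two lists from the example above. The variable names are illustrative only. It also shows one more detail that matters for the question's code: `intersection` returns a new set and leaves the original unchanged, so the return value has to be kept.

```python
pages = [[1, 2, 3], [2, 3, 4]]

# An empty set intersected with anything is still empty.
s = set()
empty_result = s.intersection(*pages)

# intersection() returns a new set and does not modify the receiver,
# so the result must be assigned to something.
first = set(pages[0])
result = first.intersection(*pages[1:])

# Called on the class, the first argument must itself be a set;
# passing a list there raises TypeError.
try:
    set.intersection(*pages)
    raised = False
except TypeError:
    raised = True

# Converting every list to a set first makes the class-level call work.
as_sets = set.intersection(*map(set, pages))

print(empty_result, result, raised, as_sets)
```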
