python - Python 3 - Counting matches in two lists (including duplicates)

Question

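First of all, I'm new to programming and to Python. I've looked around here but couldn't find a solution, so please forgive me if this is a stupid question!

I have two lists and I'm trying to work out how many times the items in the second list appear in the first list.

I have the following solution: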

    list1 = ['black','red','yellow']
    list2 = ['the','big','black','dog']
    list3 = ['the','black','black','dog']
    p = set(list1)&set(list2)
    print(len(p))

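It works fine, except when the second list contains duplicates.

That is, list1 and list2 above return 1, but so do list1 and list3, which ideally should return 2.

Can anyone suggest a solution? Any help would be appreciated!

Thanks,

Adam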

score 9 · Accepted Answer

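You're seeing this problem because you're using sets as your collection type. Sets have two characteristics: they are unordered (which doesn't matter here), and their elements are unique. So when you convert your lists to sets, you lose the duplicates before you even find the intersection: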

>>> p = ['1', '2', '3', '3', '3', '3', '3']
>>> set(p)
set(['1', '2', '3'])

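There are several ways to do what you want here, but you'll want to start by looking at the list count method (list.count). I would do something like this: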

>>> list1 = ['a', 'b', 'c']
>>> list2 = ['a', 'b', 'c', 'c', 'c']
>>> results = {}
>>> for i in list1:
        results[i] = list2.count(i) 
>>> results
{'a': 1, 'c': 3, 'b': 1}

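This approach creates a dictionary (results). For each element in list1 it creates a key in results, counts how many times that element occurs in list2, and assigns the count as the key's value.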

Edit: As Lattyware pointed out, this approach solves a slightly different problem from the one you asked about. A really basic solution would look like this:

>>> words = ['red', 'blue', 'yellow', 'black']
>>> list1 = ['the', 'black', 'dog']
>>> list2 = ['the', 'blue', 'blue', 'dog']
>>> results1 = 0
>>> results2 = 0
>>> for w in words:
        results1 += list1.count(w)
        results2 += list2.count(w)

>>> results1
1
>>> results2
2

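This is similar to my first suggestion: it iterates over each word in the main list (here I use words), adding the number of times it appears in list1 to the counter results1, and the number of times it appears in list2 to results2.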
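
The loop above can also be folded into a small helper. This is only a sketch: count_matches is a hypothetical name that does not appear in the question or the answer, and it assumes the reference list has no repeated words (a repeated word would be counted twice).

```python
def count_matches(words, candidates):
    """Return how many items of `candidates` equal some word in `words`.

    Hypothetical helper for illustration; assumes `words` has no duplicates.
    """
    # list.count scans `candidates` once per word, which is fine for short lists.
    return sum(candidates.count(w) for w in words)


list1 = ['black', 'red', 'yellow']
list2 = ['the', 'big', 'black', 'dog']
list3 = ['the', 'black', 'black', 'dog']

print(count_matches(list1, list2))  # 1
print(count_matches(list1, list3))  # 2
```

With the lists from the question, this gives 1 for list2 and 2 for list3, which is the result the question asks for.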

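If you need more information than just the number of duplicates, you will want to use a dictionary or, even better, the specialized Counter type in the collections module. Counter is designed to make everything I did in the examples above easy.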

>>> from collections import Counter
>>> results3 = Counter()
>>> for w in words:
        results3[w] = list2.count(w)

>>> results3
Counter({'blue': 2, 'black': 0, 'yellow': 0, 'red': 0})
>>> sum(results3.values())
2
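
Counter can also do the counting itself if it is handed the list directly, so the explicit list2.count calls are not strictly needed. A minimal sketch along those lines, reusing words and list2 from above (the names counts and total are made up for this illustration):

```python
from collections import Counter

words = ['red', 'blue', 'yellow', 'black']
list2 = ['the', 'blue', 'blue', 'dog']

# Counter(list2) tallies every element of list2 in a single pass.
counts = Counter(list2)

# Looking up a missing key on a Counter gives 0 rather than raising KeyError.
# set(words) guards against a word being listed twice in `words`.
total = sum(counts[w] for w in set(words))

print(counts['blue'])  # 2
print(total)           # 2
```

This version walks list2 once instead of once per word, which may matter if the lists grow large.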

score 8 · Accepted Answer

Shouldn't list1 and list2 return 0? Or did you mean

list1 = ['black', 'red', 'yellow']

I think what you want is

print(len([w for w in list2 if w in list1]))

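The problem with using sets is that sets have no duplicates. In fact, the usual reason for using a set is to eliminate duplicates. Of course, that is exactly what you don't want here.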
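
A set can still help here, as long as only the reference list is converted and the list being counted keeps its duplicates. A small sketch of that variation (lookup and matches are illustrative names, not part of the answer above):

```python
list1 = ['black', 'red', 'yellow']
list3 = ['the', 'black', 'black', 'dog']

# Only the reference words become a set; membership tests on a set are fast.
lookup = set(list1)

# list3 is iterated as a list, so each repeated 'black' is counted.
matches = sum(1 for w in list3 if w in lookup)

print(matches)  # 2
```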

score 2 · Accepted Answer

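I know this is an old question, but in case anyone is wondering how to get the matches, or the number of matches, from two or more lists, you can also do it this way.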

a = [1,2,3]
b = [2,3,4]
c = [2,4,5]

To get the matches between two lists, say a and b, it would be

d = [value for value in a if value in b] # 2,3

For all three lists, it would be

d = [value for value in a if value in b and value in c] # 2
len(d) # to get the number of matches

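Also, if you need to handle duplicates (that is, count each distinct value only once), it is a matter of converting the lists to sets beforehand, e.g.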

a  = set(a) # and so on
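
To make the difference concrete, here is a short sketch using made-up lists that are not from the answer above: the list comprehension keeps a repeated value from the first list, while converting to sets first collapses it.

```python
a = [2, 2, 3]
b = [2, 3, 4]

# The comprehension walks `a` as a list, so the repeated 2 is kept.
with_duplicates = [value for value in a if value in b]
print(with_duplicates)       # [2, 2, 3]
print(len(with_duplicates))  # 3

# Converting to sets first drops repeats, so each value counts once.
unique_matches = set(a) & set(b)
print(len(unique_matches))   # 2
```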

score 0 · Accepted Answer

If you mean that you want to count the frequency of the elements of list1 in list2, maybe this solution can work for you:

list1 = ['black', 'red', 'yellow']
list2 = ['the', 'big', 'black', 'dog']
list3 = ['the', 'black', 'black', 'dog']

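First we count the frequency of each element of list2 and build a dict from the counts. Then we build a sub-dict from that dict containing only the keys in list1. To get the total frequency, sum the values of sub_dct: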

# count the frequency of elements of list1 in list2
def cntFrequency(lst1,lst2):
    dct=dict(Counter(lst2))
    sub_dct={k:dct.get(k,0) for k in lst1}
    return sub_dct

The result looks like this:

from collections import Counter

cnt_dct=cntFrequency(list1,list2)
print(cnt_dct)
print(sum(cnt_dct.values()))

# Output
{'black': 1, 'red': 0, 'yellow': 0}
1
