I'm looking for some help adding to my current code. I have two lists of usernames that look like this:
Fishermen A:
George
Tom
Joel
Tom
Lance
Fishermen B:
George
Tom
Tom
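What I basically want to do is: if a username appears in both the Fisherman A list and the Fisherman B list, count how many times it appears across the two lists. So in this case the code would list Tom 4 times and George 2 times, and otherwise do nothing. I'm relatively new to coding, so any comments and help would be greatly appreciated.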
fishermanA = ['George', 'Tom', 'Joel', 'Tom', 'Lance']
fishermanB = ['George', 'Tom', 'Tom']
# Names that appear in both lists
a_set = set(fishermanA)
b_set = set(fishermanB)
inter = a_set.intersection(b_set)
for i in inter:
    # Total occurrences of the name across both lists
    print(i, fishermanA.count(i) + fishermanB.count(i))
Output (Python 3; set iteration order may vary):
George 2
Tom 4
Looks like a job for collections.Counter:
>>> from collections import Counter
>>> l1 = ['George', 'Tom', 'Joel', 'Tom', 'Lance']
>>> l2 = ['George', 'Tom', 'Tom']
>>> Counter(filter((set(l1) & set(l2)).__contains__, l1 + l2))
Counter({'Tom': 4, 'George': 2})
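For readers who find the one-liner above dense, here is a rough sketch of the same idea spelled out step by step. The generator expression stands in for the filter/__contains__ call; the variable names common and counts are mine, not part of the answer above.

```python
from collections import Counter

l1 = ['George', 'Tom', 'Joel', 'Tom', 'Lance']
l2 = ['George', 'Tom', 'Tom']

# Names that appear in both lists.
common = set(l1) & set(l2)

# Count every occurrence, across both lists, of the shared names only.
counts = Counter(name for name in l1 + l2 if name in common)

# most_common() yields (name, total) pairs, highest total first.
for name, total in counts.most_common():
    print(name, total)
# Tom 4
# George 2
```

This should behave the same as the filter version; it just trades brevity for readability.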
How about this:
from collections import defaultdict
fmenA = [
    "George",
    "Tom",
    "Joel",
    "Tom",
    "Lance",
]
fmenB = [
    "George",
    "Tom",
    "Tom",
]
countsA = defaultdict(int)
countsB = defaultdict(int)
# Tally how often each name occurs in each list
for name in fmenA:
    countsA[name] += 1
for name in fmenB:
    countsB[name] += 1
# Keep only names present in both lists and sum their counts
print({
    name: countsA[name] + countsB[name]
    for name in countsA if name in countsB
})
--output:--
{'George': 2, 'Tom': 4}
# The following data is highly skewed against count()
print(len(string.printable))  #--> 100
fmenA = list(string.printable)[:10]
fmenB = list(string.printable)[:10]
--------------------------------------
2.14819002151 defaultdict
1.860476017 count()
3.48084497452 Counter_arshajii
5.76169896126 Counter_jpmc26
fmenA = list(string.printable)[:20]
fmenB = list(string.printable)[:20]
--------------------------------------
3.87321305275 defaultdict
4.63102507591 count()
5.21796107292 Counter_arshajii
8.44607114792 Counter_jpmc26
fmenA = list(string.printable)[:40]
fmenB = list(string.printable)[:40]
--------------------------------------
7.59739494324 defaultdict
13.643941879272461 count()
9.3110909462 Counter_arshajii
15.3523819447 Counter_jpmc26
fmenA = list(string.printable)
fmenB = list(string.printable)
-------------------------------
18.7256119251 defaultdict
80.9080910683 count()
22.0006680489 Counter_arshajii
37.6448471546 Counter_jpmc26
import timeit
setup = """
from collections import defaultdict
from collections import Counter
import string
fmenA = list(string.printable)
fmenB = list(string.printable)
"""
s1 = """
countsA = defaultdict(int)
countsB = defaultdict(int)
for name in fmenA:
    countsA[name] += 1
for name in fmenB:
    countsB[name] += 1
{
    name: (countsA[name] + countsB[name])
    for name in countsA if name in countsB
}
"""
s2 = """
a_set = set(fmenA)
b_set = set(fmenB)
inter = a_set.intersection(b_set)
{
    name: fmenA.count(name) + fmenB.count(name)
    for name in inter
}
"""
s3 = """
Counter(filter((set(fmenA) & set(fmenB)).__contains__, fmenA + fmenB))
"""
s4 = """
a = Counter(fmenA)
b = Counter(fmenB)
{k: a[k] + b[k] for k in a if a[k] > 0 and b[k] > 0}
"""
t = timeit.Timer(stmt=s1, setup=setup)
print(t.timeit(number=100000))
t = timeit.Timer(stmt=s2, setup=setup)
print(t.timeit(number=100000))
t = timeit.Timer(stmt=s3, setup=setup)
print(t.timeit(number=100000))
t = timeit.Timer(stmt=s4, setup=setup)
print(t.timeit(number=100000))
I agree with arshajii that Counter is the way to go, but creating the extra sets and accessing a magic method directly seem unnecessary to me.
>>> from collections import Counter
>>> l1 = ['George', 'Tom', 'Joel', 'Tom', 'Lance']
>>> l2 = ['George', 'Tom', 'Tom']
>>> a = Counter(l1)
>>> b = Counter(l2)
>>> counts = {k: a[k] + b[k] for k in a if a[k] > 0 and b[k] > 0}
>>> counts
{'George': 2, 'Tom': 4}
>>> for k in counts:
...     print(str(k) + ': ' + str(counts[k]))
...
George: 2
Tom: 4
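Note that it's fine for us to iterate over the keys of only one Counter. A key has to be in both lists for us to care about it, so if it is in both lists, it will be in the Counter we iterate over.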
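A small sketch of why iterating over just one Counter is safe, using the same sample data. Looking up a missing key in a Counter returns 0 without adding the key, so the check against the other Counter never raises. The names from_a and from_b are only for illustration.

```python
from collections import Counter

a = Counter(['George', 'Tom', 'Joel', 'Tom', 'Lance'])
b = Counter(['George', 'Tom', 'Tom'])

# A missing key reads as 0 and is not inserted by the lookup.
assert b['Joel'] == 0
assert 'Joel' not in b

# Iterating over either Counter gives the same shared names and totals.
from_a = {k: a[k] + b[k] for k in a if b[k] > 0}
from_b = {k: a[k] + b[k] for k in b if a[k] > 0}
assert from_a == from_b == {'George': 2, 'Tom': 4}
```

Writing the condition as k in b instead of b[k] > 0 should give the same result here, since neither Counter holds zero or negative counts.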
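The advantage over Ankur Ankan's solution is efficiency on large lists. Ankur Ankan's solution iterates over both full lists for every common element. Counter only iterates over each list once, and then over one Counter. For large lists with many common elements, the performance difference can be very large. For small lists, the performance impact is negligible.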
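If you want to check the scaling claim on your own machine, something like the following rough sketch might help. The data is made up (2,000 entries drawn from 500 hypothetical usernames), the function names with_list_count and with_counter are mine, and the timings will vary by machine and Python version.

```python
import random
import timeit
from collections import Counter

def with_list_count(xs, ys):
    # One full pass over both lists for every shared name.
    common = set(xs) & set(ys)
    return {name: xs.count(name) + ys.count(name) for name in common}

def with_counter(xs, ys):
    # One pass over each list, then one pass over a Counter.
    a, b = Counter(xs), Counter(ys)
    return {name: a[name] + b[name] for name in a if name in b}

# Hypothetical test data, seeded so runs are repeatable.
random.seed(0)
names = ['user{}'.format(i) for i in range(500)]
xs = [random.choice(names) for _ in range(2000)]
ys = [random.choice(names) for _ in range(2000)]

# Both approaches should agree before we compare their speed.
assert with_list_count(xs, ys) == with_counter(xs, ys)

print('list.count:', timeit.timeit(lambda: with_list_count(xs, ys), number=5))
print('Counter:   ', timeit.timeit(lambda: with_counter(xs, ys), number=5))
```

Growing the lists or the number of distinct names should widen the gap, since the list.count version does work proportional to (shared names) x (list length).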