据我了解,由于count()为每个元素再次迭代列表,所以速度较慢。由于 Counter 和 dict 构造都遍历列表一次,所以 dict 构造和计数器时间结果不应该相似吗?
我使用https://stackoverflow.com/a/23909767/7125235作为获取时间值的代码参考。
import timeit
if __name__ == "__main__":
    code_block = """seen = {}
for i in l:
    if seen.get(i):
        seen[i] += 1
    else:
        seen[i] = 1
"""
    setup = """import random
import string
from collections import Counter
n=1000
l=[random.choice(string.ascii_letters) for x in range(n)]
"""
    t1 = timeit.Timer(
        stmt="Counter(l)",
        setup=setup,
    )
    t2 = timeit.Timer(
        stmt="[[x,l.count(x)] for x in set(l)]",
        setup=setup,
    )
    t3 = timeit.Timer(
        stmt=code_block,
        setup=setup,
    )
    print("Counter(): ", t1.repeat(repeat=3, number=10000))
    print("count():   ", t2.repeat(repeat=3, number=10000))
    print("seen{}:   ", t3.repeat(repeat=3, number=10000))
输出:
运行1:
Counter():  [0.32974308, 0.319977907, 0.301750341]
count():    [6.424047524000001, 6.417152854, 6.450776530000001]
seen{}:    [1.1089669810000018, 1.099655232, 1.116015376]
   
运行 2:
Counter():  [0.322483783, 0.32464020800000004, 0.33498838900000005]
count():    [6.3235339029999995, 6.48233445, 6.543396192000001]
seen{}:    [1.1192663550000006, 1.1072084830000009, 1.1155270229999985]