30

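I have a Counter declared as main_dict = Counter(), and values are added with main_dict[word] += 1. At the end, I want to remove all elements whose frequency is less than 15. Is there any function in Counter that does this?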

Any help is appreciated.

7 Answers

23
>>> from collections import Counter
>>> counter = Counter({'baz': 20, 'bar': 15, 'foo': 10})
>>> Counter({k: c for k, c in counter.items() if c >= 15})
Counter({'baz': 20, 'bar': 15})
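
The comprehension above builds a new Counter and leaves the original object untouched. If other code holds a reference to the same Counter and the entries need to disappear in place, a minimal sketch might look like the following. Note that prune_in_place is a hypothetical helper name chosen for illustration, not something provided by collections.

```python
from collections import Counter


def prune_in_place(counter, threshold):
    """Delete every key whose count is below threshold (hypothetical helper)."""
    # Snapshot the items with list() so the dict is not resized while iterating.
    for key, count in list(counter.items()):
        if count < threshold:
            del counter[key]
    return counter


main_dict = Counter({'baz': 20, 'bar': 15, 'foo': 10})
alias = main_dict              # a second reference to the same object
prune_in_place(main_dict, 15)
print(alias)                   # the alias sees the pruned counts as well
```

Rebinding the name with a comprehension is usually simpler; the in-place form only matters when the same object is shared.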
Answered on 2013-04-07T11:52:13.327
22

No, you need to remove them manually. Using itertools.dropwhile() probably makes this a little easier:

from itertools import dropwhile

for key, count in dropwhile(lambda key_count: key_count[1] >= 15, main_dict.most_common()):
    del main_dict[key]

Demo:

>>> main_dict
Counter({'baz': 20, 'bar': 15, 'foo': 10})
>>> for key, count in dropwhile(lambda key_count: key_count[1] >= 15, main_dict.most_common()):
...     del main_dict[key]
... 
>>> main_dict
Counter({'baz': 20, 'bar': 15})

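With dropwhile, keys are only tested until the first one with a count below 15 is reached; after that it stops testing and passes everything else through. This works because most_common() returns a list sorted by descending count. If there are many values below 15, this saves the execution time of all those tests.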
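
To make the "stops testing" behaviour visible, here is a small sketch that records which keys the predicate is actually called on. The tested list, the at_least_15 function, and the extra 'qux' and 'zap' entries are illustrative assumptions, not part of the original answer.

```python
from collections import Counter
from itertools import dropwhile

main_dict = Counter({'baz': 20, 'bar': 15, 'foo': 10, 'qux': 3, 'zap': 1})
tested = []  # records every key the predicate is actually called on


def at_least_15(key_count):
    tested.append(key_count[0])
    return key_count[1] >= 15


# most_common() returns a new list sorted by descending count, so deleting
# from main_dict inside the loop does not disturb the iteration.
for key, count in dropwhile(at_least_15, main_dict.most_common()):
    del main_dict[key]

print(tested)     # keys the predicate was called on
print(main_dict)  # entries that survived
```

The predicate runs on 'baz', 'bar' and 'foo' only; 'qux' and 'zap' are deleted without being tested.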

Answered on 2013-04-07T11:18:37.657
16

Another method:

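from collections import Counter
c = Counter({'baz': 20, 'bar': 15, 'foo': 10})
# elements() repeats each key as many times as its count, so the counts are rebuilt
print(Counter(el for el in c.elements() if c[el] >= 15))
# Counter({'baz': 20, 'bar': 15})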
Answered on 2013-04-07T11:50:05.713
2

An elegant solution when the threshold is zero (it drops every entry whose count is zero or negative):

main_dict += Counter()
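
A short sketch of what that line does, using made-up counts that include a zero and a negative value (the sample data is an assumption for illustration):

```python
from collections import Counter

main_dict = Counter({'baz': 2, 'bar': 0, 'foo': -1})

# Unary plus returns a new Counter holding only the strictly positive counts.
positive_copy = +main_dict

# In-place addition of an empty Counter strips the same entries from main_dict itself.
main_dict += Counter()

print(positive_copy)
print(main_dict)
```

This only covers the zero threshold; for any other cutoff the counts still have to be filtered explicitly.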
Answered on 2019-04-23T15:16:57.453
2

May I suggest another solution:

from collections import Counter
main_dict = Counter({'baz': 20, 'bar': 15, 'foo': 10})  
trsh = 15

main_dict = Counter(dict(filter(lambda x: x[1] >= trsh, main_dict.items())))
print(main_dict)

>>> Counter({'baz': 20, 'bar': 15})

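I had the same problem, but I needed to get a list of all the keys in the Counter whose values are at or above some threshold. To do that:

# list() is needed in Python 3, where map() returns a lazy iterator
keys_list = list(map(lambda x: x[0], filter(lambda x: x[1] >= trsh, main_dict.items())))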
print(keys_list) 

>>> ['baz', 'bar']
Answered on 2018-02-05T16:18:26.143
1

An example of how to filter the items in a Counter whose counts are greater than or less than a threshold:

from collections import Counter
from itertools import takewhile, dropwhile


data = (
    "Here's a little song about Roy G. Biv. "
    "He makes up all the colors that you see where you live. "
    "If you know all the colors, sing them with me: "
    "red, orange, yellow, green, blue, indigo, violet all that you see."
)

c = Counter(data)

more_than_10 = dict(takewhile(lambda i: i[1] > 10, c.most_common()))
less_than_2 = dict(dropwhile(lambda i: i[1] >= 2, c.most_common()))

print(f"> 10 {more_than_10} \n< 2 {less_than_2}")

Output:

> 10 {' ': 40, 'e': 23, 'o': 16, 'l': 15, 't': 12} 
< 2 {"'": 1, 'R': 1, 'G': 1, 'B': 1, 'p': 1, 'I': 1, 'f': 1, ':': 1}
Answered on 2019-04-23T19:33:33.600
1

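Just use a list comprehension over the dictionary items (this yields a list of (key, count) pairs rather than a Counter):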

[el for el in c.items() if el[1] >= 15]
Answered on 2019-12-03T05:29:30.273