Use Counter from the collections module.
from collections import Counter
s = "This is a sentence this is a this is this"
c = Counter(s.split())
# s.split() returns a list of words; with no argument it splits on any run of whitespace
print(c)
# Output: Counter({'is': 3, 'this': 3, 'a': 2, 'This': 1, 'sentence': 1})
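However, this will not count "correctly" when the text contains periods or mixed capitalization: "sentence." and "sentence" are treated as different words, and so are "This" and "this". You can strip the trailing period from each word so it is counted properly, and convert everything to lowercase (or uppercase) to make the count case-insensitive.
The following handles both problems: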
s1 = "This is a sentence. This is a. This is. This."
s2 = ""
for word in s1.split():
    # punctuation check; you can make this more robust with a regex if you want
    if word.endswith(('.', '!', '?')):
        s2 += word[:-1] + " "
    else:
        s2 += word + " "
c = Counter(s2.lower().split())
print(c)
# Output: Counter({'this': 4, 'is': 3, 'a': 2, 'sentence': 1})
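The comment in the loop above suggests a regex for more robust punctuation handling. Here is a minimal sketch of that idea, not the only way to do it. The function name count_words and the pattern are my own assumptions: a "word" is taken to be any run of lowercase letters, digits, or apostrophes, which may be too strict or too loose for your text.

```python
import re
from collections import Counter

def count_words(text):
    # count_words is a hypothetical helper, not part of any library.
    # Lowercase first so "This" and "this" are counted together.
    # The pattern keeps runs of letters, digits and apostrophes, so
    # punctuation before or after a word is dropped, not only a
    # single trailing '.', '!' or '?'.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)

s1 = "This is a sentence. This is a. This is. This."
c = count_words(s1)
print(c)
# Output: Counter({'this': 4, 'is': 3, 'a': 2, 'sentence': 1})
```

Unlike the loop, this also copes with commas, quotes, and repeated punctuation such as "this?!", but it splits hyphenated words into separate parts.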