
I have Python code that counts word frequencies in a text file. The problem is that it treats full stops as part of the words, which alters the counts. To count the words I use a sorted list of unique words. I tried to remove the full stops using

 words = open(f, 'r').read().lower().split()  
 uniqueword = sorted(set(words))
 uniqueword = uniqueword.replace(".","") 

but I get this error:

AttributeError: 'list' object has no attribute 'replace'

Any help would be appreciated :)


2 Answers


You can process the words with a list comprehension before making the set:

words = [word.replace(".", "") for word in words]

You could also remove them afterwards (uniquewords = [word.replace...]), but then you would reintroduce duplicates.
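To make the duplicate problem concrete, here is a small sketch. The sample string is made up for illustration and stands in for the contents of the file:

```python
# Hypothetical sample text standing in for the contents of the file.
text = "The cat sat. The dog sat"
words = text.lower().split()

# Strip full stops AFTER building the set: "sat." and "sat" are distinct
# set members, so both survive and both turn into "sat".
stripped_late = sorted(word.replace(".", "") for word in set(words))

# Strip full stops BEFORE building the set: each word appears once.
stripped_early = sorted(set(word.replace(".", "") for word in words))

print(stripped_late)   # "sat" shows up twice
print(stripped_early)  # "sat" shows up once
```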

Note that if you want to count the words, a Counter may be more useful:

from collections import Counter

counts = Counter(words)
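
Put together, the two steps might look like the sketch below. The sample string is an assumption used in place of the file contents:

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of the file.
text = "The cat sat. The dog sat."

# Remove full stops from every word before counting.
words = [word.replace(".", "") for word in text.lower().split()]

# Counter maps each distinct word to the number of times it occurs.
counts = Counter(words)

for word, n in sorted(counts.items()):
    print(word, n)
```

Since Counter is a dict subclass, sorted(counts) gives the same sorted list of unique words that the question builds by hand.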
answered 2014-02-19T10:30:32.633

You might be better off with

import re

words = re.findall(r'\w+', open(f, 'r').read().lower())

It will grab all strings made up of one or more "word characters", ignoring punctuation and whitespace.
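
As a rough illustration of what the pattern matches, here is a sketch on a made-up string (the sample text is an assumption, not the asker's file). Note that \w also matches digits and underscores, and that an apostrophe splits a word such as "don't" into two matches:

```python
import re
from collections import Counter

# Hypothetical sample text with several kinds of punctuation.
text = "Hello, world. Hello again! Don't stop."

# \w+ matches runs of letters, digits and underscores; commas, full stops,
# exclamation marks, apostrophes and whitespace all act as separators.
words = re.findall(r"\w+", text.lower())
counts = Counter(words)

print(words)
print(counts["hello"])
```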

answered 2014-02-19T10:40:16.330