The idiomatic way to solve this is to use Counter, like this:
>>> from collections import Counter
>>> words = ['b','b','the','the','the','c']
>>> Counter(words).most_common()
[('the', 3), ('b', 2), ('c', 1)]
The other way to solve it is to use a defaultdict, which produces the same counts as the Counter example above:
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> for word in words:
...     d[word] += 1
...
>>> d
defaultdict(<type 'int'>, {'the': 3, 'b': 2, 'c': 1})
No matter how you count the words, you can only write to the file once all words are counted; otherwise you write one line per occurrence, and any word that appears more than once shows up in your output multiple times.
So, first collect the counts, then write them out.
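A minimal sketch of that "count first, write afterwards" order might look like the following. The function name write_counts, the file name counts.txt, and the "word count" line format are assumptions made for illustration, not taken from the question.

```python
import os
import tempfile
from collections import Counter

def write_counts(words, path):
    """Count every word first, then write each word exactly once."""
    counts = Counter(words)           # step 1: collect all the counts
    with open(path, "w") as out:      # step 2: write only after counting is done
        for word, count in counts.most_common():
            out.write("{} {}\n".format(word, count))
    return counts

words = ['b', 'b', 'the', 'the', 'the', 'c']
# Hypothetical output location; substitute your own file path.
path = os.path.join(tempfile.gettempdir(), "counts.txt")
counts = write_counts(words, path)
```

Because the file is opened only after the Counter is complete, each word gets a single line regardless of how often it occurred.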