python - Updating a nested Counter in a dictionary

Question

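I am going through a large CSV file line by line. What I want to do is count the occurrences of the strings in a certain column. Where I run into trouble is that I want the counters nested inside a dictionary, where the keys of the outer dictionary are the values from another column. I need to do it this way because otherwise the data will be processed incorrectly, since there are duplicates.

Imagine my CSV: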

outerDictKey    CounterKey
apple           purple
apple           blue
pear            purple

So basically I want:

dictionary = {'apple': Counter({'blue': 1, 'purple': 1}),
              'pear': Counter({'purple': 1})}

I don't know how to do this.

myCounter = Counter()
myKey = 'barbara'
counterKey = 'streisand'
largeDict = defaultdict(dict)       
largeDict[myKey] = {myCounter[counterKey] += 1}

Intuitively this doesn't look like it would work, and of course it gives a syntax error.

I also tried

largeDict[myKey][myCounter][counterKey]+=1

This raises a "TypeError: unhashable type: 'Counter'" error.

Finally,

>>> largeDict[myKey]=Counter()
>>> largeDict[myKey][myCounter][counterKey]+=1

still gives a type error. So how do I increment a Counter nested in a dictionary?

score 5 · Accepted Answer

This will work:

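from collections import Counter

myKey, counterKey = 'barbara', 'streisand'  # example keys from the question
myCounter = Counter()
largedict = { myKey:
                    {counterKey: myCounter,
                     'anotherKey': 'Value2'}  # placeholder for any other entry
             }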

largedict[myKey][counterKey]['somethingyouwanttocount']+=1

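A Counter is just a dict with some extra functionality. As a dict, however, it cannot be a key in a dictionary or an entry in a set, which explains the unhashable exception.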
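
To illustrate the point about hashability, here is a minimal sketch using only the standard library. It shows a Counter working as a dictionary value and failing as a dictionary key. The names `fruit_colors`, `bad`, and `error_message` are made up for this example.

```python
from collections import Counter

c = Counter()

# A Counter is fine as a dictionary *value*.
fruit_colors = {'apple': c}
fruit_colors['apple']['purple'] += 1

# Using a Counter as a dictionary *key* fails: like any dict, it is unhashable.
try:
    bad = {c: 1}
except TypeError as exc:
    error_message = str(exc)  # mentions "unhashable type: 'Counter'"
```

This is the same error the second and third attempts in the question run into, because `largeDict[myKey][myCounter]` uses the Counter object itself as a key.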

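Alternatively, if you are keeping track of information about coherent entities, then instead of using nested dicts you could store the information (including the counter) in objects and put the objects in a dict as needed.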
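
One possible shape for the object-based approach is sketched below. The class name `FruitRecord`, its fields, and the sample rows are assumptions made for illustration; they are not part of the original answer.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FruitRecord:
    """Hypothetical entity holding a name plus a per-entity counter."""
    name: str
    colors: Counter = field(default_factory=Counter)


# Sample rows mirroring the CSV in the question.
rows = [('apple', 'purple'), ('apple', 'blue'), ('pear', 'purple')]

records = {}
for name, color in rows:
    # Create the record the first time a name is seen.
    if name not in records:
        records[name] = FruitRecord(name)
    records[name].colors[color] += 1
```

The benefit over nested dicts is that each record can later grow other attributes without changing how the counter is accessed.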

If every value is a Counter, then just use a defaultdict:

from collections import defaultdict, Counter
largedict = defaultdict(Counter)
largedict['apple']['purple']+=1
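
Applied to the CSV from the question, the `defaultdict(Counter)` approach could look roughly like the sketch below. It assumes a comma-delimited file with a header row containing `outerDictKey` and `CounterKey`; the in-memory `csv_text` string stands in for the real file.

```python
import csv
import io
from collections import Counter, defaultdict

# Stand-in for the real file (assumed comma-delimited with a header row).
# For an actual file, use open(path, newline='') instead of io.StringIO.
csv_text = "outerDictKey,CounterKey\napple,purple\napple,blue\npear,purple\n"

largedict = defaultdict(Counter)
for row in csv.DictReader(io.StringIO(csv_text)):
    # A missing outer key gets a fresh Counter automatically.
    largedict[row['outerDictKey']][row['CounterKey']] += 1
```

Because the reader yields one row at a time, this keeps only the counts in memory rather than the whole file.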

score 1 · Accepted Answer

If you just want to count occurrences of the strings in a certain column, wouldn't this be enough?

import collections
data = "Welcome to stack overflow. To give is to get."

print(collections.Counter(data.split()))

Output (the order of entries with equal counts may vary by Python version):

Counter({'to': 2, 'give': 1, 'get.': 1, 'is': 1, 'Welcome': 1, 'To': 1, 'overflow.': 1, 'stack': 1})
