python - I have a list of words. I want to add a counter variable associated with each word. How do I do this?

Question

I have a list of words, let's say it is

['a', 'b', 'c', 'd']

I have a document where I have preprocessed a text file into a matrix, which looks like this:

a,b,c,d
0,1,1,0
1,1,0,0
1,1,1,1

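where 1 means the word is present in the sentence and 0 means it is not. I want to go through this matrix row by row and increment some kind of counter associated with the original word list above, so that at the end I know in how many sentences each word was found.

How can I do this? Do I have to create an associative array or a 2D array? Is there a way to create a new variable, associated with each word, that I can increment?

Thanks!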

score 3 · Accepted Answer

All you have to do is sum each column, since it contains only 0s and 1s!

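import numpy as np

# matrix is a list of rows: the header row of words, then the 0/1 rows
array = np.array(matrix[1:], dtype=int)   # leave the header row out of the numeric array
answer = array.sum(axis=0)                # column totals
my_dict = dict(zip(matrix[0], answer.tolist()))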

Now you have a dictionary whose keys are the words and whose values are the total number of occurrences!
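If NumPy is not available, the same totals can be built with the row-by-row loop the question describes. This is only a minimal sketch: the names `words`, `rows` and `counts` are made up for illustration, and the data is copied from the question's example.

```python
# Word list and 0/1 rows copied from the question's example.
words = ["a", "b", "c", "d"]
rows = [
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]

# One counter per word, all starting at zero.
counts = {word: 0 for word in words}

# Walk the matrix row by row and add each flag to its word's counter.
for row in rows:
    for word, flag in zip(words, row):
        counts[word] += flag

print(counts)
```

A plain dict keyed by word is the "associative array" the question asks about; no 2D array is needed for the counters.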

score 3 · Accepted Answer

I'd rather not hard-code the keys, so maybe something like this:

import csv
from collections import Counter

with open("abcd.txt", newline="") as fp:
    reader = csv.DictReader(fp)
    c = Counter()
    for row in reader:
        # convert each "0"/"1" string to int before adding it to the totals
        c.update({k: int(v) for k, v in row.items()})

which produces

>>> c
Counter({'b': 3, 'a': 2, 'c': 2, 'd': 1})
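To try this without creating a file, the same idea can be run against an in-memory string. This is a sketch under assumptions: `sample` stands in for the file contents (taken from the question's example), and `io.StringIO` replaces the opened file.

```python
import csv
import io
from collections import Counter

# Stand-in for the file contents; the data is the question's example.
sample = "a,b,c,d\n0,1,1,0\n1,1,0,0\n1,1,1,1\n"

totals = Counter()
# DictReader takes the column names from the first line, so no keys are hard-coded.
for row in csv.DictReader(io.StringIO(sample)):
    totals.update({word: int(flag) for word, flag in row.items()})

print(totals)
```

Because the keys come from the header line, adding a fifth word column to the data should need no code change.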

score 3 · Accepted Answer

You can use collections.Counter to count the words:

>>> from collections import Counter
>>> filedata = '''\
... 0,1,1,0
... 1,1,0,0
... 1,1,1,1
... '''
>>> counter = Counter()
>>> for line in filedata.splitlines():
...     a, b, c, d = map(int, line.split(','))
...     counter['a'] += a
...     counter['b'] += b
...     counter['c'] += c
...     counter['d'] += d
...
>>> counter
Counter({'b': 3, 'a': 2, 'c': 2, 'd': 1})
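The loop above names each column by hand. A possible variation, assuming the data also carries the header line from the question (that is an assumption, the snippet above has no header), reads the words from that line instead. The names `lines`, `words` and `flags` are illustrative only.

```python
from collections import Counter

# Same rows as above, with the question's header line added (an assumption).
filedata = "a,b,c,d\n0,1,1,0\n1,1,0,0\n1,1,1,1\n"

lines = filedata.splitlines()
words = lines[0].split(",")   # column names come from the header line

counter = Counter()
for line in lines[1:]:
    flags = map(int, line.split(","))
    # Pair each word with its 0/1 flag and add the flag to that word's count.
    for word, flag in zip(words, flags):
        counter[word] += flag

print(counter.most_common())
```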

score 2 · Accepted Answer

If you already have the matrix as described, you can do this:

mat=[['a','b','c','d'],
     [ 0,  1,  1,  0],
     [ 1,  1,  0,  0],
     [ 1,  1,  1,  1]]

print({t[0]: sum(t[1:]) for t in zip(*mat)})

Prints:

{'a': 2, 'b': 3, 'c': 2, 'd': 1}
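The one-liner relies on `zip(*mat)` transposing the matrix, so each tuple is a whole column. A small sketch that spells out the intermediate step (the names `columns` and `totals` are made up for illustration):

```python
mat = [["a", "b", "c", "d"],
       [0, 1, 1, 0],
       [1, 1, 0, 0],
       [1, 1, 1, 1]]

# zip(*mat) yields one tuple per column: the word first, then its 0/1 flags.
columns = list(zip(*mat))
print(columns[0])

# The first element of each column is the key; the sum of the rest is the count.
totals = {col[0]: sum(col[1:]) for col in columns}
print(totals)
```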

score 2 · Accepted Answer

from collections import defaultdict
with open("abc") as f:
    next(f)                 # skip header
    dic = defaultdict(int)
    for line in f:
        for x, y in zip("abcd", map(int, line.split(","))):
            dic[x] += y
    print(dic)

Output:

defaultdict(<class 'int'>, {'a': 2, 'b': 3, 'c': 2, 'd': 1})

Using collections.Counter:

from collections import Counter
with open("abc") as f:
    next(f)                 # skip header
    c = Counter()
    for line in f:
        c.update(dict(zip("abcd", map(int, line.split(",")))))
    print(c)

Output:

Counter({'b': 3, 'a': 2, 'c': 2, 'd': 1})
