python - Updating my dictionary in Python

Question

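This should be a simple problem, but I can't get my head around it. I have a dictionary called TD, of the form {key1: {key2: value}}. TD looks like {1:{u'word':3, u'next':2, u'the':2}, 2:{...}, ...}, where key1 is the document, key2 is a word in that document, and value is the number of times the word appears in the document, obtained using Counter.

I have a large number of documents, so each document has its own entry in TD: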

TD = {1:{u'word':2, u'next':1, u'the':5,...},
      2:{u'my':4, u'you':1, u'other':2,...},
      ...
      168:{u'word':1, u'person':1, u'and':8,...}}

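What I want to do now is check each word in {1:{...}} to see whether it appears in the other documents, and repeat this for every document. For each document the word appears in, freq is increased by 1. I have a new dictionary called Score, which should look like this: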

{1:{u'word':score, u'next':score,...}, 2:{u'my':score, u'you':score,...}...}

To build this dictionary:

Score={}
count = 0
for x,i in TD[count].iteritems():
    freq=1
    num=1
    for y in TD[num].keys():
        if word in TF[num].keys():
            freq+=1
        num+=1
    Score[num]={x:(i*freq)}
    num+=1

This gives me the following output:

{1:{u'word':score}, 2:{u'next':score}, 3:{u'the':score}...}

It should be:

{1:{u'word':score, u'next':score, u'the':score,...}...}

I think the problem is on the line Score[num]={x:(i*freq)}

score 3 · Accepted Answer

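Use dict views to find the intersection between documents, then use a Counter to count their frequencies:

from collections import Counter

Score = {}
for id, document in TD.iteritems():
    counts = Score[id] = Counter()
    for otherid, otherdocument in TD.iteritems():
        if otherid == id:
            continue  # Skip current document
        # Add 1 for every word the two documents have in common
        counts.update(document.viewkeys() & otherdocument.viewkeys())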

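Each entry in Score will count how often each word in that document appears in the other documents.

If you also need to count the word's occurrence in the current document (count + 1), simply remove the if otherid == id test.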
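For readers on Python 3, the same idea can be sketched as below. This is a minimal sketch, assuming Python 3, where items() replaces iteritems() and keys() already returns a set-like view that supports &. The sample TD and the names doc_id and other_id are made up for illustration.

```python
from collections import Counter

# Hypothetical sample data in the same shape as TD:
# document id -> {word: occurrences in that document}
TD = {
    1: {'word': 2, 'next': 1, 'the': 5},
    2: {'my': 4, 'the': 1, 'word': 2},
    3: {'word': 1, 'person': 1, 'and': 8},
}

Score = {}
for doc_id, document in TD.items():
    counts = Score[doc_id] = Counter()
    for other_id, other_document in TD.items():
        if other_id == doc_id:
            continue  # Skip the current document
        # In Python 3, keys() is a view, so & yields the shared words
        counts.update(document.keys() & other_document.keys())

print(Score[1])
```

Words that appear in no other document are never added as keys, but because Score[doc_id] is a Counter, looking one up (for example Score[1]['next']) returns 0 rather than raising KeyError.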

In your own code you mixed up num and count, but in Python you generally don't need to increment loop counters by hand in any case.
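To illustrate looping without manual counters, here is one possible way to compute the i*freq score from the question. It is only a sketch under assumptions: it uses Python 3, the sample TD is invented, and it reads the question as meaning that freq starts at 1 and grows by one for each other document containing the word. Note that it assigns into one inner dict per document, whereas Score[num]={x:(i*freq)} creates a fresh one-item dict for every word.

```python
# Hypothetical sample data in the same shape as TD
TD = {
    1: {'word': 2, 'next': 1, 'the': 5},
    2: {'my': 4, 'the': 1, 'word': 2},
    3: {'word': 1, 'person': 1, 'and': 8},
}

Score = {}
for doc_id, document in TD.items():
    Score[doc_id] = {}  # one inner dict per document, filled word by word
    for word, count in document.items():
        # Assumed reading of the question: freq starts at 1 and is
        # increased once per other document that contains the word
        freq = 1 + sum(
            1 for other_id, other in TD.items()
            if other_id != doc_id and word in other
        )
        Score[doc_id][word] = count * freq

print(Score[1])
```

Iterating over TD.items() directly supplies both the document id and its word counts, so neither num nor count is needed.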
