5

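I'm working in Python. Is there a way to count how many keys each value in a dictionary is found under, and then return a tally of those counts?

So, for example, if I had 50 values and ran a script to do this, I would get a tally that looks something like this: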

1: 23  
2: 15  
3: 7  
4: 5  

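The above would tell me that 23 values appear in 1 key, 15 values appear in 2 keys, 7 values appear in 3 keys, and 5 values appear in 4 keys.

Also, would this question change if my dictionary had multiple values per key?

Here is a sample of my dictionary (the values are bacteria names):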

{'0': ['Pyrobaculum'], '1': ['Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium'], '3': ['Thermoanaerobacter', 'Thermoanaerobacter'], '2': ['Helicobacter', 'Mycobacterium'], '5': ['Thermoanaerobacter', 'Thermoanaerobacter'], '4': ['Helicobacter'], '7': ['Syntrophomonas'], '6': ['Gelria'], '9': ['Campylobacter', 'Campylobacter'], '8': ['Syntrophomonas'], '10': ['Desulfitobacterium', 'Mycobacterium']}

So from this sample, which has 8 unique values, the ideal output would be:

1:4
2:3
3:1

So 4 bacteria names are found in only one key, 3 are found in two keys, and 1 is found in three keys.

4

3 Answers

6

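So, unless I'm reading this wrong, you want to know:

  • For each value in the original dictionary, how many keys it appears under, and how many times each of those counts occurs?
  • In essence, what you want is the frequency of the occurrences of values in the dictionary

I've taken a less elegant approach than the other answers, but I have broken the problem down into individual steps: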

d = {'0': ['Pyrobaculum'], '1': ['Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium'], '3': ['Thermoanaerobacter', 'Thermoanaerobacter'], '2': ['Helicobacter', 'Mycobacterium'], '5': ['Thermoanaerobacter', 'Thermoanaerobacter'], '4': ['Helicobacter'], '7': ['Syntrophomonas'], '6': ['Gelria'], '9': ['Campylobacter', 'Campylobacter'], '8': ['Syntrophomonas'], '10': ['Desulfitobacterium', 'Mycobacterium']}

# Iterate through and find out how many keys each value occurs in
vals = {}                         # A dictionary to store how often each value occurs.
for i in d.values():
    for j in set(i):              # Convert to a set to remove duplicates within a key
        vals[j] = 1 + vals.get(j, 0)  # If we've seen this value, increment its count;
                                      # otherwise start from the default of 0
print(vals)

# Iterate through each possible frequency and find how many values have that count.
counts = {}                       # A dictionary to store the final frequencies.
# Iterate from 1 (every value seen occurs in at least one key) to the maximum count
for i in range(1, max(vals.values()) + 1):
    # Find all values that have the current frequency, count them
    # and add them to the frequency dictionary
    counts[i] = len([x for x in vals.values() if x == i])

for key in sorted(counts.keys()):
    if counts[key] > 0:
        print("%s : %s" % (key, counts[key]))

You can also test this code on codepad.

answered 2013-09-03T01:01:18.643
5

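If I understand correctly, you want to count the counts of the dictionary's values. If the values can be counted by collections.Counter (that is, they are hashable), you just need to call Counter on the dictionary's values, and then call it again on the first counter's values. Here is an example using a dictionary whose keys are range(100) and whose values are random between 0 and 10: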

import random
from collections import Counter
d = dict(enumerate([str(random.randint(0, 10)) for _ in range(100)]))
counter = Counter(d.values())
counts_counter = Counter(counter.values())
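
As a rough sanity check on the two-pass count (a sketch only, assuming one hashable value per key), the tally should account for every distinct value and every key. The seed and the names distinct_values and total_keys are my own additions for illustration:

```python
import random
from collections import Counter

random.seed(0)  # arbitrary seed, only so the run is repeatable
d = dict(enumerate(str(random.randint(0, 10)) for _ in range(100)))

counter = Counter(d.values())               # value -> number of keys holding it
counts_counter = Counter(counter.values())  # number of keys -> how many values

# Summing the tally gives the number of distinct values...
distinct_values = sum(counts_counter.values())
# ...and weighting each entry by its key count gives the number of keys.
total_keys = sum(n * how_many for n, how_many in counts_counter.items())

assert distinct_values == len(set(d.values()))
assert total_keys == len(d)
```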

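Edit

Now that a sample dictionary has been added to the question, you need to do the first count in a slightly different way (d is the dictionary from the question):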

from collections import Counter
c = Counter()
for v in d.values():  # values() works on Python 2 and 3; itervalues() is Python 2 only
    c.update(set(v))
Counter(c.values())
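
To touch on the "multiple values per key" part of the question, here is a minimal sketch that accepts either a single value or a list per key and prints the tally in the format the question asks for. The function name tally_key_counts and the isinstance check are my own assumptions, not something from the question, and the Mycobacterium list is abbreviated with multiplication:

```python
from collections import Counter

def tally_key_counts(d):
    """Return {number of keys: how many distinct values appear in that many keys}."""
    per_value = Counter()
    for v in d.values():
        # Assumption: anything that is not a list/tuple/set is one single value
        items = v if isinstance(v, (list, tuple, set)) else [v]
        per_value.update(set(items))  # set() so repeats within one key count once
    return Counter(per_value.values())

sample = {
    '0': ['Pyrobaculum'], '1': ['Mycobacterium'] * 14,
    '3': ['Thermoanaerobacter', 'Thermoanaerobacter'],
    '2': ['Helicobacter', 'Mycobacterium'],
    '5': ['Thermoanaerobacter', 'Thermoanaerobacter'],
    '4': ['Helicobacter'], '7': ['Syntrophomonas'], '6': ['Gelria'],
    '9': ['Campylobacter', 'Campylobacter'], '8': ['Syntrophomonas'],
    '10': ['Desulfitobacterium', 'Mycobacterium'],
}

tally = tally_key_counts(sample)
for n in sorted(tally):
    print("%d:%d" % (n, tally[n]))
# prints:
# 1:4
# 2:3
# 3:1
```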
answered 2013-09-03T00:42:38.393
2

You can use a Counter:

>>> from collections import Counter
>>> d = dict(((1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3), (7, 3)))
>>> d
{1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
>>> Counter(d.values())
Counter({1: 3, 2: 2, 3: 2})
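
Note that the single Counter above maps each value to the number of keys holding it. To get the tally the question asks for, one option (a sketch, using my own variable names per_value and tally) is to run Counter a second time over those counts:

```python
from collections import Counter

d = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}

per_value = Counter(d.values())      # value -> number of keys holding it
tally = Counter(per_value.values())  # number of keys -> how many values

for n in sorted(tally):
    print("%d:%d" % (n, tally[n]))
# prints:
# 2:2
# 3:1
```

Here two values (2 and 3) each sit under two keys, and one value (1) sits under three keys.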
answered 2013-09-03T00:41:33.847