1

I have a rather ridiculous-looking list, like this:

[['Biking', '10'], ['Biking|Gym', '14'], ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]

I want to turn it into the format ['Type', total, %], like this:

[['Biking',60,'34.7%'],['Gym',50,'28.9%'],['Hiking',36,'20.8%'],['Running',27,'15.6%']]

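I'm sure I'm doing this the hardest possible way. Can someone point me in a better direction? I've used itertools.groupby before, and this seems like a good place for it, but I'm not sure how to apply it in this case.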

# TODO: This is totally ridiculous.
running = 0
hiking = 0
gym = 0
biking = 0
no_exercise = 0

for r in exercise_types_l:
    if 'Running' in r[0]:
        running += int(r[1])
    if 'Hiking' in r[0]:
        hiking += int(r[1])
    if 'Gym' in r[0]:
        gym += int(r[1])
    if 'Biking' in r[0]:
        biking += int(r[1])
    if 'None' in r[0]:
        no_exercise += int(r[1])

total = running + hiking + gym + biking + no_exercise

l = list()
l.append(['Running', running, '{percent:.1%}'.format(percent=running/total)])
l.append(['Hiking', hiking, '{percent:.1%}'.format(percent=hiking/total)])
l.append(['Gym', gym, '{percent:.1%}'.format(percent=gym/total)])
l.append(['Biking', biking, '{percent:.1%}'.format(percent=biking/total)])
l.append(['None', no_exercise, '{percent:.1%}'.format(percent=no_exercise/total)])

l = sorted(l, key=lambda r: r[1], reverse=True)
4

3 Answers

2

Given an initial list such as

>>> test_list = [['Biking', '10'], ['Biking|Gym', '14'], ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]

you can first build a defaultdict that sums the values (this gives the second element of each entry in your final result), like

>>> from collections import defaultdict
>>> final_dict = defaultdict(int)
>>> for keys, values in test_list:
...     for elem in keys.split('|'):
...         final_dict[elem] += int(values)
...

>>> final_dict
defaultdict(<type 'int'>, {'Gym': 50, 'Biking': 60, 'Running': 27, 'Hiking': 36})

Then you can use a list comprehension to get the final result.

>>> final_sum = float(sum(final_dict.values()))
>>> [(elem, num, str(num/final_sum)+'%') for elem, num in final_dict.items()]
[('Gym', 50, '0.28901734104%'), ('Biking', 60, '0.346820809249%'), ('Running', 27, '0.156069364162%'), ('Hiking', 36, '0.208092485549%')]

Since you want the entries sorted and formatted as percentages, change the final expression to:

>>> [(elem, num, '{:.1%}'.format(num/final_sum)) for elem, num in final_dict.items()]
[('Gym', 50, '28.9%'), ('Biking', 60, '34.7%'), ('Running', 27, '15.6%'), ('Hiking', 36, '20.8%')]
>>> from operator import itemgetter
>>> sorted([(elem, num, '{:.1%}'.format(num/final_sum)) for elem, num in final_dict.items()], key = itemgetter(1), reverse=True)
[('Biking', 60, '34.7%'), ('Gym', 50, '28.9%'), ('Hiking', 36, '20.8%'), ('Running', 27, '15.6%')]
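For anyone who wants the steps above in one place, here is a minimal sketch that wraps them in a function. The name `summarize` and the choice to return an empty list when the counts add up to zero are my own assumptions, not part of the original answer; `float()` is used so the division behaves the same on Python 2 and Python 3.

```python
from collections import defaultdict
from operator import itemgetter


def summarize(rows):
    """Return [type, total, percent] rows, largest total first.

    `summarize` is a hypothetical helper name used for illustration.
    """
    totals = defaultdict(int)
    for keys, value in rows:
        # A label such as 'Biking|Gym' counts toward every type it names.
        for name in keys.split('|'):
            totals[name] += int(value)

    grand_total = float(sum(totals.values()))
    if grand_total == 0:
        # Assumption: with nothing to divide by, return no rows at all.
        return []

    result = [[name, count, '{:.1%}'.format(count / grand_total)]
              for name, count in totals.items()]
    return sorted(result, key=itemgetter(1), reverse=True)


test_list = [['Biking', '10'], ['Biking|Gym', '14'],
             ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]
summary = summarize(test_list)
```

With the sample data, `summary` should match the list requested in the question.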
answered 2013-08-06T16:53:53.763
1

You can use collections.defaultdict here. A dict is a better data structure in this case because you can access the values associated with any 'Type' in O(1) time.

>>> from collections import defaultdict
>>> lis = [['Biking', '10'], ['Biking|Gym', '14'], ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]
>>> total = 0
>>> dic = defaultdict(lambda: [0])
>>> for keys, val in lis:
...     keys = keys.split('|')
...     val = int(val)
...     total += val*len(keys)
...     for k in keys:
...         dic[k][0] += val
...
>>> for k, v in dic.items():
...     dic[k].append(format(v[0]/float(total), '.2%'))
...
>>> dic
defaultdict(<function <lambda> at 0xb60e772c>,
{'Gym': [50, '28.90%'],
 'Biking': [60, '34.68%'],
 'Running': [27, '15.61%'],
 'Hiking': [36, '20.81%']})

Accessing the values:

>>> dic['Biking']
[60, '34.68%']
>>> dic['Hiking']
[36, '20.81%']

Another option is to use a dict as the value instead of a list:

>>> dic = defaultdict(lambda: dict(val=0))
>>> total = 0
>>> for keys, val in lis:
...     keys = keys.split('|')
...     total += int(val)*len(keys)
...     for k in keys:
...         dic[k]['val'] += int(val)
...
>>> for k, v in dic.items():
...     dic[k]['percentage'] = format(v['val']/float(total), '.2%')
...
>>> dic
defaultdict(<function <lambda> at 0xb60e7b8c>, 
{'Gym': {'percentage': '28.90%', 'val': 50},
 'Biking': {'percentage': '34.68%', 'val': 60},
 'Running': {'percentage': '15.61%', 'val': 27},
 'Hiking': {'percentage': '20.81%', 'val': 36}})

Accessing the values:

#Return percentage related to 'Gym'
>>> dic['Gym']['percentage']
'28.90%'
#return the total sum of 'Biking'
>>> dic['Biking']['val']
60
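If you still need the sorted list-of-lists shape from the question, the dict can be flattened afterwards. This is only a sketch: the name `rows` is made up for illustration, and I've switched to one decimal place ('.1%') to match the output shown in the question rather than the '.2%' used above.

```python
from collections import defaultdict

lis = [['Biking', '10'], ['Biking|Gym', '14'],
       ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]

# Rebuild the dict-of-dicts so this snippet runs on its own.
dic = defaultdict(lambda: {'val': 0})
total = 0
for keys, val in lis:
    names = keys.split('|')
    total += int(val) * len(names)
    for name in names:
        dic[name]['val'] += int(val)

# Flatten to [type, total, percent] and sort by total, largest first.
# `rows` is a hypothetical name; float() avoids integer division on Python 2.
rows = sorted(
    ([name, info['val'], format(info['val'] / float(total), '.1%')]
     for name, info in dic.items()),
    key=lambda row: row[1],
    reverse=True,
)
```

You keep the O(1) lookups in `dic` and only build the list when you need to display or export it.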
answered 2013-08-06T16:57:32.670
1

Maybe something like this? (Note: instead of the data.get approach, you could use a collections.defaultdict with a default value of 0.)

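total = 0  # renamed from `sum` so the builtin is not shadowed
data = {}
for extype, value in exercise_types_l:
    for item in extype.split('|'):
        # the counts arrive as strings, so convert before adding
        total += int(value)
        data[item] = data.get(item, 0) + int(value)
l = []
for k, v in data.items():
    # float() keeps the division from truncating on Python 2
    l.append([k, v, '{percent:.1%}'.format(percent=v / float(total))])

l = sorted(l, key=lambda r: r[1], reverse=True)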
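As a variation on the defaultdict idea mentioned in the note, collections.Counter may be a slightly better fit, since its most_common() method already returns the entries sorted by count. This is a different technique from the code above, offered as a sketch; the sample data is copied from the question and the name `counts` is my own.

```python
from collections import Counter

exercise_types_l = [['Biking', '10'], ['Biking|Gym', '14'],
                    ['Biking|Gym|Hiking', '9'],
                    ['Biking|Gym|Hiking|Running', '27']]

counts = Counter()
for extype, value in exercise_types_l:
    # Each type named in the label receives the full count.
    for item in extype.split('|'):
        counts[item] += int(value)

# float() so the percentage is not truncated on Python 2.
total = float(sum(counts.values()))

# most_common() yields (name, count) pairs from largest to smallest count.
l = [[name, count, '{:.1%}'.format(count / total)]
     for name, count in counts.most_common()]
```

This drops the separate sorted() call, at the cost of relying on Counter's ordering for ties.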
answered 2013-08-06T16:57:47.487