I have a list of hashes (dicts) that looks like this:

   [{'campaign_id': 'cid2504649263',
  'country': 'AU',
  'impressions': 3000,
  'region': 'Cairns',
  'utcdt': datetime.datetime(2013, 6, 4, 6, 0)},
 {'campaign_id': 'cid2504649263',
  'country': 'AU',
  'count': 9000,
  'region': 'Cairns',
  'utcdt': datetime.datetime(2013, 6, 4, 6, 0)},
 {'campaign_id': 'cid2504649263',
  'country': 'AU',
  'count': 3000,
  'region': 'Cairns',
  'utcdt': datetime.datetime(2013, 6, 4, 7, 0)}]

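Two of the hashes need to be rolled up because all of their dimensions are identical, and I need to sum their counts. So... how would I use Python's groupby from itertools to accomplish this? Or is there another way? The result I am after: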

   rolled_up = [{'campaign_id': 'cid2504649263',
  'count': 12000,
  'region': 'Cairns',
  'utcdt': datetime.datetime(2013, 6, 4, 6, 0)},
 {'campaign_id': 'cid2504649263',
  'country': 'AU',
  'count': 3000,
  'region': 'Cairns',
  'utcdt': datetime.datetime(2013, 6, 4, 7, 0)}]

2 Answers


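If the items that need to be rolled up together are consecutive, groupby works as is. Otherwise, you need to sort them first. I think a collections.Counter would suit you better: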
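For comparison, the groupby route might look roughly like the sketch below. It assumes every dict uses 'count' as the measure and that the dimensions are exactly the four fields listed in `DIMENSIONS`; the names `rows`, `DIMENSIONS` and `key` are made up for this illustration. The sample rows are deliberately out of order to show why the sort is needed.

```python
import datetime
from itertools import groupby
from operator import itemgetter

# Sample data modelled on the question, with 'count' used throughout (assumption).
rows = [
    {'campaign_id': 'cid2504649263', 'country': 'AU', 'count': 3000,
     'region': 'Cairns', 'utcdt': datetime.datetime(2013, 6, 4, 6, 0)},
    {'campaign_id': 'cid2504649263', 'country': 'AU', 'count': 3000,
     'region': 'Cairns', 'utcdt': datetime.datetime(2013, 6, 4, 7, 0)},
    {'campaign_id': 'cid2504649263', 'country': 'AU', 'count': 9000,
     'region': 'Cairns', 'utcdt': datetime.datetime(2013, 6, 4, 6, 0)},
]

# Assumed dimension fields: everything except the measure 'count'.
DIMENSIONS = ('campaign_id', 'country', 'region', 'utcdt')
key = itemgetter(*DIMENSIONS)

rolled_up = []
# groupby only merges adjacent items, so sort by the same key first.
for dims, group in groupby(sorted(rows, key=key), key=key):
    merged = dict(zip(DIMENSIONS, dims))
    merged['count'] = sum(item['count'] for item in group)
    rolled_up.append(merged)
```

The sort makes this O(n log n), whereas the Counter session that follows accumulates in a single pass with no ordering requirement.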

>>> import datetime
>>> from collections import Counter
>>> C = Counter()
>>> L =     [{'campaign_id': 'cid2504649263',
...   'country': 'AU',
...   'count': 3000,            # <== changed this to "count"
...   'region': 'Cairns',
...   'utcdt': datetime.datetime(2013, 6, 4, 6, 0)},
...  {'campaign_id': 'cid2504649263',
...   'country': 'AU',
...   'count': 3000,
...   'region': 'Cairns',
...   'utcdt': datetime.datetime(2013, 6, 4, 6, 0)},
...  {'campaign_id': 'cid2504649263',
...   'country': 'AU',
...   'count': 3000,
...   'region': 'Cairns',
...   'utcdt': datetime.datetime(2013, 6, 4, 7, 0)}]
>>> for item in L:                        # The ... represents the rest of the key
...     C[item['campaign_id'], item['country'], ...,  item['utcdt']] += item['count']
...
>>> C
Counter({('cid2504649263', 'AU', datetime.datetime(2013, 6, 4, 6, 0)): 6000, ('cid2504649263', 'AU', datetime.datetime(2013, 6, 4, 7, 0)): 3000})

Then convert the Counter back into your list format.
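A minimal sketch of that conversion, assuming the Counter keys are tuples built in a fixed field order. `FIELDS` is a hypothetical name for that order, and the four-field key is an assumption, since the session above elides part of the key.

```python
import datetime
from collections import Counter

# Assumed order in which the key tuple was built.
FIELDS = ('campaign_id', 'country', 'region', 'utcdt')

# Stand-in for the Counter produced by the accumulation loop.
C = Counter({
    ('cid2504649263', 'AU', 'Cairns', datetime.datetime(2013, 6, 4, 6, 0)): 6000,
    ('cid2504649263', 'AU', 'Cairns', datetime.datetime(2013, 6, 4, 7, 0)): 3000,
})

# Pair each field name with its value from the key, then attach the total.
rolled_up = [dict(zip(FIELDS, key), count=total) for key, total in C.items()]
```

Each element of `rolled_up` is again a dict with the dimension fields plus 'count', which matches the shape asked for in the question.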

Answered 2013-06-12T06:56:08.767

"Two of the hashes need to be rolled up because all of their dimensions are identical, and I need to sum their counts."

If that is all you want, how about:

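from collections import defaultdict

d = defaultdict(int)

# `hashes` is the list of dicts from the question
for i in hashes:
    # only campaign_id and region form the key here
    d[i['campaign_id'], i['region']] += i['count']

for k in d:
    print(k[0], d[k])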
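Because the loop above keys on only campaign_id and region, rows from different hours end up in the same bucket. If every dimension should count, one possible variant is to derive the key from the dict itself, as in the sketch below. It assumes 'count' is the only measure and that all other values are hashable; `totals` and `rolled_up` are names chosen for this illustration.

```python
import datetime
from collections import defaultdict

# Sample data modelled on the question, with 'count' used throughout (assumption).
hashes = [
    {'campaign_id': 'cid2504649263', 'country': 'AU', 'count': 3000,
     'region': 'Cairns', 'utcdt': datetime.datetime(2013, 6, 4, 6, 0)},
    {'campaign_id': 'cid2504649263', 'country': 'AU', 'count': 9000,
     'region': 'Cairns', 'utcdt': datetime.datetime(2013, 6, 4, 6, 0)},
    {'campaign_id': 'cid2504649263', 'country': 'AU', 'count': 3000,
     'region': 'Cairns', 'utcdt': datetime.datetime(2013, 6, 4, 7, 0)},
]

totals = defaultdict(int)
for h in hashes:
    # Key on every field except 'count'; sorting by field name makes the
    # key independent of the order in which each dict was built.
    key = tuple(sorted((k, v) for k, v in h.items() if k != 'count'))
    totals[key] += h['count']

# Each key is a tuple of (field, value) pairs, so dict() can rebuild the row.
rolled_up = [dict(key, count=total) for key, total in totals.items()]
```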
Answered 2013-06-12T07:03:40.630