python - Merging two arrays by sets of two elements

Question

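I have an array containing an even number of integers. The array represents pairs of an identifier and a count. The tuples are already sorted by identifier. I would like to merge a few of these arrays together. I have thought of a few ways to do it, but they are fairly complicated, and I feel there might be an easy way to do this with Python.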

I.e.:

[<id>, <count>, <id>, <count>]

Input:

[14, 1, 16, 4, 153, 21]
[14, 2, 16, 3, 18, 9]

Output:

[14, 3, 16, 7, 18, 9, 153, 21]

score 8 · Accepted Answer

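It would be better to store these as dictionaries rather than lists (not just for this purpose, but for other use cases too, such as extracting the value of a single ID):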

x1 = [14, 1, 16, 4, 153, 21]
x2 = [14, 2, 16, 3, 18, 9]

# turn into dictionaries (could write a function to convert)
d1 = dict([(x1[i], x1[i + 1]) for i in range(0, len(x1), 2)])
d2 = dict([(x2[i], x2[i + 1]) for i in range(0, len(x2), 2)])

print(d1)
# {14: 1, 16: 4, 153: 21}

After that, you can use any of the solutions in this question to add them together. For example (taken from the first answer):

import collections

def d_sum(a, b):
    d = collections.defaultdict(int, a)
    for k, v in b.items():
        d[k] += v
    return dict(d)

print(d_sum(d1, d2))
# {14: 3, 16: 7, 153: 21, 18: 9}
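The merged dictionary still has to be flattened if the original [<id>, <count>, ...] layout is needed as output. A minimal sketch, assuming the ids should come out in ascending order as in the question; the helper names `to_dict` and `to_flat` are made up for illustration:

```python
def to_dict(flat):
    """Pair up [id, count, id, count, ...] into {id: count}."""
    return dict(zip(flat[0::2], flat[1::2]))

def to_flat(d):
    """Flatten {id: count} back into [id, count, ...], sorted by id."""
    return [value for item in sorted(d.items()) for value in item]

x1 = [14, 1, 16, 4, 153, 21]
x2 = [14, 2, 16, 3, 18, 9]

# Start from the first list and add the counts of the second one.
merged = to_dict(x1)
for key, count in to_dict(x2).items():
    merged[key] = merged.get(key, 0) + count

print(to_flat(merged))
# [14, 3, 16, 7, 18, 9, 153, 21]
```

Sorting at the end means the result does not depend on the dictionary's internal key order.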

score 5 · Accepted Answer

Use collections.Counter:

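import itertools
import collections

def grouper(n, iterable, fillvalue=None):
    # n references to one iterator, so zip_longest yields chunks of n items
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

# lst1 and lst2 are the two input lists from the question
lst1 = [14, 1, 16, 4, 153, 21]
lst2 = [14, 2, 16, 3, 18, 9]

count1 = collections.Counter(dict(grouper(2, lst1)))
count2 = collections.Counter(dict(grouper(2, lst2)))
result = count1 + count2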

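I used the itertools library's grouper recipe here to turn your data into dictionaries, but as the other answers have shown, there are more ways to skin that particular cat.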

result is a Counter in which each id maps to its total count:

Counter({153: 21, 18: 9, 16: 7, 14: 3})

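Counters are multisets and keep track of per-key counts with ease. They feel like a much better data structure for your data. For example, they support summing, as shown above.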
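One detail worth knowing if a count can ever be zero or negative (an assumption; the question's data shows only positive counts): the + operator drops any id whose total is not positive, while update() keeps it. A small sketch with made-up values:

```python
from collections import Counter

a = Counter({14: 1, 16: 4})
b = Counter({14: -1, 18: 9})   # negative count, made up for illustration

# The + operator discards ids whose total is zero or negative.
summed = a + b
print(dict(summed))
# {16: 4, 18: 9}

# update() adds in place and keeps every id, including zero totals.
kept = Counter(a)
kept.update(b)
print(dict(kept))
# {14: 0, 16: 4, 18: 9}
```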

score 5 · Accepted Answer

collections.Counter() is what you need here:

In [21]: lis1=[14, 1, 16, 4, 153, 21]

In [22]: lis2=[14, 2, 16, 3, 18, 9]

In [23]: from collections import Counter

In [24]: dic1=Counter(dict(zip(lis1[0::2],lis1[1::2])))

In [25]: dic2=Counter(dict(zip(lis2[0::2],lis2[1::2])))

In [26]: dic1+dic2
Out[26]: Counter({153: 21, 18: 9, 16: 7, 14: 3})

Or:

In [51]: it1=iter(lis1)

In [52]: it2=iter(lis2)

In [53]: dic1=Counter(dict((next(it1),next(it1)) for _ in range(len(lis1)//2)))

In [54]: dic2=Counter(dict((next(it2),next(it2)) for _ in range(len(lis2)//2)))

In [55]: dic1+dic2
Out[55]: Counter({153: 21, 18: 9, 16: 7, 14: 3})
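Since the question talks about merging several of these arrays, the same zip-slicing conversion extends to any number of lists. A rough sketch; `merge_all` and the third list are hypothetical, not part of the original answer:

```python
from collections import Counter

def merge_all(*flat_lists):
    """Sum counts per id across any number of [id, count, ...] lists."""
    total = Counter()
    for flat in flat_lists:
        # update() with a mapping adds counts instead of replacing them
        total.update(dict(zip(flat[0::2], flat[1::2])))
    return total

lis1 = [14, 1, 16, 4, 153, 21]
lis2 = [14, 2, 16, 3, 18, 9]
lis3 = [16, 1, 200, 5]   # made-up third list for illustration

print(merge_all(lis1, lis2, lis3))
# Counter({153: 21, 18: 9, 16: 8, 200: 5, 14: 3})
```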

score 0 · Accepted Answer

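All of the previous answers look good, but I think the JSON blob should be properly formed from the start; otherwise (in my experience) it can cause serious problems during debugging and so on. In this case, with id and count as fields, the JSON should look like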

[{"id":1, "count":10}, {"id":2, "count":10}, {"id":1, "count":5}, ...]

Properly formed JSON like this is much easier to work with, and it is probably similar to what you had coming in.
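If the data only exists in the flat form from the question, it can be reshaped into such records first. A minimal sketch; `to_records` is a hypothetical helper name:

```python
import json

def to_records(flat):
    """Turn [id, count, id, count, ...] into a list of {"id", "count"} dicts."""
    return [{"id": i, "count": c} for i, c in zip(flat[0::2], flat[1::2])]

flat = [14, 1, 16, 4, 153, 21]
records = to_records(flat)

print(json.dumps(records))
# [{"id": 14, "count": 1}, {"id": 16, "count": 4}, {"id": 153, "count": 21}]
```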

This class is a bit generic, but it is certainly extensible:


from itertools import groupby


class ListOfDicts:
    def __init__(self, listofD=None):
        self.list = []
        if listofD is not None:
            self.list = listofD

    def key_total(self, group_by_key, aggregate_key):
        """Aggregate a list of dicts by a specific key, and aggregation key.

        groupby only groups adjacent records, so self.list must already be
        sorted by group_by_key.
        """
        out_dict = {}
        for k, g in groupby(self.list, key=lambda r: r[group_by_key]):
            print(k)
            total = 0
            for record in g:
                print("   ", record)
                total += record[aggregate_key]
            out_dict[k] = total
        return out_dict


if __name__ == "__main__":
    z = ListOfDicts([{'id': 1, 'count': 2, 'junk': 2},
                     {'id': 1, 'count': 4, 'junk': 2},
                     {'id': 1, 'count': 6, 'junk': 2},
                     {'id': 2, 'count': 2, 'junk': 2},
                     {'id': 2, 'count': 3, 'junk': 2},
                     {'id': 2, 'count': 3, 'junk': 2},
                     {'id': 3, 'count': 10, 'junk': 2},
                     ])

    totals = z.key_total("id", "count")
    print(totals)

This gives (the key order inside each printed dict depends on the Python version):


1
    {'count': 2, 'junk': 2, 'id': 1}
    {'count': 4, 'junk': 2, 'id': 1}
    {'count': 6, 'junk': 2, 'id': 1}
2
    {'count': 2, 'junk': 2, 'id': 2}
    {'count': 3, 'junk': 2, 'id': 2}
    {'count': 3, 'junk': 2, 'id': 2}
3
    {'count': 10, 'junk': 2, 'id': 3}

{1: 12, 2: 8, 3: 10}
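Because itertools.groupby only groups adjacent records, the approach above relies on the list being sorted by the grouping key. When two record lists are each already sorted by id, as the question says its data is, heapq.merge can combine them into one sorted stream without re-sorting. A sketch under that assumption, using the question's values in record form:

```python
from heapq import merge
from itertools import groupby
from operator import itemgetter

recs1 = [{"id": 14, "count": 1}, {"id": 16, "count": 4}, {"id": 153, "count": 21}]
recs2 = [{"id": 14, "count": 2}, {"id": 16, "count": 3}, {"id": 18, "count": 9}]

by_id = itemgetter("id")

# merge() assumes each input is already sorted by the same key
combined = merge(recs1, recs2, key=by_id)

# Records with equal ids are now adjacent, so groupby can total them.
totals = {k: sum(r["count"] for r in g) for k, g in groupby(combined, key=by_id)}

print(totals)
# {14: 3, 16: 7, 18: 9, 153: 21}
```

If the inputs might not be sorted, sorting the concatenated list by id before calling groupby is the safer choice.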
