For working with records, I personally like numpy.recarray.
In [3]: import numpy as np
In [4]: fields = data.keys()
In [8]: recs = zip(*[ lst for k, lst in data.iteritems() ])
In [9]: recs[0]
Out[9]: ('info1', 1, 1)
In [10]: recs[1]
Out[10]: ('info2', 1, 2)
In [21]: ra = np.rec.fromrecords(recs, names = fields )
In [17]: ra
Out[17]:
rec.array([('info1', 1, 1), ('info2', 1, 2), ('info3', 2, 3), ('info4', 2, 4),
       ('info5', 1, 5), ('info6', 2, 6), ('info7', 1, 7), ('info8', 2, 8)],
      dtype=[('info', 'S5'), ('id', '<i8'), ('number', '<i8')])
In [23]: ra[ra.id == 2]
Out[23]:
rec.array([('info3', 2, 3), ('info4', 2, 4), ('info6', 2, 6), ('info8', 2, 8)],
      dtype=[('info', 'S5'), ('id', '<i8'), ('number', '<i8')])
In [24]: ra[ra.id == 2].number
Out[24]: array([3, 4, 6, 8])
In [25]: ra[ra.id == 2][0]
Out[25]: ('info3', 2, 3)
In [26]: ra[ra.id == 2][0].number
Out[26]: 3
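The session above uses Python 2 idioms (iteritems, and indexing the result of zip). Here is a rough Python 3 sketch of the same steps. The `data` dict is my assumption, reconstructed from the output shown above, since the original input is not part of this answer; it relies on dicts keeping insertion order so that the field names line up with the columns.

```python
import numpy as np

# Assumed input (not shown in the original): a dict of equal-length lists,
# reconstructed from the rec.array output above.
data = {
    "info": ["info1", "info2", "info3", "info4",
             "info5", "info6", "info7", "info8"],
    "id": [1, 1, 2, 2, 1, 2, 1, 2],
    "number": [1, 2, 3, 4, 5, 6, 7, 8],
}

# Field names and columns come from the same dict, so their order matches.
fields = list(data.keys())

# Transpose the columns into one tuple per record; in Python 3 zip is lazy,
# so wrap it in list() before handing it to NumPy.
recs = list(zip(*data.values()))

ra = np.rec.fromrecords(recs, names=fields)

# Boolean-mask selection keeps the recarray type, so fields remain attributes.
subset = ra[ra.id == 2]
print(subset.number)     # the "number" column of the matching records
print(subset[0].number)  # the "number" field of the first matching record
```

On Python 3 the string field comes out as a unicode dtype rather than the 'S5' bytes dtype shown above; the filtering works the same way.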
If you want to group the records by id in a dictionary, do this:
{ id: ra[ra.id == id] for id in set(ra.id) }
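For reference, here is a small self-contained variant of that grouping, as a sketch only. The four sample records are a subset of the output above, and using np.unique plus the loop name `key` are my choices (they give sorted keys and avoid shadowing the built-in `id`), not something the one-liner requires.

```python
import numpy as np

# Sample records taken from the output shown earlier (first four rows only).
ra = np.rec.fromrecords(
    [("info1", 1, 1), ("info2", 1, 2), ("info3", 2, 3), ("info4", 2, 4)],
    names=["info", "id", "number"],
)

# One boolean mask per distinct id; int() turns the NumPy scalar into a
# plain Python int so the dict keys print cleanly.
groups = {int(key): ra[ra.id == key] for key in np.unique(ra.id)}

for key, group in groups.items():
    print(key, group.number)
```

Each value is itself a recarray, so `groups[2].info` or `groups[2][0].number` work just like on the full array. Note that this scans the array once per distinct id, which is fine for a handful of ids but not ideal for very many.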