I have a question similar to the one discussed in "Concatenating dictionaries of numpy arrays (avoiding manual loops if possible)".

I am looking for a way to concatenate the values in two Python dictionaries that contain NumPy arrays of arbitrary size, without having to loop over the dictionary keys manually. For example:

import numpy as np

# Create first dictionary
n1 = 3
s = np.random.randint(1, 101, n1)
n2 = 2
r = np.random.rand(n2)
d = {"r": r, "s": s}
print("d = ", d)

# Create second dictionary
n3 = 1
s = np.random.randint(1, 101, n3)
n4 = 3
r = np.random.rand(n4)
d2 = {"r": r, "s": s}
print("d2 = ", d2)

# Some operation to combine the two dictionaries
# (SomeOperation is a placeholder for the function being asked about)
d = SomeOperation(d, d2)

# Updated dictionary
print("d3 = ", d)

to give the output

>> d =  {'s': array([75, 25, 88]), 'r': array([ 0.1021227 ,  0.99454874])}
>> d2 =  {'s': array([78]), 'r': array([ 0.27610587,  0.57037473, 0.59876391])}
>> d3 =  {'s': array([75, 25, 88, 78]), 'r': array([ 0.1021227 ,  0.99454874, 0.27610587,  0.57037473, 0.59876391])}

That is, if a key already exists, the new array should be appended to the NumPy array already stored under that key.

The pandas-based solution proposed in the previous discussion does not work here, because it requires the arrays to have the same length (n1 = n2 and n3 = n4).

Does anybody know the best way to do this, whilst minimising the use of slow, manual for loops? (I would like to avoid loops because the dictionaries I would like to combine could have hundreds of keys).

Thanks (also to "Aim" for formulating a very clear question)!

1 Answer

One way is to use a dictionary of Series (i.e. the values are Series rather than arrays):

In [11]: d2
Out[11]: {'r': array([ 0.3536318 ,  0.29363604,  0.91307454]), 's': array([46])}

In [12]: d2 = {name: pd.Series(arr) for name, arr in d2.items()}

In [13]: d2
Out[13]:
{'r': 0    0.353632
1    0.293636
2    0.913075
dtype: float64,
 's': 0    46
dtype: int64}

Then you can pass it to the DataFrame constructor:

In [14]: pd.DataFrame(d2)
Out[14]:
          r   s
0  0.353632  46
1  0.293636 NaN
2  0.913075 NaN
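
For the concatenation itself, here is a minimal NumPy-only sketch. It is not part of the pandas approach above, and `concat_dicts` is a hypothetical helper name chosen for illustration. It does still iterate over the keys, but the per-key work is a single `np.concatenate` call, so for a few hundred keys the loop overhead is likely to be small. It assumes every value is a 1-D array, and a key that appears in only one dictionary is carried over unchanged.

```python
import numpy as np


def concat_dicts(first, second):
    """Return a new dict whose arrays are those of `first` followed by `second`."""
    merged = {}
    # Visit every key that appears in either dictionary.
    for key in set(first) | set(second):
        # Collect the arrays stored under this key, first dict before second.
        parts = [d[key] for d in (first, second) if key in d]
        merged[key] = np.concatenate(parts)
    return merged


# Fixed values (instead of random ones) so the result is easy to check.
d = {"r": np.array([0.1, 0.9]), "s": np.array([75, 25, 88])}
d2 = {"r": np.array([0.2, 0.5, 0.6]), "s": np.array([78])}
d3 = concat_dicts(d, d2)
```

Because the helper builds a new dictionary, `d` and `d2` are left untouched; assign the result back to `d` if an in-place style update is wanted.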
answered 2013-05-03T14:08:17.660