I have an array like:

foo foot oot
foo foot oot
bar bart art
bar bart art

I have a dict:

{foo: 1, bar: 2, foot: 34, bart: 54, oot: 123}

I'm looking for a way to map or apply the dict over the array to get this output:

1 34 123
1 34 123
2 54 NaN
2 54 NaN

Note that one key is missing from the dict. I was thinking of slicing each column and then doing a list comprehension, but this feels wrong.

2 Answers

Use a list comprehension:

>>> lis = [['foo', 'foot', 'oot'],
...        ['foo', 'foot', 'oot'],
...        ['bar', 'bart', 'art'],
...        ['bar', 'bart', 'art']]
>>> dic = { 'foo' : 1, 'bar' :2, 'foot':34, 'bart':54, 'oot':123}
>>> nan = float('nan')
>>> [[dic.get(y,nan) for y in x] for x in lis]
[[1, 34, 123], [1, 34, 123], [2, 54, nan], [2, 54, nan]]

dict.get(key, default_value): returns the value associated with key if the key is found, otherwise returns default_value.

Python has no built-in NaN literal, which is why float('nan') is used.
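
One thing to keep in mind with this approach: NaN never compares equal to anything, including itself, so the missing entries cannot be found with ==. A minimal sketch of how the default from dict.get behaves (the variable names is_missing and equal_to_itself are only for illustration):

```python
import math

dic = {'foo': 1, 'bar': 2}
nan = float('nan')

# 'art' is not a key, so dict.get returns the default (nan).
value = dic.get('art', nan)

# NaN is not equal to itself, so test for it with math.isnan rather than ==.
is_missing = math.isnan(value)
equal_to_itself = (value == value)

print(is_missing, equal_to_itself)
```

Keys that do exist are unaffected: dic.get('foo', nan) still returns 1.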

Answered 2013-07-01T03:00:49.020

For large arrays, the following numpy-only code may perform better:

import numpy as np

arr = np.array([['foo', 'foot', 'oot'],
                ['foo', 'foot', 'oot'],
                ['bar', 'bart', 'art'],
                ['bar', 'bart', 'art']])

dict_ = {'foo' : 1, 'bar' : 2, 'foot' : 34, 'bart' : 54, 'oot' : 123}

arr_flat = arr.ravel()

# list() is needed on Python 3, where dict.keys() and dict.values() return views
keys = np.array(list(dict_.keys()))
vals = np.array(list(dict_.values()))
sort_idx = np.argsort(keys)
keys = keys[sort_idx]
vals = vals[sort_idx]
vals = np.concatenate((vals, [np.nan]))

unique, indices = np.unique(arr_flat, return_inverse=True)
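locs = np.searchsorted(keys, unique, side='left')
# searchsorted returns len(keys) for entries that sort after the last key,
# so clip the index before comparing to avoid an out-of-bounds error
no_match = unique != keys[np.minimum(locs, len(keys) - 1)]
# Unmatched entries point at the extra NaN slot appended to vals
locs[no_match] = len(keys)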

new_arr = np.take(vals, np.take(locs, indices)).reshape(arr.shape)
# Same as new_arr = vals[locs[indices]].reshape(arr.shape)

>>> arr
array([['foo', 'foot', 'oot'],
       ['foo', 'foot', 'oot'],
       ['bar', 'bart', 'art'],
       ['bar', 'bart', 'art']], 
      dtype='|S4')
>>> new_arr
array([[   1.,   34.,  123.],
       [   1.,   34.,  123.],
       [   2.,   54.,   nan],
       [   2.,   54.,   nan]])
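
The central idea above is to look up each distinct value only once and then expand the result back to the full array using the inverse indices from np.unique. A shorter sketch of the same idea, assuming a plain dict.get lookup over the unique values is acceptable in place of searchsorted (the helper name map_array is hypothetical, not part of numpy):

```python
import numpy as np

def map_array(arr, mapping, default=np.nan):
    """Hypothetical helper: map every element of arr through mapping."""
    # Flatten first so the inverse indices are 1-D on every numpy version.
    unique, inverse = np.unique(arr.ravel(), return_inverse=True)
    # One dict lookup per distinct value; missing keys get the default.
    looked_up = np.array([mapping.get(key, default) for key in unique.tolist()],
                         dtype=float)
    # Expand back to one value per element and restore the original shape.
    return looked_up[inverse].reshape(arr.shape)

arr = np.array([['foo', 'foot', 'oot'],
                ['bar', 'bart', 'art']])
dict_ = {'foo': 1, 'bar': 2, 'foot': 34, 'bart': 54, 'oot': 123}
new_arr = map_array(arr, dict_)
print(new_arr)
```

This still runs a Python-level loop, but only over the distinct values, so it may be a reasonable middle ground when the array is large and contains few distinct strings.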
Answered 2013-07-01T15:21:14.293