pandas - Get a default value from a pandas DataFrame when a key does not exist

Question

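I have a DataFrame with a MultiIndex in which each key is a 2-tuple. Currently, the order of the values in the key matters: df[(k1,k2)] is not the same as df[(k2,k1)]. Also, sometimes (k1,k2) exists in the DataFrame but (k2,k1) does not.

I am trying to average the values of some columns over these two entries. Currently, I am doing this: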

if (k1,k2) in df.index.values and (k2,k1) not in df.index.values:
    x = df[(k1,k2)]
if (k2,k1) in df.index.values and (k1,k2) not in df.index.values:
    x = df[(k2,k1)]
if (k2,k1) in df.index.values and (k1,k2) in df.index.values:
    x = (df[(k2,k1)] + df[(k1,k2)])/2

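This is ugly... I am looking for something like the get-with-default lookup we use on dictionaries (`dict.get(key, default)`)... Is there anything like that in pandas?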

score 1 · Accepted Answer

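`ix` indexing and the `mean` function will handle this for you. Select both tuples with `df.ix` and apply `mean` to the result: keys that do not exist are returned as NaN, and `mean` ignores NaN values by default: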

In [102]: df
Out[102]: 
   (26, 22)  (10, 48)  (48, 42)  (48, 10)  (42, 48)
a       311       NaN       724       879        42

In [103]: df.ix[:,[(10, 48), (48, 10)]].mean(axis=1)
Out[103]: 
a    879
dtype: float64

In [104]: df.ix[:,[(42, 48), (48, 42)]].mean(axis=1)
Out[104]: 
a    383
dtype: float64

In [105]: df.ix[:,[(26, 22), (22, 26)]].mean(axis=1)
Out[105]: 
a    311
dtype: float64
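
Note that `.ix` was deprecated in pandas 0.20 and removed in 1.0, so the session above only runs on older versions. Below is a minimal sketch of the same idea for current pandas, assuming the tuple keys are unique. It swaps `.ix` for `reindex`, which likewise fills labels that are absent with NaN. The `pair_mean` helper and the `scores` frame are hypothetical names made up for illustration; `scores` puts the tuples on the row MultiIndex, as in the question.

```python
import numpy as np
import pandas as pd

# Same data as the session above: one row "a", tuple-labelled columns.
cols = pd.MultiIndex.from_tuples(
    [(26, 22), (10, 48), (48, 42), (48, 10), (42, 48)]
)
df = pd.DataFrame([[311, np.nan, 724, 879, 42]], index=["a"], columns=cols)

# reindex returns NaN for labels that are missing; mean skips NaN by default.
both_present = df.reindex(columns=[(42, 48), (48, 42)]).mean(axis=1)
one_missing = df.reindex(columns=[(26, 22), (22, 26)]).mean(axis=1)


def pair_mean(frame, k1, k2):
    """Column-wise mean of rows (k1, k2) and (k2, k1).

    Hypothetical helper: a row that is absent contributes NaN and is
    skipped by mean; if both rows are absent the result is all NaN.
    """
    return frame.reindex([(k1, k2), (k2, k1)]).mean()


# Assumed example data with the tuples on the row MultiIndex.
rows = pd.MultiIndex.from_tuples([(10, 48), (48, 42), (42, 48)])
scores = pd.DataFrame({"x": [1.0, 2.0, 4.0]}, index=rows)

only_one = pair_mean(scores, 48, 10)   # only (10, 48) exists
both = pair_mean(scores, 42, 48)       # both orderings exist
neither = pair_mean(scores, 1, 2)      # neither ordering exists
```

If neither ordering exists, the result is NaN rather than an error, which can then be replaced with an explicit default via `fillna` if one is needed.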
