I am trying to merge two pandas DataFrames that have no index set:
In [127]: df1
Out[127]:
   value1        date id    value2    group
0 -0.2284  2012-04-01  a -0.067469  group d
1 -0.4875  2012-04-01  b -0.021274  group d
2  0.1139  2012-04-01  c -0.015978  group d
3  0.3191  2012-04-01  d  0.022634  group d
4 -0.0077  2012-04-01  e  0.000000  group d
In [128]: df2
Out[128]:
             date id       value2    group
23044  2012-04-01  a  -0.06701001  group c
23045  2012-04-01  b     -0.02128  group c
23046  2012-04-01  c            0  group c
23047  2012-04-01  d            0  group c
23048  2012-04-01  e            0  group c
In [129]: pd.merge(df1, df2, how = 'outer', on = ['date', 'id', 'value2', 'group'])
Out[129]:
   value1        date id    value2    group
0 -0.2284  2012-04-01  a -0.067469  group d
1 -0.4875  2012-04-01  b -0.021274  group d
2  0.1139  2012-04-01  c -0.015978  group d
3  0.3191  2012-04-01  d  0.022634  group d
4 -0.0077  2012-04-01  e  0.000000  group d
5     NaN  2012-04-01  a -0.067010  group c
6     NaN  2012-04-01  b -0.021280  group c
7     NaN  2012-04-01  c  0.000000  group c
8     NaN  2012-04-01  d  0.000000  group c
9     NaN  2012-04-01  e  0.000000  group c
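This is almost the desired output, except that I want the NaN entries of value1 in the group c rows to be filled with the value1 from the group d rows that have the same date and id. What is the correct way to achieve this?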
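
For reference, here is a minimal sketch of the result I am after, not necessarily the idiomatic way. It assumes each (date, id) pair appears at most once in df1, rebuilds the two frames from the output above, and uses the hypothetical names lookup, df2_filled and result. The idea is to attach value1 to df2 with a left merge on date and id, then stack the frames with pd.concat instead of an outer merge.

```python
import pandas as pd

# Rebuild the two frames shown above (values copied from the printed output).
df1 = pd.DataFrame({
    "value1": [-0.2284, -0.4875, 0.1139, 0.3191, -0.0077],
    "date": pd.to_datetime(["2012-04-01"] * 5),
    "id": list("abcde"),
    "value2": [-0.067469, -0.021274, -0.015978, 0.022634, 0.0],
    "group": ["group d"] * 5,
})
df2 = pd.DataFrame({
    "date": pd.to_datetime(["2012-04-01"] * 5),
    "id": list("abcde"),
    "value2": [-0.06701001, -0.02128, 0.0, 0.0, 0.0],
    "group": ["group c"] * 5,
}, index=range(23044, 23049))

# Assumption: one value1 per (date, id) in df1; drop_duplicates guards against repeats.
lookup = df1[["date", "id", "value1"]].drop_duplicates(["date", "id"])

# A left merge keeps every df2 row and adds the matching value1 from df1.
df2_filled = df2.merge(lookup, on=["date", "id"], how="left")

# Stack both frames and restore df1's column order.
result = pd.concat([df1, df2_filled], ignore_index=True)[df1.columns]
print(result)
```

With this, the group c rows carry the same value1 as the group d rows that share their date and id. A df2 row with no match in df1 would still end up with NaN.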