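I am trying to join two dataframes in pandas with the following behavior: I want to join on a specified column, but without redundant columns being added to the resulting dataframe. This is analogous to combine_first, except that combine_first does not seem to take an optional argument naming the column to match on. Example: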
# combine df1 and df2 based on "id" column
df1 = pandas.merge(df1, df2, how="outer", on=["id"])
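The problem with the above is that columns common to df1 and df2, other than "id", are added twice to df1 (with _x and _y suffixes). How can I do something like: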
# Do outer join from df2 to df1, matching items by "id" but not adding
# columns that are redundant (df1 takes precedence if the values disagree)
df1.combine_first(df2, on=["id"])
How can this be done?
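
To make the desired behavior concrete, here is a minimal sketch of one possible approach: since combine_first aligns on the index, moving "id" into the index of both frames before calling it should give the outer-join-without-duplicates result. The sample data and the column names "val", "a", and "b" are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: "id" is the join key, "val" exists in both frames.
df1 = pd.DataFrame({"id": [1, 2], "val": [10.0, 11.0], "a": ["x", "y"]})
df2 = pd.DataFrame({"id": [2, 3], "val": [20.0, 30.0], "b": ["p", "q"]})

# A plain outer merge duplicates the shared column as val_x / val_y.
merged = pd.merge(df1, df2, how="outer", on=["id"])

# Move "id" into the index so combine_first aligns rows on it,
# then restore "id" as an ordinary column.
combined = (
    df1.set_index("id")
    .combine_first(df2.set_index("id"))
    .reset_index()
)

print(merged.columns.tolist())
print(combined)
```

In this sketch the row with id 2 keeps df1's value for "val", and the row with id 3 comes entirely from df2. Note that combine_first only falls back to df2 where df1 is missing or null, so a null in df1 would be filled from df2 rather than preserved.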