python - Check whether a multi-index exists in both DataFrames

Question

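I have two DataFrames that both contain State and RegionName columns. I'm trying to check which rows of df1 have a matching State and RegionName in df2, and add the result as a column in df3.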

df1=
+--------------+------------+------+
|    State     | RegionName | Data |
+--------------+------------+------+
| New York     | New York   | 123  |
| Jacksonville | Florida    | ABC  |
+--------------+------------+------+
df2=
+--------------+------------+------+
|    State     | RegionName | Data |
+--------------+------------+------+
| New York     | New York   | 456  |
+--------------+------------+------+

Output would be df3=
+--------------+------------+------+-------+
|    State     | RegionName | Data | IsIn2 |
+--------------+------------+------+-------+
| New York     | New York   | 123  |     1 |
| Jacksonville | Florida    | ABC  |     0 |
+--------------+------------+------+-------+

score 0 · Accepted Answer

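Method 1

Use DataFrame.stack and Series.isin:

cols = ['State', 'RegionName']
# Series.all(level=0) was removed in pandas 2.0; groupby(level=0).all() is the equivalent.
# Note: this tests each cell against all values in df2[cols], not (State, RegionName) pairs.
df1['IsIn2'] = df1[cols].stack().isin(df2[cols].stack()).groupby(level=0).all().astype(int)
print(df1)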
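
Because the stacked check looks up each cell on its own, it can flag a row whose State and RegionName both occur somewhere in df2 but never together. Here is a small sketch of that difference. The frames `left` and `right` are hypothetical and chosen only to expose the case; they are not the question's data.

```python
import pandas as pd

cols = ["State", "RegionName"]

# Hypothetical frames, made up for illustration only.
left = pd.DataFrame({"State": ["New York", "New York"],
                     "RegionName": ["New York", "Florida"]})
right = pd.DataFrame({"State": ["New York", "Jacksonville"],
                      "RegionName": ["New York", "Florida"]})

# Cell-wise check: each value is looked up in the pool of all values in `right`.
cell_hits = left[cols].stack().isin(right[cols].stack())
cellwise = cell_hits.groupby(level=0).all().astype(int)

# Pair-wise check: each (State, RegionName) row is compared as a whole tuple.
pairwise = (left[cols].apply(tuple, axis=1)
            .isin(right[cols].apply(tuple, axis=1))
            .astype(int))

print(pd.DataFrame({"cellwise": cellwise, "pairwise": pairwise}))
```

The second row, ("New York", "Florida"), is flagged by the cell-wise check even though that pair does not appear in `right`. If you need exact pair matching, a pair-wise approach is probably the safer choice.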

Method 2

Use DataFrame.merge with indicator, then turn 'both' into 1 and everything else into 0 using Series.eq and Series.astype:

df3 = (df1.merge(df2[cols],on=cols, how='left',indicator='IsIn2')
          .assign(IsIn2=lambda x: x['IsIn2'].eq('both').astype(int)))
print(df3)

Output

          State RegionName Data  IsIn2
0      New York   New York  123      1
1  Jacksonville    Florida  ABC      0
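
One thing to watch with the merge approach: if df2 contains the same (State, RegionName) pair more than once, a left merge repeats the matching df1 row. Below is a minimal sketch that guards against this by de-duplicating df2's key columns first. The second df2 row (Data "789") is an assumption added to trigger the issue; it is not part of the question.

```python
import pandas as pd

df1 = pd.DataFrame({
    "State": ["New York", "Jacksonville"],
    "RegionName": ["New York", "Florida"],
    "Data": ["123", "ABC"],
})
# Hypothetical df2 with a repeated key pair (the "789" row is made up).
df2 = pd.DataFrame({
    "State": ["New York", "New York"],
    "RegionName": ["New York", "New York"],
    "Data": ["456", "789"],
})
cols = ["State", "RegionName"]

# Keep one row per key pair so the left merge cannot multiply df1 rows.
keys = df2[cols].drop_duplicates()

df3 = (
    df1.merge(keys, on=cols, how="left", indicator="IsIn2")
       # The indicator column holds 'both' or 'left_only'; map it to 1/0.
       .assign(IsIn2=lambda x: x["IsIn2"].eq("both").astype(int))
)
print(df3)
```

Without the `drop_duplicates` call, df3 would have three rows here instead of two.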

score 0 · Accepted Answer

Let's do this:

df1['IsIn2']=df1[['State','RegionName']].apply(tuple, axis=1).\
                  isin(df2[['State','RegionName']].apply(tuple, axis=1)).\
                  astype(int)
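
A possible variation on the same idea, not from the original answer: build a MultiIndex from the key columns and use MultiIndex.isin, which also compares whole (State, RegionName) pairs and avoids the row-wise apply. This is a sketch using the question's sample data.

```python
import pandas as pd

df1 = pd.DataFrame({
    "State": ["New York", "Jacksonville"],
    "RegionName": ["New York", "Florida"],
    "Data": ["123", "ABC"],
})
df2 = pd.DataFrame({
    "State": ["New York"],
    "RegionName": ["New York"],
    "Data": ["456"],
})
cols = ["State", "RegionName"]

# Each row's key columns become one entry (a tuple) in a MultiIndex.
idx1 = pd.MultiIndex.from_frame(df1[cols])
idx2 = pd.MultiIndex.from_frame(df2[cols])

# MultiIndex.isin returns a boolean NumPy array in df1's row order.
df1["IsIn2"] = idx1.isin(idx2).astype(int)
print(df1)
```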
