
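I have two DataFrames that both contain State and RegionName columns. I want to check whether each row of df1 also appears in df2 (matching on those two columns) and store the result in a new column, producing df3.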

df1=
+--------------+------------+------+
|    State     | RegionName | Data |
+--------------+------------+------+
| New York     | New York   | 123  |
| Jacksonville | Florida    | ABC  |
+--------------+------------+------+
df2=
+--------------+------------+------+
|    State     | RegionName | Data |
+--------------+------------+------+
| New York     | New York   | 456  |
+--------------+------------+------+

Output would be df3=
+--------------+------------+------+-------+
|    State     | RegionName | Data | IsIn2 |
+--------------+------------+------+-------+
| New York     | New York   | 123  |     1 |
| Jacksonville | Florida    | ABC  |     0 |
+--------------+------------+------+-------+

2 Answers

0

Method 1

Use DataFrame.stack with Series.isin:

cols=['State', 'RegionName']
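# The level argument of Series.all was removed in pandas 2.0; group by the row index instead
df1['IsIn2'] = df1[cols].stack().isin(df2[cols].stack()).groupby(level=0).all().astype(int)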
print(df1)
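
One caveat with Method 1 is worth checking before relying on it: stacking compares individual values, not (State, RegionName) pairs, so a row can be flagged when its two values occur in different rows of df2. The sketch below uses made-up data (not the frames from the question) to show this, and compares it with a row-wise check built on MultiIndex.isin, which is a swapped-in alternative rather than the method from this answer.

```python
import pandas as pd

cols = ["State", "RegionName"]

# Illustrative data only: the second row of df1 is NOT a row of df2,
# but each of its values appears somewhere in df2.
df1 = pd.DataFrame({"State": ["New York", "New York"],
                    "RegionName": ["New York", "Florida"]})
df2 = pd.DataFrame({"State": ["New York", "Miami"],
                    "RegionName": ["New York", "Florida"]})

# Element-wise check (Method 1): every value is looked up independently.
elementwise = (df1[cols].stack()
                        .isin(df2[cols].stack())
                        .groupby(level=0).all()
                        .astype(int))

# Row-wise check: compare whole (State, RegionName) pairs.
rowwise = (pd.MultiIndex.from_frame(df1[cols])
             .isin(pd.MultiIndex.from_frame(df2[cols]))
             .astype(int))

print(elementwise.tolist())  # second row is flagged although the pair is absent
print(rowwise.tolist())      # second row is not flagged
```

If the key columns can share values across rows, the row-wise comparison is probably the safer choice.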

Method 2

Use DataFrame.merge with the indicator parameter, then map 'both' to 1 and every other value to 0 using Series.eq and Series.astype:

# drop_duplicates() keeps repeated keys in df2 from duplicating rows of df1
df3 = (df1.merge(df2[cols].drop_duplicates(), on=cols, how='left', indicator='IsIn2')
          .assign(IsIn2=lambda x: x['IsIn2'].eq('both').astype(int)))
print(df3)

Output

          State RegionName Data  IsIn2
0      New York   New York  123      1
1  Jacksonville    Florida  ABC      0
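
For anyone who wants to run Method 2 end to end, here is a minimal self-contained sketch. It rebuilds the frames from the question (storing Data as strings is an assumption) and uses the default `_merge` indicator column, which is dropped afterwards. Treat it as one possible arrangement, not the only one.

```python
import pandas as pd

# Frames rebuilt from the question; Data is assumed to hold strings.
df1 = pd.DataFrame({"State": ["New York", "Jacksonville"],
                    "RegionName": ["New York", "Florida"],
                    "Data": ["123", "ABC"]})
df2 = pd.DataFrame({"State": ["New York"],
                    "RegionName": ["New York"],
                    "Data": ["456"]})

cols = ["State", "RegionName"]

# A left merge keeps every row of df1; indicator=True adds a "_merge" column
# whose value is "both" when the key pair was also found in df2.
merged = df1.merge(df2[cols].drop_duplicates(), on=cols,
                   how="left", indicator=True)

# Turn the indicator into a 0/1 flag and drop the helper column.
df3 = (merged.assign(IsIn2=merged["_merge"].eq("both").astype(int))
             .drop(columns="_merge"))
print(df3)
```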
Answered 2020-03-23T00:23:26.780
0

Let's do:

df1['IsIn2']=df1[['State','RegionName']].apply(tuple, axis=1).\
                  isin(df2[['State','RegionName']].apply(tuple, axis=1)).\
                  astype(int)
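
The same idea can be sketched with a plain Python set of key tuples, which avoids calling apply row by row. This is a variant for illustration, not the code from this answer; the frames are rebuilt from the question and `keys_in_df2` is a hypothetical helper name.

```python
import pandas as pd

# Frames rebuilt from the question; Data is assumed to hold strings.
df1 = pd.DataFrame({"State": ["New York", "Jacksonville"],
                    "RegionName": ["New York", "Florida"],
                    "Data": ["123", "ABC"]})
df2 = pd.DataFrame({"State": ["New York"],
                    "RegionName": ["New York"],
                    "Data": ["456"]})

cols = ["State", "RegionName"]

# Collect every (State, RegionName) pair of df2 as a plain tuple.
keys_in_df2 = set(df2[cols].itertuples(index=False, name=None))

# Flag each row of df1 whose key pair is in that set.
df1["IsIn2"] = [int(key in keys_in_df2)
                for key in df1[cols].itertuples(index=False, name=None)]
print(df1)
```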
Answered 2020-03-22T23:55:44.290