
1.csv

         cut  price  depth  carat  table
0       Good    327   57.9   0.23   65.0
1       Good    335   63.3   0.31   58.0
2  Very Good    336   62.8   0.24   57.0
3  Very Good    336   62.3   0.24   57.0
4  Very Good    337   61.9   0.26   55.0
5    Premium    326   59.8   0.21   61.0
6    Premium    334   62.4   0.29   58.0
7       Good    400   64.0   0.30   55.0

2.csv

         cut  price  depth  carat  table
0       Good    327   57.9   0.23   65.0
1       Good    335   63.3   0.31   58.0
2  Very Good    336   62.8   0.24   57.0
3  Very Good    336   62.3   0.24   57.0
4  Very Good    337   61.9   0.26   50.0
5    Premium    326   59.8   0.21   61.0
6    Premium    334   60.4   0.29   58.0
7       Good    399   64.0   0.30   55.0

In 2.csv, only rows 4, 6 and 7 have been changed.

I would like to get output like this:

         cut  price  depth  carat  table
4  Very Good    337   61.9   0.26   50.0
6    Premium    334   60.4   0.29   58.0
7       Good    399   64.0   0.30   55.0

Can anyone share their experience? Any kind of help is welcome.

import pandas as pd
f1 = pd.read_csv('1.csv')
f2 = pd.read_csv('2.csv')
columns_list = ['cut', 'price', 'depth', 'carat', 'table']

new_df= f2[~f2.price.isin(f1.price)]
print(new_df)

This is the sample code I wrote. It works fine, but I need to use

f2[~f2.price.isin(f1.price)]

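in a loop, so that each column name in turn takes the place of `price` and the matching rows are returned for that column as well. I tried it the obvious way, like this: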

for i in columns_list:
    price = f2[~f2.i.isin(f1.i)]
    print(price)

But pandas cannot be used like this; it returns an error:

AttributeError: 'DataFrame' object has no attribute 'i'
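
The error arises because `f2.i` looks for a column literally named `i`; attribute access does not substitute the loop variable. Bracket indexing (`f2[i]`) does. The following is a minimal sketch of the loop with that change, assuming small inline frames in place of 1.csv and 2.csv; the names `per_column`, `changed_mask` and `changed` are illustrative only. Note that `isin` tests membership in the whole column rather than comparing row by row, so it could miss a change whose new value already appears elsewhere in the same column of `f1`.

```python
import pandas as pd

# Inline stand-ins for 1.csv and 2.csv (assumption: same values as the tables above).
f1 = pd.DataFrame({
    "cut": ["Good", "Good", "Very Good", "Very Good",
            "Very Good", "Premium", "Premium", "Good"],
    "price": [327, 335, 336, 336, 337, 326, 334, 400],
    "depth": [57.9, 63.3, 62.8, 62.3, 61.9, 59.8, 62.4, 64.0],
    "carat": [0.23, 0.31, 0.24, 0.24, 0.26, 0.21, 0.29, 0.30],
    "table": [65.0, 58.0, 57.0, 57.0, 55.0, 61.0, 58.0, 55.0],
})
f2 = f1.copy()
f2.loc[4, "table"] = 50.0   # the three edits present in 2.csv
f2.loc[6, "depth"] = 60.4
f2.loc[7, "price"] = 399

columns_list = ["cut", "price", "depth", "carat", "table"]

# f2[i] selects the column whose name is stored in i.
per_column = {}
for i in columns_list:
    per_column[i] = f2[~f2[i].isin(f1[i])]
    print(i, per_column[i].index.tolist())

# Combine the per-column masks: keep rows flagged by at least one column.
changed_mask = pd.Series(False, index=f2.index)
for i in columns_list:
    changed_mask |= ~f2[i].isin(f1[i])
changed = f2[changed_mask]
print(changed)
```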

Thanks for reading; I hope the question is clear.

1 Answer

IIUC, use DataFrame.merge with indicator=True:

f2_filtered = (f2.merge(f1, how='outer', indicator=True)
                 .query('_merge == "left_only"')
                 .drop(columns = '_merge'))
print(f2_filtered)

Output

         cut  price  depth  carat  table
4  Very_Good    337   61.9   0.26   50.0
6    Premium    334   60.4   0.29   58.0
7       Good    399   64.0   0.30   55.0
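
For anyone wanting to try this without the CSV files, here is a self-contained sketch of the same merge idea. It rests on a few assumptions: the frames are built inline from the values in the question, `how='left'` is used instead of `how='outer'` so the merged rows stay in `f2` order, and `f1` is de-duplicated first so the merge returns exactly one row per row of `f2`. The `_merge` flags are then applied to `f2` by position, which keeps the original index labels of `f2`. The names `merged` and `mask` are illustrative only.

```python
import pandas as pd

# Inline stand-ins for 1.csv and 2.csv (assumption: values copied from the question).
f1 = pd.DataFrame({
    "cut": ["Good", "Good", "Very Good", "Very Good",
            "Very Good", "Premium", "Premium", "Good"],
    "price": [327, 335, 336, 336, 337, 326, 334, 400],
    "depth": [57.9, 63.3, 62.8, 62.3, 61.9, 59.8, 62.4, 64.0],
    "carat": [0.23, 0.31, 0.24, 0.24, 0.26, 0.21, 0.29, 0.30],
    "table": [65.0, 58.0, 57.0, 57.0, 55.0, 61.0, 58.0, 55.0],
})
f2 = f1.copy()
f2.loc[4, "table"] = 50.0
f2.loc[6, "depth"] = 60.4
f2.loc[7, "price"] = 399

# With no `on=` argument, merge joins on all columns the frames share,
# so a row of f2 only matches when every value is identical in f1.
merged = f2.merge(f1.drop_duplicates(), how="left", indicator=True)

# "left_only" marks rows of f2 that have no identical row in f1.
mask = (merged["_merge"] == "left_only").to_numpy()
f2_filtered = f2[mask]
print(f2_filtered)
```

Because the comparison is on whole rows, a change is detected even when the new value already occurs somewhere else in the same column.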
Answered 2020-04-01T16:35:07.473