Reading and appending Excel files to build a DataFrame:
import os
import pandas as pd

folder = r'C:\mypathtodocuments'
files = os.listdir(folder)

# DataFrame.append was removed in pandas 2.0, so collect the frames and concatenate once
frames = []
for file in files:
    if file.endswith('.xlsx'):
        frames.append(pd.read_excel(os.path.join(folder, file)))
df = pd.concat(frames)

# Drop extra columns from wrong data
df1 = df[['FIRST_NM', 'LAST_NM', 'CITY_AD']]
Preview of the CITY_AD column:
>>> df1["CITY_AD"]
0 EL PASO
1 HOUSTON
2 HOUSTON
3 CONROE
4 MCKINNEY
5 MCKINNEY
6 KATY
7 TOMBALL
8 TOMBALL
9 SPRING
10 SPRING
Filtering the DataFrame with isin() to include only the cities HOUSTON and CONROE:
df1[df1["CITY_AD"].isin(["HOUSTON","CONROE"])]
This returns an empty set... How can I get it to filter correctly?
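One possible cause, offered as a guess rather than a diagnosis: isin() matches strings exactly, so values read from Excel that carry leading or trailing whitespace (for example "HOUSTON " instead of "HOUSTON") will never match, even though they look identical in the printed preview. The sketch below uses made-up sample data, not the original files, to show how to expose hidden whitespace and normalize the column before filtering.

```python
import pandas as pd

# Hypothetical sample data: the trailing spaces mimic what an Excel export might contain.
df1 = pd.DataFrame({"CITY_AD": ["EL PASO ", "HOUSTON ", "HOUSTON ", "CONROE ", "KATY "]})

# Printing the unique values as a list shows quotes, which exposes hidden whitespace.
print(df1["CITY_AD"].unique().tolist())

# Exact matching finds nothing because "HOUSTON " != "HOUSTON".
unfiltered = df1[df1["CITY_AD"].isin(["HOUSTON", "CONROE"])]

# Strip whitespace and normalize case before matching.
cities = df1["CITY_AD"].astype(str).str.strip().str.upper()
filtered = df1[cities.isin(["HOUSTON", "CONROE"])]
print(filtered)
```

If the list printed by unique() shows no stray whitespace in the real data, the cause lies elsewhere (for example, non-breaking spaces or a different column dtype), and this normalization will not help.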