Reading and appending Excel files to create a DataFrame:

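import os

import pandas as pd

folder = r'C:\mypathtodocuments'
files = os.listdir(folder)

# Read every .xlsx file in the folder into its own DataFrame
frames = []
for file in files:
    if file.endswith('.xlsx'):
        frames.append(pd.read_excel(os.path.join(folder, file)))

# DataFrame.append was removed in pandas 2.0; pd.concat combines the frames instead
df = pd.concat(frames)

# Keep only the needed columns, dropping the extra ones that come from bad data
df1 = df[['FIRST_NM', 'LAST_NM', 'CITY_AD']]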

Preview of the CITY_AD column:

>>> df1["CITY_AD"]

0      EL PASO
1      HOUSTON
2      HOUSTON
3      CONROE
4      MCKINNEY
5      MCKINNEY
6      KATY
7      TOMBALL
8      TOMBALL
9      SPRING
10     SPRING

Filtering the DataFrame with isin() to keep only the cities HOUSTON and CONROE:

df1[df1["CITY_AD"].isin(["HOUSTON","CONROE"])]

This returns an empty result. How can I get it to filter correctly?

1 Answer

Try this, which strips leading and trailing whitespace from the city values before filtering:

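# Remove leading/trailing whitespace so the values can match exactly
df1["CITY_AD"] = df1["CITY_AD"].str.strip()
# Keep only the rows whose city is HOUSTON or CONROE
df1[df1["CITY_AD"].isin(["HOUSTON","CONROE"])]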
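
Here is a minimal sketch of why stripping helps, assuming the cause is stray whitespace around the city names (the question does not confirm this). The sample rows and the names `cities`, `before`, `raw_values`, `cleaned` and `after` are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data: some city names carry stray spaces,
# as can happen with values read from Excel cells.
df1 = pd.DataFrame({
    "FIRST_NM": ["ANA", "BO", "CY", "DEE"],
    "LAST_NM": ["A", "B", "C", "D"],
    "CITY_AD": ["EL PASO ", "HOUSTON ", " CONROE", "KATY"],
})

cities = ["HOUSTON", "CONROE"]

# isin() compares whole strings, so "HOUSTON " does not match "HOUSTON"
before = df1[df1["CITY_AD"].isin(cities)]

# Listing the raw values shows the quotes, which makes the padding visible
raw_values = df1["CITY_AD"].unique().tolist()

# Strip into a new DataFrame instead of assigning into a possible slice
cleaned = df1.assign(CITY_AD=df1["CITY_AD"].str.strip())
after = cleaned[cleaned["CITY_AD"].isin(cities)]

print(len(before))
print(raw_values)
print(after)
```

Using `assign` returns a new DataFrame, which avoids the SettingWithCopyWarning that pandas can raise when a column is assigned on a DataFrame that was itself selected from another one, as `df1` is in the question.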
Answered 2021-11-15T21:24:44.153