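I am reading the following Excel sheet into a DataFrame.
I want to split it into three DataFrames, one per product. The tables are always separated by a single blank column, but each table can have a different number of columns.
Based on the article mentioned in the comments, this can be handled as follows.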
import pandas as pd

#### Read the Excel file into a DataFrame (no header row, no index column)
df = pd.read_excel('test.xlsx', index_col=None, header=None)

#### List the completely empty columns, which separate the tables
empcols = [col for col in df.columns if df[col].isnull().all()]
df = df.fillna('')

#### Split the column positions into runs of consecutive data columns
allcols = list(range(len(df.columns)))
start = 0
colslist = []
for sepcol in empcols:
    colslist.append(allcols[start:sepcol])
    start = sepcol + 1
colslist.append(allcols[start:])

#### Extract each run of columns and store it in a dictionary keyed by table title
dfdic = {}
for cols in colslist:
    wkdf = df.iloc[:, cols].copy()           # copy so the edits below do not touch df
    title = ''.join(map(str, wkdf.iloc[0]))  # row 0 holds the table title
    wkcols = wkdf.iloc[1].tolist()           # row 1 holds the column names
    wkdf = wkdf.drop(wkdf.index[[0, 1]])
    wkdf.columns = wkcols
    dfdic[title] = wkdf.reset_index(drop=True)

#### Display each DataFrame stored in the dictionary
for k, v in dfdic.items():
    print(k)
    print(v)
    print()
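
For anyone who wants to try the idea without the Excel file, here is a minimal sketch of the same splitting logic wrapped in a function and run on an in-memory frame. The function name `split_on_blank_columns` and the sample data are hypothetical, made up for illustration. The sketch assumes the layout described in the question: row 0 holds the table title, row 1 holds the column names, and an all-empty column separates the tables. Unlike the code above, it leaves the NaN values in place and groups columns with a cumulative sum rather than slicing by separator position.

```python
import numpy as np
import pandas as pd


def split_on_blank_columns(raw):
    """Split a header-less frame into sub-frames at every all-NaN column."""
    blank = raw.isna().all(axis=0).to_numpy()  # True where a column is entirely empty
    group_id = np.cumsum(blank)                # increases by one at each separator column
    tables = {}
    for gid in np.unique(group_id[~blank]):
        # Positions of the data columns that belong to this group
        positions = np.flatnonzero((group_id == gid) & ~blank)
        block = raw.iloc[:, positions]
        # Assumed layout: row 0 = table title, row 1 = column names, rest = data
        title = "".join(str(v) for v in block.iloc[0].dropna())
        body = block.iloc[2:].reset_index(drop=True)
        body.columns = block.iloc[1].tolist()
        tables[title] = body
    return tables


# Hypothetical sample: two tables of different widths with one blank column between them
sample = pd.DataFrame([
    ["Product A", np.nan, np.nan, "Product B", np.nan, np.nan],
    ["id", "price", np.nan, "id", "price", "qty"],
    [1, 10, np.nan, 7, 30, 2],
    [2, 20, np.nan, 8, 40, 5],
])

tables = split_on_blank_columns(sample)
for name, table in tables.items():
    print(name)
    print(table)
    print()
```

With a real workbook, the frame returned by `pd.read_excel('test.xlsx', header=None)` could be passed in place of `sample`. Because groups are built only from non-empty columns, two adjacent blank columns would not produce an empty table, although the question states that there is always exactly one.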