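From my research into the standard/popular libraries, as of 2020 this has not been implemented for xlsx / xls, but you can do it for xlsb. Either way, these solutions should give you a vast performance improvement for xls, xlsx, and xlsb.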
The benchmarks below were run on ~10 MB xlsx and xlsb files.
xlsx, xls
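from openpyxl import load_workbook

def get_sheetnames_xlsx(filepath):
    # read_only=True avoids loading the whole workbook into memory;
    # keep_links=False skips links to external workbooks.
    wb = load_workbook(filepath, read_only=True, keep_links=False)
    return wb.sheetnames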
Benchmark: ~14x speedup
# get_sheetnames_xlsx vs pd.read_excel
225 ms ± 6.21 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
3.25 s ± 140 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
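The timings above look like IPython `%timeit` output. Outside IPython, a comparable measurement could be sketched with the standard library's `timeit` module. The helper name `time_call` and the file path in the usage comment are assumptions for illustration, not part of the original benchmark, and the standard deviation here is the sample standard deviation, which may differ slightly from what `%timeit` reports.

```python
import statistics
import timeit


def time_call(func, *args, repeat=7, number=1):
    """Return (mean, stdev) in seconds per call over `repeat` runs.

    Hypothetical helper: each run calls `func(*args)` `number` times.
    """
    timer = timeit.Timer(lambda: func(*args))
    # timer.repeat returns the total time of each run; divide by `number`
    # to get the time per call.
    per_call = [t / number for t in timer.repeat(repeat=repeat, number=number)]
    return statistics.mean(per_call), statistics.stdev(per_call)


# Hypothetical usage; "workbook.xlsx" is a placeholder path:
# mean, std = time_call(get_sheetnames_xlsx, "workbook.xlsx")
# print(f"{mean * 1e3:.1f} ms ± {std * 1e3:.1f} ms per call")
```

Running the same helper against `pd.read_excel` on the same file would give the second number of the comparison.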
xlsb
from pyxlsb import open_workbook

def get_sheetnames_xlsb(filepath):
    # wb.sheets holds the sheet names
    with open_workbook(filepath) as wb:
        return wb.sheets
Benchmark: ~56x speedup
# get_sheetnames_xlsb vs pd.read_excel
96.4 ms ± 1.61 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
5.36 s ± 162 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
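If you need to handle both formats in one place, the two functions above could be combined into a single dispatcher keyed on the file extension. This is a minimal sketch: the name `get_sheetnames` and the choice to raise `ValueError` for other extensions (including xls, which neither snippet above covers directly) are assumptions. The libraries are imported lazily so that only the one actually needed has to be installed.

```python
from pathlib import Path


def get_sheetnames(filepath):
    """Return the sheet names of an .xlsx or .xlsb file (hypothetical helper)."""
    suffix = Path(filepath).suffix.lower()

    if suffix == ".xlsx":
        from openpyxl import load_workbook  # imported only when needed

        wb = load_workbook(filepath, read_only=True, keep_links=False)
        try:
            return wb.sheetnames
        finally:
            # Read-only workbooks keep the file open until closed.
            wb.close()

    if suffix == ".xlsb":
        from pyxlsb import open_workbook  # imported only when needed

        with open_workbook(filepath) as wb:
            return list(wb.sheets)

    # Assumption: anything else is rejected rather than guessed at.
    raise ValueError(f"Unsupported file extension: {suffix!r}")
```

Dispatching on the extension keeps the fast path for each format and fails loudly for anything else, rather than silently falling back to a full `pd.read_excel` load.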
Notes: