python - How do I get only the filename without the extension?

Question

Imagine you have the following file paths and want to get the filenames without their extensions:

                       relfilepath
0                  20210322636.pdf
12              factuur-f23622.pdf
14                ingram micro.pdf
19    upfront.nl domein - Copy.pdf
21           upfront.nl domein.pdf
Name: relfilepath, dtype: object

I came up with the following, but it gives me a problem: the first item gets turned into a number and comes out as "20210322636.0".

from pathlib import Path


for i, row in dffinalselection.iterrows():
    dffinalselection['xmlfilename'][i] = Path(dffinalselection['relfilepath'][i]).stem
    dffinalselection['xmlfilename'] = dffinalselection['xmlfilename'].astype(str)

This is wrong, since it should be '20210322636'.

Please help!

score 2 · Accepted Answer

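If the column values are always file names or file paths, split on `.` from the right with the maxsplit parameter (`n`) set to 1, and take the first value after the split.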

>>> df['relfilepath'].str.rsplit('.', n=1).str[0]

0                  20210322636
12              factuur-f23622
14                ingram micro
19    upfront.nl domein - Copy
21           upfront.nl domein
Name: relfilepath, dtype: object
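For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The frame name `df` and the sample values (including the extension-less `README`) are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame(
    {"relfilepath": ["20210322636.pdf", "upfront.nl domein - Copy.pdf", "README"]}
)

# Split once from the right on "." and keep the part before the last dot.
stems = df["relfilepath"].str.rsplit(".", n=1).str[0]

print(stems.tolist())
# ['20210322636', 'upfront.nl domein - Copy', 'README']
```

Note that this is plain string splitting, so it knows nothing about directories: a value such as `my.dir/file` (no extension, dot in the folder name) would be cut at the folder's dot.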

score 1 · Accepted Answer

You have the right idea, but you are not manipulating the dataframe correctly.

from pathlib import Path


for i, row in dffinalselection.iterrows():
    dffinalselection['xmlfilename'][i] = Path(dffinalselection['relfilepath'][i]).stem # THIS WILL NOT RELIABLY MUTATE THE DATAFRAME
    dffinalselection['xmlfilename'] = dffinalselection['xmlfilename'].astype(str) # THIS OVERWROTE EVERYTHING

Instead, just do the following:

from pathlib import Path

dffinalselection['xmlfilename'] = ''
for row in dffinalselection.itertuples():
    # itertuples() exposes the index label as `Index` (capital I)
    dffinalselection.at[row.Index, 'xmlfilename'] = Path(row.relfilepath).stem

Alternatively,

dffinalselection['xmlfilename'] = dffinalselection['relfilepath'].apply(lambda value: Path(value).stem)
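A small runnable sketch of the `apply` version, in case it helps to see it end to end. The sample frame below is an assumption rebuilt from a few of the rows shown in the question, and the `invoices/archive.tar.gz` path is a made-up example.

```python
from pathlib import Path

import pandas as pd

# Hypothetical sample data with the same non-contiguous index labels as the question.
dffinalselection = pd.DataFrame(
    {"relfilepath": ["20210322636.pdf", "factuur-f23622.pdf", "upfront.nl domein.pdf"]},
    index=[0, 12, 21],
)

# Path(...).stem drops only the last suffix and always returns a str.
dffinalselection["xmlfilename"] = dffinalselection["relfilepath"].apply(
    lambda value: Path(value).stem
)

print(dffinalselection["xmlfilename"].tolist())
# ['20210322636', 'factuur-f23622', 'upfront.nl domein']

# stem also discards any directory part and keeps earlier suffixes.
print(Path("invoices/archive.tar.gz").stem)
# archive.tar
```

Because the whole column is assigned in one step from strings, the numeric-looking name stays '20210322636' and there is no need for a separate `astype(str)`.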
