python - How do I select files with numbered extensions from a folder?

Question

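I'm trying to build my own dataset for a project. To do that, I need to select files that were exported from another program and have numbered extensions: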

exported_file_1_aaa.001
exported_file_2_aaa.002
exported_file_3_aaa.003
...
exported_file_5_zzz.925
...and so on.

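I know how to select files with a specific extension, such as '.txt', from a folder and append them to a list or dictionary. Is there a way to do the same for '.nnn', where each n is a digit?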

ext = '.nnn'
all_files = [i for i in os.listdir(dir) if os.path.splitext(i)[1] == ext]
for f in all_files:
    ...

score 2 · Accepted Answer

You can combine shell globbing (glob) with regular expressions (re).

With glob you can pick out the files whose names end in a digit, which leaves only a limited number of candidates for re to check:

glob.iglob('exported_file_*.*[0-9]')

Then we can match the files precisely with this regex pattern:

\.\d+$

This matches file names in which everything after the last '.' is digits.

Putting it together:

import glob
import re
[file for file in glob.iglob('exported_file_*.*[0-9]') if re.search(r'\.\d+$', file)]

Shell globbing is not as flexible as re; otherwise we could have done the whole job with glob alone.

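Also, if you are sure that all the files end in a fixed number of digits, glob alone is enough. For example, for files ending in exactly 3 digits after the last '.':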

glob.iglob('exported_file_*.[0-9][0-9][0-9]')
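The glob-plus-regex combination above can be wrapped into a small helper. This is only a sketch: the function name numbered_files and the folder argument are hypothetical, and it assumes that any file in the folder with an all-digit extension should be selected, not just those named exported_file_*.

```python
import glob
import os
import re


def numbered_files(folder):
    """Return sorted paths in `folder` whose last extension is all digits."""
    # Coarse filter: names that contain a dot and end in a digit.
    # glob.escape keeps special characters in the folder name literal.
    pattern = os.path.join(glob.escape(folder), '*.*[0-9]')
    candidates = glob.iglob(pattern)
    # Precise filter: only digits between the last dot and the end.
    return sorted(p for p in candidates if re.search(r'\.\d+$', p))
```

Calling numbered_files('exports') on a hypothetical folder named exports would return paths such as exports/exported_file_1_aaa.001, while skipping names like data.v2 that end in a digit but whose extension is not purely numeric.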

score 0 · Accepted Answer

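If you don't care about the length of the extension, you can use the isdigit method. Note that os.path.splitext keeps the leading dot in the extension, so strip it before the check:

all_files = [i for i in os.listdir(dir) if os.path.splitext(i)[1][1:].isdigit()]
for f in all_files:
    ...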
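As a minimal sketch, the isdigit check can be pulled into a predicate so that edge cases are easy to see. The name has_numeric_extension is made up for illustration.

```python
import os


def has_numeric_extension(name):
    """True if the text after the last dot in `name` is all digits."""
    # splitext('a.001') returns ('a', '.001'); the extension includes the dot.
    ext = os.path.splitext(name)[1]
    # ext[1:] drops the dot; an empty string is not a digit string,
    # so names without an extension are rejected as well.
    return ext[1:].isdigit()
```

It could then be used as a filter, for example [f for f in os.listdir(folder) if has_numeric_extension(f)], where folder is whatever directory holds the exported files.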

score 0 · Accepted Answer

You can use the glob module.

import glob

my_dir = "mydir"

all_files = glob.glob(f"{my_dir}/*.[0-9][0-9][0-9]")
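The same three-digit pattern also works with pathlib, which makes it easy to order the results by their numeric extension. This is a sketch rather than part of the answer above: the helper name files_by_number is hypothetical, and sorting by the number is an assumption about what a dataset builder might want.

```python
from pathlib import Path


def files_by_number(my_dir):
    """Return files with a three-digit extension, ordered by that number."""
    matches = Path(my_dir).glob('*.[0-9][0-9][0-9]')
    # p.suffix is e.g. '.002'; drop the dot and compare as an integer.
    return sorted(matches, key=lambda p: int(p.suffix[1:]))
```

Each returned item is a Path object, so p.name gives the file name and str(p) the full path.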
