python - Count files by their first two letters

Question

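I'd be glad if someone could help me. I know next to nothing about Python, so please forgive my naivety. I've spent two days reading this site trying to get past where I'm stuck.

I wrote this code (most of it picked up from this site):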

    import os
    path = '/the/path/to/the/I want/to/count'
    file_count = sum(len(files) for _, _, files in os.walk(path))
    print("Number of files:", file_count)

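I get my file count, but it takes a while. Is there faster code? I think it descends into subdirectories, because the file count is higher than I expected.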

My ultimate goal is to count the files by the first two letters of each file name, i.e. AL, AR, AZ. Could I get an example of what I'd have to add for that?

score 4 · Accepted Answer

Yes, os.walk() traverses subdirectories.

If you need counts grouped by the first two letters, I'd use a collections.Counter() object for this:

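    import os
    from collections import Counter

    path = '/the/path/to/the/I want/to/count'
    # Key every file name found anywhere under path by its first two characters.
    counts = Counter(fname[:2] for _, _, files in os.walk(path) for fname in files)
    # most_common() yields (prefix, count) pairs, highest count first.
    for initials, count in counts.most_common():
        print('{}: {:>20}'.format(initials, count))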

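This walks the subdirectories and collects counts grouped by the first two characters of every file name it encounters, then prints those counts ordered from most common to least common.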
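
To try the recursive count without pointing it at real data, here is a minimal sketch that wraps the same os.walk() plus Counter idea in a function and runs it on a throwaway directory tree. The names `count_by_prefix`, `prefix_len` and `demo_recursive_counts`, and the sample file names, are made up for illustration; it assumes Python 3.

```python
import os
import tempfile
from collections import Counter


def count_by_prefix(root, prefix_len=2):
    """Count files under root (recursively), keyed by the first prefix_len characters."""
    return Counter(
        fname[:prefix_len]
        for _, _, files in os.walk(root)
        for fname in files
    )


def demo_recursive_counts():
    # Build a small temporary tree: three files at the top level, one in a subdirectory.
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        sample = ("AL_001.txt", "AL_002.txt", "AZ_001.txt", os.path.join("sub", "AL_003.txt"))
        for rel in sample:
            open(os.path.join(root, rel), "w").close()
        return count_by_prefix(root)


if __name__ == "__main__":
    for initials, count in demo_recursive_counts().most_common():
        print("{}: {:>20}".format(initials, count))
```

The file inside `sub` is counted as well, which is the same effect that made the file count in the question higher than expected.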

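If you don't want to traverse subdirectories, use os.listdir() instead; it returns only the names (both file names and directory names) in the given directory. You can then use os.path.isfile() to filter those names down to just the files: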

    counts = Counter(fname[:2] for fname in os.listdir(path) if os.path.isfile(os.path.join(path, fname)))
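
Since the question also asked about speed: on Python 3.6 or newer, os.scandir() is an alternative to the listdir() plus isfile() pair, because each entry it yields can usually report whether it is a file without a separate lookup per name. This is a swapped-in technique, not part of the answer above, and whether it is noticeably faster depends on the platform and directory size. The function and sample names below are hypothetical.

```python
import os
import tempfile
from collections import Counter


def count_top_level_by_prefix(directory, prefix_len=2):
    """Count regular files directly inside directory (no recursion) by name prefix."""
    # The Counter is built inside the with block, so the iterator is still open.
    with os.scandir(directory) as entries:
        return Counter(
            entry.name[:prefix_len] for entry in entries if entry.is_file()
        )


def demo_top_level_counts():
    with tempfile.TemporaryDirectory() as root:
        for name in ("AL_1.txt", "AR_1.txt"):
            open(os.path.join(root, name), "w").close()
        # A subdirectory and the file inside it must not be counted.
        os.mkdir(os.path.join(root, "AZ_dir"))
        open(os.path.join(root, "AZ_dir", "AL_2.txt"), "w").close()
        return count_top_level_by_prefix(root)
```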

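If you are looking for files with a specific extension, test for that extension instead of using isfile(); presumably no subdirectory would use the same extension: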

    counts = Counter(fname[:2] for fname in os.listdir(path) if fname.endswith('.pdf'))
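
One thing to watch with the extension test is that endswith() is case-sensitive, so a file named with `.PDF` would be skipped. A rough sketch of a case-insensitive variant follows; it relies on str.endswith() accepting a tuple of suffixes, and the function name, the `extensions` parameter and the sample files are assumptions for illustration.

```python
import os
import tempfile
from collections import Counter


def count_by_prefix_with_ext(directory, extensions=(".pdf",), prefix_len=2):
    """Count names in directory ending in one of extensions, ignoring case."""
    wanted = tuple(ext.lower() for ext in extensions)
    return Counter(
        fname[:prefix_len]
        for fname in os.listdir(directory)
        if fname.lower().endswith(wanted)
    )


def demo_extension_counts():
    with tempfile.TemporaryDirectory() as root:
        for name in ("AL_a.pdf", "AL_b.PDF", "AZ_c.txt"):
            open(os.path.join(root, name), "w").close()
        return count_by_prefix_with_ext(root)
```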

score 1 · Accepted Answer

You could try

    import glob

    len(glob.glob('/the/path/to/the/I want/to/count/AL*'))
    len(glob.glob('/the/path/to/the/I want/to/count/AR*'))

and so on.
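
Rather than writing one line per prefix, the glob calls can be put in a loop. The sketch below assumes you already know which prefixes to look for; `count_matching` and the sample names are hypothetical. Note that a pattern such as `AL*` also matches directory names and does not descend into subdirectories, so this version keeps regular files only.

```python
import glob
import os
import tempfile


def count_matching(directory, prefixes):
    """Return {prefix: number of regular files in directory starting with prefix}."""
    counts = {}
    for prefix in prefixes:
        # glob.escape() guards against glob metacharacters in the path or prefix.
        pattern = os.path.join(glob.escape(directory), glob.escape(prefix) + "*")
        # Drop directories that happen to match the pattern.
        counts[prefix] = sum(1 for match in glob.glob(pattern) if os.path.isfile(match))
    return counts


def demo_glob_counts():
    with tempfile.TemporaryDirectory() as root:
        for name in ("AL_1.txt", "AL_2.txt", "AR_1.txt"):
            open(os.path.join(root, name), "w").close()
        os.mkdir(os.path.join(root, "AR_dir"))
        return count_matching(root, ["AL", "AR", "AZ"])
```

Compared with the Counter approach, this scans the directory once per prefix, so it fits best when only a handful of prefixes matter.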
