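I have this code that counts the files in a directory that share the same first two letters. I would like to modify it so that the counts are grouped by modification date. So if there are 10 files starting with PR and 10 starting with FM, the output should look something like this: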

17
FM 5
PR 5
18
FM 5
PR 5

import os
from collections import Counter

path = '/My/path/to/the/directory/test'
colors = ('blue', 'green', 'yellow', 'red', 'purple')

# Count the two-letter prefixes of regular files whose name contains a color.
# any() applies the isfile() check to every color, which a bare
# "and ... or ..." chain would not.
counts = Counter(fname[:2] for fname in os.listdir(path)
                 if os.path.isfile(os.path.join(path, fname))
                 and any(color in fname for color in colors))

for initials, count in counts.most_common():
    print('{}: {:>20}'.format(initials, count))

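I can print out the modification date, but not together with the counts. I would appreciate any help. I initially wanted to use a scheduler (there is a good example to follow), but I got stuck getting it to fire. Since then I have been reading about regular expressions and how to extract the day of the month from a file name, but I am confused about how to tie it all together.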

2 Answers

You can use groupby to organize the files:

First, you need a function that maps a file to its mtime; then build the list of files sorted by that value:

from collections import Counter
from itertools import groupby
import os
import datetime

def find_mod_date(basedir):
    # Return a key function mapping a filename in basedir to its modification date.
    return lambda filename: datetime.date.fromtimestamp(
        os.stat(os.path.join(basedir, filename)).st_mtime)

path = "/tmp"
mod_dates_in_path = find_mod_date(path)

# Keep only regular files whose name contains one of the keywords.
files = [fname for fname in os.listdir(path)
         if os.path.isfile(os.path.join(path, fname))
         and any(name in fname for name in ['red', 'blue'])]
# groupby only merges adjacent items, so sort by the grouping key first.
files = sorted(files, key=mod_dates_in_path)

Then group the files by modification date:

grouping_by_date = groupby(files, key=mod_dates_in_path)
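One detail worth spelling out: itertools.groupby only merges adjacent items that share a key, which is why the list is sorted by the same key first. A minimal sketch with made-up (filename, day) pairs, not taken from the question's directory:

```python
from itertools import groupby

# Hypothetical (filename, day) pairs, deliberately not sorted by day.
files = [("PR_a", 17), ("FM_a", 18), ("PR_b", 17), ("FM_b", 18)]

def day_of(item):
    return item[1]

# Without sorting, equal keys that are not adjacent end up in separate groups.
unsorted_groups = [(day, [name for name, _ in group])
                   for day, group in groupby(files, key=day_of)]

# Sorting by the same key first yields exactly one group per day.
sorted_groups = [(day, [name for name, _ in group])
                 for day, group in groupby(sorted(files, key=day_of), key=day_of)]

print(len(unsorted_groups))
print(sorted_groups)
```

With the unsorted input there are four groups, one per item; after sorting there are two, one per day.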

Iterate over the result and count by name prefix:

results = {}
for day, group in grouping_by_date:
    # Count the two-letter prefixes of the files modified on this day.
    results[day] = Counter(name[:2] for name in group)

for day, prefix_counts in results.items():
    print(day)
    for prefix, count in prefix_counts.items():
        print("{}: {}".format(prefix, count))
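The three steps can also be wrapped into one function, which makes them easier to try on another directory. This is only a sketch for Python 3; the name count_prefixes_by_date and its parameters are hypothetical, not part of the code above:

```python
from collections import Counter
from itertools import groupby
import datetime
import os

def count_prefixes_by_date(path, keywords, prefix_len=2):
    """Return {modification date: Counter of filename prefixes} for matching files."""
    def mod_date(fname):
        mtime = os.stat(os.path.join(path, fname)).st_mtime
        return datetime.date.fromtimestamp(mtime)

    # Regular files whose name contains at least one keyword.
    files = [fname for fname in os.listdir(path)
             if os.path.isfile(os.path.join(path, fname))
             and any(word in fname for word in keywords)]
    files.sort(key=mod_date)  # groupby needs equal keys to be adjacent
    return {day: Counter(name[:prefix_len] for name in group)
            for day, group in groupby(files, key=mod_date)}
```

Calling count_prefixes_by_date('/tmp', ['red', 'blue']) should return a dict with one Counter per modification date.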
answered 2013-05-31T22:44:29.480

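One approach is to build a dictionary from the files, keyed by modification date, where each value is a Counter object similar to the one in your code. To simplify things slightly, I also used a defaultdict of Counters.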
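As a small illustration of why defaultdict(Counter) helps, here is a sketch using hypothetical (date, filename) pairs in place of real files:

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical (modification date, filename) pairs standing in for real files.
observations = [
    (date(2013, 5, 30), "blue1"),
    (date(2013, 5, 30), "green1"),
    (date(2013, 5, 30), "green2"),
    (date(2013, 5, 31), "red1"),
]

date_counters = defaultdict(Counter)
for mod_date, filename in observations:
    # A missing date gets an empty Counter automatically, so no key check is needed.
    date_counters[mod_date][filename[:2]] += 1

for mod_date in sorted(date_counters):
    print(mod_date, dict(date_counters[mod_date]))
```

With a plain dict, each new date would first need an explicit check and an empty Counter assigned to it.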

So, given a test folder containing these files and modification dates:

blue1       05/30/2013  06:37 PM
green1      05/30/2013  06:37 PM
green2      05/30/2013  06:37 PM
purple1     05/30/2013  06:37 PM
purple2     05/30/2013  06:37 PM
purple3     05/30/2013  06:37 PM
purple4     05/30/2013  06:37 PM
purple5     05/30/2013  06:37 PM
red1        05/31/2013  06:38 PM
red2        05/31/2013  06:38 PM
red3        05/31/2013  06:38 PM
red4        05/31/2013  06:38 PM
yellow1     05/31/2013  06:38 PM
yellow2     05/31/2013  06:38 PM
yellow3     05/31/2013  06:38 PM
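To reproduce a folder like this for testing, one option is to create empty files and set their modification times with os.utime. A sketch, assuming Python 3; make_test_files and sample are hypothetical names:

```python
from datetime import datetime
import os

def make_test_files(directory, names_to_mtime):
    """Create empty files and set their modification times (hypothetical helper)."""
    os.makedirs(directory, exist_ok=True)
    for name, when in names_to_mtime.items():
        filepath = os.path.join(directory, name)
        with open(filepath, "w"):
            pass  # create an empty file
        timestamp = when.timestamp()  # a naive datetime is interpreted as local time
        os.utime(filepath, (timestamp, timestamp))  # (access time, modification time)

# Two of the files from the listing, with the same local timestamps.
sample = {
    "blue1": datetime(2013, 5, 30, 18, 37),
    "red1": datetime(2013, 5, 31, 18, 38),
}
```

For example, make_test_files('testdir', sample) would create both files inside testdir; the remaining files from the listing can be added to sample in the same way.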

this code:

from collections import defaultdict, Counter
from datetime import date
from operator import itemgetter
import os

COLORS = ('blue', 'green', 'yellow', 'red', 'purple')
NUM_LETTERS = 2
path = 'testdir'

date_counters = defaultdict(Counter)  # modification date -> Counter of prefixes

for filename in os.listdir(path):
    filepath = os.path.join(path, filename)
    if os.path.isfile(filepath) and any(color in filename for color in COLORS):
        mod_date = date.fromtimestamp(os.stat(filepath).st_mtime)
        date_counters[mod_date][filename[:NUM_LETTERS]] += 1

for mod_date in sorted(date_counters):  # sort by file group's modification date
    print(mod_date.day)
    # list each prefix in ascending order of its count
    for initials, count in sorted(date_counters[mod_date].items(),
                                  key=itemgetter(1)):
        print(initials, count)

produced this output:

30
bl 1
gr 2
pu 5
31
ye 3
re 4
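The inner loop sorts each day's counts in ascending order, which is why ye 3 is printed before re 4. If the largest count should come first, as in the question's code, Counter.most_common() gives that order. A small sketch using the counts for the 31st:

```python
from collections import Counter
from operator import itemgetter

counts = Counter({"ye": 3, "re": 4})  # counts for the 31st, taken from the output

ascending = sorted(counts.items(), key=itemgetter(1))  # smallest count first
descending = counts.most_common()                       # largest count first

print(ascending)
print(descending)
```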
answered 2013-06-01T02:10:49.930