python - Need a function like 'if os.havefiles' to search subfolders in Python

Question

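I need to os.walk from my parent path (tutu) through all of its subfolders. In every branch, the deepest subfolders hold the files I need to process with my code. Every deepest folder that contains files has the same file "layout": one *.adf.txt file, one *.idf.txt file, one *.sdrf.txt file, and one or more *.dat files, as shown in the picture. My problem is that I don't know how to use the os module to iterate from my parent folder through all the subfolders in turn. I need a function that, for the current subfolder in os.walk, continues into the sub-subfolder inside it (if one exists) when that subfolder is empty. If files exist, it should verify that the file layout is present (that part is no problem...) and, if it is, apply my code (no problem either). If not, and the folder has no more subfolders, it should go back to the parent folder and os.walk into the next subfolder, and so on for every subfolder of my parent folder (tutu). To sum up, I need something like the function below (written in a mix of Python and fictional code):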

for all folders in tutu:
    if os.havefiles in os.walk(current_path):#the 'havefiles' don´t exist, i think...
        for filename in os.walk(current_path):
            if 'adf' in filename:
                etc...
                #my code
    elif:
        while true:
            go deep
    else:
        os.chdir(parent_folder)

Do you think this is the best way to structure the call to my code for this job?

This is the code I tried to use, without success of course:

import csv
import os
import fnmatch

abs_path=os.path.abspath('.')
for dirname, subdirs, filenames in os.walk('.'):
    # print path to all subdirectories first.
    for subdirname in subdirs:
        print os.path.join(dirname, subdirname), 'os.path.join(dirname, subdirname)'
        current_path= os.path.join(dirname, subdirname)
        os.chdir(current_path)
        for filename in os.walk(current_path):
            print filename, 'f in os.walk'
            if os.path.isdir(filename)==True:
                break
            elif os.path.isfile(filename)==True:
                print filename, 'file'
        #code here

Thanks in advance...

score 0 · Accepted Answer

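I need a function that, for the current subfolder in os.walk, continues into the sub-subfolder inside it (if one exists) when that subfolder is empty.

That doesn't make sense. If a folder is empty, it doesn't have any subfolders.

Maybe you mean that if it has no regular files you want to recurse into its subfolders, but if it has any, you want to stop recursing and check the layout instead?

For that, all you need is something like this: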

import collections
import os

for dirname, subdirs, filenames in os.walk('.'):
    if filenames:
        # can't use os.path.splitext, because that will give us .txt instead of .adf.txt;
        # partition('.')[-1] keeps everything after the first dot (e.g. 'adf.txt'),
        # which assumes the base name itself contains no dots
        extensions = collections.Counter(filename.partition('.')[-1]
                                         for filename in filenames)
        if (extensions['adf.txt'] == 1 and extensions['idf.txt'] == 1 and
            extensions['sdrf.txt'] == 1 and extensions['dat'] >= 1 and
            len(extensions) == 4):
            # got a match, do what you want
            pass

        # Whether this is a match or not, prune the walk.
        del subdirs[:]

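I'm assuming here that you only want to find directories that contain exactly the specified files and nothing else. To drop that last restriction, just remove the len(extensions) == 4 part.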

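There's no need to explicitly iterate over subdirs or anything, or to call os.walk recursively from inside os.walk. The whole point of walk is that it already recursively visits every subdirectory it finds, unless you explicitly tell it not to (by pruning the list it gives you).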
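As a sketch of the same pruning idea, the layout check can be pulled into a helper that matches on filename suffixes, so base names that contain dots do not matter. This assumes Python 3; has_layout and find_layout_dirs are hypothetical names, and only the suffixes come from the question.

```python
import os

# Suffixes taken from the question; each must appear exactly once per folder.
SINGLE_SUFFIXES = ('.adf.txt', '.idf.txt', '.sdrf.txt')


def has_layout(filenames):
    """Return True if filenames holds exactly one file per single suffix,
    at least one .dat file, and nothing else."""
    counts = {suffix: 0 for suffix in SINGLE_SUFFIXES}
    dat_count = 0
    for name in filenames:
        for suffix in SINGLE_SUFFIXES:
            if name.endswith(suffix):
                counts[suffix] += 1
                break
        else:
            if name.endswith('.dat'):
                dat_count += 1
            else:
                return False  # an unexpected file breaks the layout
    return dat_count >= 1 and all(n == 1 for n in counts.values())


def find_layout_dirs(root):
    """Yield every directory under root whose files match the layout."""
    for dirname, subdirs, filenames in os.walk(root):
        if filenames:
            if has_layout(filenames):
                yield dirname
            # Files were found here: do not descend further, match or not.
            del subdirs[:]
```

Calling list(find_layout_dirs('tutu')) would then give the folders to process, and the processing code can loop over that list instead of living inside the walk.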

score 0 · Accepted Answer

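os.walk automatically "digs down" recursively, so you don't need to recurse through the tree yourself.

I think this should be the basic form of your code: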

import csv
import os
import fnmatch

directoriesToMatch = [list here...]
filenamesToMatch = [list here...]

abs_path=os.path.abspath('.')
for dirname, subdirs, filenames in os.walk('.'):
    if len(set(directoriesToMatch).difference(subdirs))==0:     # all dirs are there
        if len(set(filenamesToMatch).difference(filenames))==0: # all files are there
            if <any other filename/directory checking code>:
                # processing code here ...

According to the Python documentation, if for some reason you don't want to keep recursing, just delete entries from subdirs: http://docs.python.org/2/library/os.html
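To illustrate deleting entries from subdirs, here is a minimal sketch that skips folders by name. It assumes Python 3 and the default top-down walk; walk_skipping and skip_names are hypothetical names chosen for the example.

```python
import os


def walk_skipping(root, skip_names):
    """Yield directories under root, never entering any named in skip_names."""
    for dirname, subdirs, filenames in os.walk(root):  # topdown=True by default
        # Slice assignment edits the same list object that os.walk holds;
        # rebinding with "subdirs = [...]" would have no effect on the walk.
        subdirs[:] = [d for d in subdirs if d not in skip_names]
        yield dirname
```

The in-place edit is what matters: os.walk only consults the list it handed out, so the pruning has to modify that list rather than replace the local variable.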

If you want to check that the directory where you find your files to process has no subdirectories, you can also change the dirs check to:

    if len(subdirs)==0: # check that this directory has no subdirectories

I'm not sure I fully understand the question, so I hope this helps!

Edit:

OK, so if you need to check that there are no files, just use:

    if len(filenames)==0:

But as I said above, it's better to look for specific files than to check for empty directories.
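One way to look for specific files is the fnmatch module that the code above already imports. The sketch below assumes Python 3 and only requires at least one file per pattern; dirs_with_required_files and REQUIRED_PATTERNS are hypothetical names, and the patterns are the ones listed in the question.

```python
import fnmatch
import os

# Patterns taken from the question's file layout.
REQUIRED_PATTERNS = ('*.adf.txt', '*.idf.txt', '*.sdrf.txt', '*.dat')


def dirs_with_required_files(root, patterns=REQUIRED_PATTERNS):
    """Return directories under root holding at least one file per pattern."""
    matches = []
    for dirname, subdirs, filenames in os.walk(root):
        # fnmatch.filter returns the names matching a pattern; an empty
        # list is falsy, so all() fails as soon as one pattern is missing.
        if all(fnmatch.filter(filenames, pattern) for pattern in patterns):
            matches.append(dirname)
    return matches
```

Unlike the stricter check in the first answer, this does not reject folders that hold extra files or more than one file per pattern, and it does not prune the walk.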
