I have 300 XML files, each of which contains a path (see the snippet below), and I want to use Python to build a list (.CSV) of these paths.

 <da:AdminData>
    <da:Datax />
    <da:DataID>223</da:DataID>
    <da:Date>2013-08-19</da:Date>
    <da:Time>13:27:25</da:Time>
    <da:Modification>2013-08-19</da:Modification>
    <da:ModificationTime>13:27:25</da:ModificationTime>
    <da:Path>D:\08\06\xxx-aaa_20130806_111339.dat</da:Path>
    <da:ID>xxx-5225-fff</da:ID>

I wrote the following code, but it does not work for subdirectories:

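import glob
import os
import re

xmlpath = 'D:\\'  # root folder to search
outfilename = "result.csv"

# glob only matches files directly inside xmlpath, not in subdirectories
xmlfiles = glob.glob(os.path.join(xmlpath, '*.xml'))

# Compile once, outside the loop; non-greedy so it stops at the first closing tag
pattern = re.compile(r"<da:Path>(.*?)</da:Path>")

paths = []
for xmlfile in xmlfiles:
    with open(xmlfile) as fh:
        text = fh.read()
    match = pattern.search(text)
    if match:
        paths.append(match.group(1))

# One path per line
with open(outfilename, "w") as logfile:
    logfile.write("\n".join(paths))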

1 Answer

To glob recursively, combine os.walk with fnmatch.fnmatch. For example:

import os
import fnmatch


def recursive_glob(rootdir, pattern):
    matching_files = []
    for d, _, fnames in os.walk(rootdir):
        matching_files.extend(
            os.path.join(d, fname) for fname in fnames
            if fnmatch.fnmatch(fname, pattern)
        )
    return matching_files


# A raw string cannot end in a single backslash, so escape it instead
xmlfiles = recursive_glob("D:\\", "*.xml")
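
Putting the recursive search together with the extraction step from the question could look roughly like the sketch below. It reuses recursive_glob and the question's regular expression; the helper names extract_da_path and write_paths_csv, the CSV column names, and the UTF-8 encoding are assumptions made for illustration, not something stated in the question.

```python
import csv
import fnmatch
import os
import re

# Same pattern as in the question; non-greedy so it stops at the first closing tag.
PATH_RE = re.compile(r"<da:Path>(.*?)</da:Path>")


def recursive_glob(rootdir, pattern):
    """Return all files under rootdir (at any depth) whose name matches pattern."""
    matching_files = []
    for d, _, fnames in os.walk(rootdir):
        matching_files.extend(
            os.path.join(d, fname) for fname in fnames
            if fnmatch.fnmatch(fname, pattern)
        )
    return matching_files


def extract_da_path(xmlfile):
    """Return the text of the first <da:Path> element, or None if there is none."""
    # Assumption: the files are UTF-8 (undecodable bytes are replaced, not fatal).
    with open(xmlfile, encoding="utf-8", errors="replace") as fh:
        match = PATH_RE.search(fh.read())
    return match.group(1) if match else None


def write_paths_csv(rootdir, outfilename):
    """Write one (xml_file, da_path) row per XML file that has a path; return the row count."""
    rows = []
    for xmlfile in sorted(recursive_glob(rootdir, "*.xml")):
        da_path = extract_da_path(xmlfile)
        if da_path is not None:
            rows.append((xmlfile, da_path))
    # newline="" is what the csv module documentation recommends for output files.
    with open(outfilename, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["xml_file", "da_path"])
        writer.writerows(rows)
    return len(rows)
```

With these definitions, write_paths_csv("D:\\", "result.csv") would scan the whole drive and write the result file into the current directory. Using csv.writer instead of joining strings by hand means a path containing a comma or a quote is still written as a single field. On Python 3.5 and later, glob.glob(pattern, recursive=True) with a ** pattern is another way to do the recursive search.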
answered 2013-10-17T11:43:37.220