
I have a huge text file (models.txt) that contains lines like the following:

Model 1
text
text
text
text
END

Model 2
text
text
text
text
END

Model 3
text
text
text
text
END

I want to write a function that treats "Model 1", "Model 2", and "Model 3" as start markers and "END" as the end marker, and writes each block to its own file: model_1.txt, model_2.txt, and model_3.txt.

Since I don't know much about programming, this is what I wrote:

a = open('C:/Users/Zebrafish/Desktop/AHR_human_modeling/human/edited/1AHH.B99990013.pdb','r')
lines = a.readlines()

x = 1

for line in lines:
    if 'END' in line:
        PDB_file = open('C:/Users/Zebrafish/Desktop/AHR_human_modeling/human/edited/model_1.pdb','w')
        PDB_file.write(line)
        PDB_file.close()

2 Answers

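from itertools import groupby

# Blank lines separate the blocks, so group consecutive lines by "whitespace only or not".
with open('infile') as f:
    groups = groupby(f, key=str.isspace)
    for k, lines in groups:
        if k:
            # Skip the runs of blank separator lines.
            continue
        # The first line of a block ("Model 1") becomes the file name ("model_1.txt").
        fname = next(lines).strip().lower().replace(' ', '_')+'.txt'
        with open(fname, 'w') as outf:
            # Write the rest of the block, including the END line.
            outf.writelines(lines)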
answered 2013-10-30T06:03:26.743

If your file fits in memory, you can use a regular expression to split it and then iterate over the matches:

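import re

with open('models.txt') as handle:
    # DOTALL lets "." span newlines; the non-greedy ".*?" stops at the first END.
    models = re.findall("Model.*?END", handle.read(), re.DOTALL)
    # Number the output files from 1 so they match "Model 1", "Model 2", ...
    for i, model in enumerate(models, start=1):
        with open('model_%s.txt' % i, 'w') as output:
            output.write(model)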
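
If the file is too large to read in one go, the same split can probably be done line by line. This is only a minimal sketch, assuming every block starts with a line of the form "Model N" and ends with a line that is exactly "END"; the names split_models and out_dir are made up for illustration and are not part of the question:

```python
import os
import re


def split_models(path, out_dir="."):
    """Stream `path` and write each "Model N" ... "END" block to model_N.txt.

    Returns the list of file paths written. Only one line is held in
    memory at a time, so the input size does not matter.
    """
    written = []
    out = None  # currently open output file, or None when between blocks
    try:
        with open(path) as src:
            for line in src:
                stripped = line.strip()
                if out is None:
                    # Between blocks: wait for a "Model N" start marker.
                    match = re.match(r"Model\s+(\d+)$", stripped)
                    if match:
                        name = os.path.join(out_dir, "model_%s.txt" % match.group(1))
                        out = open(name, "w")
                        out.write(line)  # keep the "Model N" header line
                        written.append(name)
                else:
                    # Inside a block: copy lines until the END marker.
                    out.write(line)
                    if stripped == "END":
                        out.close()
                        out = None
    finally:
        # Close a block that was never terminated by END.
        if out is not None:
            out.close()
    return written
```

Calling split_models('models.txt') would then write model_1.txt, model_2.txt, and so on into the current directory. Unlike the regular expression above, it takes the file number from the "Model N" line itself instead of counting matches.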
answered 2013-10-30T06:13:12.043