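I have the following function. The program looks through each file and prints the lines that appear in all 4 files to a new file. I tried file1.close(), but I get an error about closing a set? I think I could use a with statement, but I am not sure how to do that; I am very new to programming.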

def secretome():
    file1 = set(line.strip() for line in open(path + "goodlistSigP.txt"))
    file2 = set(line.strip() for line in open(path + "tmhmmGoodlist.txt"))
    file3 = set(line.strip() for line in open(path + "targetpGoodlist.txt"))
    file4 = set(line.strip() for line in open(path + "wolfPsortGoodlist.txt"))
    newfile = open(path + "secretome_pass.txt", "w")
    for line in file1 & file2 & file3 & file4:
        if line:
            newfile.write(line + '\n')
    newfile.close()

4 Answers

I would suggest removing the repetition by extracting your set generation into a function:

import os

def set_from_file(path):
    # The with block closes the file as soon as the set has been built.
    with open(path) as file:
        return set(line.strip() for line in file)

def secretome():
    # `path` is the same directory variable used in the question.
    files = ["goodlistSigP.txt", "tmhmmGoodlist.txt", "targetpGoodlist.txt", "wolfPsortGoodlist.txt"]
    data = [set_from_file(os.path.join(path, file)) for file in files]
    with open(os.path.join(path, "secretome_pass.txt"), "w") as newfile:
        newfile.writelines(line + "\n" for line in set.union(*data) if line)

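Note that your code takes the intersection, but you talk about wanting a union, so I used union() here; use set.intersection(*data) instead if you only want the lines common to all files. There are also a couple of list comprehensions/generator expressions in there.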
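If the goal is the one stated in the question (lines present in all four files), a minimal sketch of the intersection variant might look like the following. The names `write_common_lines` and `demo`, and the three small files the demo creates, are made up for illustration; they are not the asker's real lists.

```python
import os
import tempfile


def set_from_file(file_path):
    """Return the set of stripped lines in a file; the with block closes it."""
    with open(file_path) as handle:
        return set(line.strip() for line in handle)


def write_common_lines(input_paths, output_path):
    """Write the non-empty lines found in every input file, sorted, one per line."""
    data = [set_from_file(p) for p in input_paths]
    common = set.intersection(*data)  # set.union(*data) would keep lines from any file
    with open(output_path, "w") as out:
        out.writelines(line + "\n" for line in sorted(common) if line)
    return common


def demo():
    """Run on three small made-up files (hypothetical data, not the asker's lists)."""
    contents = ["a\nb\nc\n", "b\nc\nd\n", "c\nb\n\n"]
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for index, text in enumerate(contents):
            file_path = os.path.join(tmp, "list%d.txt" % index)
            with open(file_path, "w") as handle:
                handle.write(text)
            paths.append(file_path)
        output_path = os.path.join(tmp, "common.txt")
        write_common_lines(paths, output_path)
        with open(output_path) as handle:
            return handle.read()
```

Sorting the result is a choice made here only so the output order is stable; sets have no inherent order, so drop `sorted` if order does not matter.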

Answered 2012-12-20T14:37:47.163

This seems like a very complicated way of doing it. I would suggest something like the example I give here.

import fileinput
files = ['file1.txt','file2.txt','file3.txt','file4.txt']  
output = open('output.txt','w')

for file in files:
    for line in fileinput.input([file]):
        output.write(line)
    output.write('\n')

output.close()

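This code builds a list of the files (replace the names with the file paths you need), creates a file to hold the output from each of them, then uses the fileinput module to go through each file line by line, writing every line to the output file. The output.write('\n') call ensures that the lines of the next file start on a new line in the output file.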
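Since the question is about closing files, the same loop can be sketched with with blocks so that both the inputs and the output are closed automatically. This is only an illustration: `concatenate` and `demo` are hypothetical names, and the demo writes two tiny made-up files to a temporary directory.

```python
import fileinput
import os
import tempfile


def concatenate(input_paths, output_path):
    """Copy every input file into output_path, adding a newline after each file."""
    with open(output_path, "w") as output:
        for input_path in input_paths:
            # FileInput objects can be used as context managers (Python 3.2+).
            with fileinput.input(files=[input_path]) as source:
                for line in source:
                    output.write(line)
            output.write("\n")


def demo():
    """Concatenate two small made-up files and return the combined text."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name, text in (("file1.txt", "a\nb\n"), ("file2.txt", "c\n")):
            full_path = os.path.join(tmp, name)
            with open(full_path, "w") as handle:
                handle.write(text)
            paths.append(full_path)
        output_path = os.path.join(tmp, "output.txt")
        concatenate(paths, output_path)
        with open(output_path) as handle:
            return handle.read()
```

Note that this copies every line of every file; unlike the set-based answers, it neither removes duplicates nor selects the lines common to all files.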

Answered 2012-12-20T14:51:18.320

Going in a completely different direction from my original one (Lattyware beat me to it):

You can define a function:

def file_lines(fname):
    with open(fname) as f:
        for line in f:
            yield line

Now you can use it with itertools.chain to iterate over your files:

import itertools

def set_from_file(path):
    filenames = ("name1", "name2", "name3")  # ...your input files go here
    # On Python 2 use itertools.imap here; the built-in map is already lazy on Python 3.
    lines = itertools.chain.from_iterable(map(file_lines, filenames))
    # lines is an iterable object.
    # At this point, virtually none of your system's resources have been consumed.
    with open("output", 'w') as fout:
        # Now we only need enough memory to store the non-duplicate lines :)
        fout.writelines(set(line.strip() + '\n' for line in lines))
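The claim that nothing is opened until iteration starts can be checked with a small Python 3 sketch. The `opened` list is a hypothetical addition used only to record when each file is really opened, and `chained_lines` and `demo` are illustrative names; the demo works on two made-up files.

```python
import itertools
import os
import tempfile


def file_lines(fname, opened):
    """Yield the lines of fname, noting in `opened` when the file is really opened."""
    with open(fname) as f:
        opened.append(os.path.basename(fname))
        for line in f:
            yield line


def chained_lines(filenames, opened):
    """Lazily chain the lines of several files into one iterator."""
    return itertools.chain.from_iterable(file_lines(name, opened) for name in filenames)


def demo():
    """Return which files were open at each stage, plus the unique stripped lines."""
    with tempfile.TemporaryDirectory() as tmp:
        names = []
        for base, text in (("one.txt", "a\nb\n"), ("two.txt", "b\nc\n")):
            full = os.path.join(tmp, base)
            with open(full, "w") as handle:
                handle.write(text)
            names.append(full)

        opened = []
        lines = chained_lines(names, opened)
        before = list(opened)      # snapshot before any line is requested
        first = next(lines)        # requesting one line opens only the first file
        after_first = list(opened)
        rest = itertools.chain([first], lines)
        unique = sorted(set(line.strip() for line in rest))
        return before, after_first, list(opened), unique
```

Each file is opened only when the chain reaches it and is closed by its with block once its lines run out, so at most one input file is open at a time.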

Answered 2012-12-20T14:38:29.127

You can put it into a generator:

def closingfilelines(*a):
    with open(*a) as f:
        for line in f:
            yield line

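and use it wherever you currently use open().

While the generator is running, the file stays open; once the generator is exhausted, the file is closed.

The same happens if the generator object is .close()d or deleted: in that case the generator receives a GeneratorExit exception, which also causes the with block to be left.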
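The closing behaviour described above can be observed with a small sketch. It assumes a variant of the generator, here called `tracked_file_lines` (a hypothetical name), that also appends the file object to a caller-supplied list so that its `closed` attribute can be inspected; the demo file is made up.

```python
import os
import tempfile


def tracked_file_lines(file_path, handles):
    """Yield lines like closingfilelines, also exposing the file object via `handles`."""
    with open(file_path) as f:
        handles.append(f)  # hypothetical hook so the caller can inspect f.closed
        for line in f:
            yield line


def demo():
    """Return the file's closed state while running, after close(), and after exhaustion."""
    with tempfile.TemporaryDirectory() as tmp:
        file_path = os.path.join(tmp, "lines.txt")
        with open(file_path, "w") as handle:
            handle.write("x\ny\n")

        handles = []
        gen = tracked_file_lines(file_path, handles)
        next(gen)                               # the file is opened on the first next()
        closed_while_running = handles[0].closed
        gen.close()                             # raises GeneratorExit at the yield
        closed_after_close = handles[0].closed

        handles = []
        list(tracked_file_lines(file_path, handles))  # run the generator to exhaustion
        closed_after_exhaustion = handles[0].closed
        return closed_while_running, closed_after_close, closed_after_exhaustion
```

Calling `demo()` reports that the file is still open while the generator is suspended, and closed both after `.close()` and after the generator has been exhausted.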

Answered 2012-12-20T14:38:16.793