2

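I'm new to Python (using 2.7) and I'm trying to take a FASTA file of aligned sequences and remove the periods (.) and dashes (-). I'm trying to write a loop so that Python goes through each line and replaces the periods and dashes with nothing. This is the script I have (when I run it, it removes the periods and dashes but leaves blank spaces):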

InFileName = 'myfile.fasta'
InFile = open(InFileName, 'r')

OutFileName = 'myfile_nodots.fasta'
OutFile = open(OutFileName, 'w')

for Line in InFile:
    # Replace periods and dashes with nothing
    Line = Line.replace('.', '')
    Line = Line.replace('-', '')
    OutFile.write(Line)

InFile.close()
OutFile.close()

Any suggestions would be greatly appreciated! 仁

4

5 Answers

3

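You can tidy up your code by using with to make sure the files get closed, and in 2.7 you can use the second argument of str.translate to specify the characters to delete, so your code can be: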

with open('myfile.fasta') as fin, open('myfile_nodots.fasta', 'w') as fout:
    for line in fin:
        fout.write(line.translate(None, '-.'))
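
The two-argument call line.translate(None, '-.') only works on Python 2 strings. For anyone on Python 3, a roughly equivalent sketch builds a deletion table with str.maketrans. The names GAP_TABLE, strip_gaps and clean_file are made up for illustration and are not part of the answer above:

```python
# Python 3 sketch: the third argument of str.maketrans lists characters to delete.
GAP_TABLE = str.maketrans('', '', '-.')


def strip_gaps(line):
    """Return line with every '-' and '.' removed (hypothetical helper)."""
    return line.translate(GAP_TABLE)


def clean_file(in_path, out_path):
    """Copy in_path to out_path, dropping dashes and periods from each line."""
    with open(in_path) as fin, open(out_path, 'w') as fout:
        for line in fin:
            fout.write(strip_gaps(line))
```

With the file names from the question this would be called as clean_file('myfile.fasta', 'myfile_nodots.fasta').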
answered 2013-03-01T20:20:08.230
2

You can simplify your code a bit:

import re
infilename = 'myfile.fasta'
outfilename = 'myfile_nodots.fasta'
regex = re.compile("[.-]+")    

with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        outfile.write(regex.sub("", line))

If you also want to remove the spaces that follow the dots or dashes, use a different regular expression:

regex = re.compile("[.-]+ *")    
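
To see how the two patterns differ, here is a small sketch that applies both to a made-up line (the sample string and variable names are assumptions for illustration):

```python
import re

only_gaps = re.compile("[.-]+")          # dots and dashes only
gaps_and_spaces = re.compile("[.-]+ *")  # dots/dashes plus any spaces right after them

sample = "AC-- GT. A\n"  # made-up line, not real sequence data

cleaned_gaps = only_gaps.sub("", sample)
cleaned_both = gaps_and_spaces.sub("", sample)

print(repr(cleaned_gaps))  # 'AC GT A\n'
print(repr(cleaned_both))  # 'ACGTA\n'
```

Note that the second pattern only removes spaces that come directly after a dot or dash; a space elsewhere in the line is left alone.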
answered 2013-03-01T20:17:28.320
1

Use fileinput and translate for quick in-place editing:

import fileinput
import sys  # needed for sys.stdout below

for line in fileinput.input("test.txt", inplace=1):
    sys.stdout.write(line.translate(None, '-.'))

And before you ask: yes, it writes to the file, not to the console :)
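
If you want to convince yourself of that without touching real data, a minimal sketch on a throwaway temporary file could look like this. It uses str.replace instead of the two-argument translate so it also runs on Python 3; the file contents are made up:

```python
import fileinput
import os
import sys
import tempfile

# Create a throwaway file so no real data is modified.
fd, path = tempfile.mkstemp(suffix=".fasta")
with os.fdopen(fd, "w") as handle:
    handle.write(">seq1\nAC-G.T\n")

# While this loop runs, fileinput redirects sys.stdout into the file itself.
for line in fileinput.input(path, inplace=True):
    sys.stdout.write(line.replace("-", "").replace(".", ""))

# Read the file back to confirm it was rewritten in place, then clean up.
with open(path) as handle:
    result = handle.read()
os.remove(path)
```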

answered 2013-03-01T21:36:53.587
0

Assuming that FASTA headers may contain dashes or dots as well (e.g. isoforms), which is quite common, leave the header lines untouched:

with open('myfile.fasta') as fin:
    with open('myfile_nodots.fasta', 'w') as fout:
        for line in fin:
            if line.startswith('>'):
                fout.write(line)
            else:
                fout.write(line.translate(None, '-.'))
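
The same idea can be tried on a few in-memory lines. This is only a sketch: the generator name strip_gaps_keep_headers and the sample records are invented for illustration, and str.replace is used so it runs on both Python 2 and 3:

```python
def strip_gaps_keep_headers(lines):
    """Yield header lines unchanged and sequence lines without '-' or '.'."""
    for line in lines:
        if line.startswith('>'):
            yield line  # header: dashes and dots may be part of the ID
        else:
            yield line.replace('-', '').replace('.', '')


# Made-up records; the header deliberately contains a dash and a dot.
records = [">sp|P12345-2 isoform.2\n", "AC-G.T\n", "--TT..\n"]
cleaned = list(strip_gaps_keep_headers(records))
```

Because open file objects are iterable line by line, the generator could be fed fin directly and its output written with fout.writelines.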
answered 2014-05-28T11:52:49.910
-1

Have you tried OutFile.write(Line.strip())?
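
For what it's worth, a quick check suggests strip() on its own would not do what the question asks: it only trims leading and trailing whitespace, so the dashes and periods stay, and the newline is lost. A small sketch with made-up lines:

```python
line = "AC-G.T\n"
stripped = line.strip()  # trailing newline removed, '-' and '.' untouched

# Writing stripped lines back to back would merge header and sequence.
joined = "".join(l.strip() for l in [">seq1\n", "AC-G.T\n"])

print(repr(stripped))  # 'AC-G.T'
print(repr(joined))    # '>seq1AC-G.T'
```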

answered 2013-03-01T20:15:29.333