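I built a dictionary of contigs and their lengths from file1. I also have file2, which is BLAST output in tabular format; it contains alignments for some (but not all) of the contigs, plus additional information such as where each match starts and ends. To calculate query and subject coverage, I need to associate the lengths from file1 with the matching rows in file2. How can I do that? Thanks.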
2 answers
Assuming file1 looks like this:
contig1 134
contig2 354
contig3 345
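your script could look like this:
import re

# Map each contig name to its length (kept as a string for writing back out).
contigDict = {}
with open('file1') as c1:
    for line in c1:
        key, value = line.split()
        contigDict[key] = value

# Collect every contig name mentioned anywhere in the BLAST output.
with open('file2') as c2:
    scrambled_text = c2.read()
contigs = re.findall(r'contig\d+', scrambled_text)

# Keep only the contigs that appear in both files.
output = {}
for contig in contigs:
    if contig in contigDict:
        output[contig] = contigDict[contig]

# Write "name<TAB>length" for each matched contig.
with open('file3', 'w') as w:
    for key in output:
        w.write(key + '\t' + output[key] + '\n')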
answered 2014-01-22T18:03:59.903
This works:
import re

# Map each contig name to its length.
subjectDict = {}
with open('file1.txt') as c1:
    for line in c1:
        key, value = line.split()
        subjectDict[key] = value

# For each BLAST row, look up the contig named in the first column
# and write its length to result.txt (one length per row).
with open('file2.txt') as c2, open('result.txt', 'w') as r:
    for line in c2:
        new_list = re.split(r'\t+', line)
        s_name = new_list[0]
        subjects = re.findall(r'contig\d+', s_name)
        for subject in subjects:
            if subject in subjectDict:
                r.write(subjectDict[subject] + '\n')
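The script above only writes out the lengths. As a rough sketch of the coverage step the question asks about, the lengths could be appended to each BLAST row and the coverage computed from them. This assumes file2 uses the default 12-column tabular layout (query name in column 1, query start and end in columns 7 and 8) and that file1 holds the query lengths. The function names `load_lengths` and `add_coverage` and the sample BLAST row are made up for illustration.

```python
import csv
import io

def load_lengths(handle):
    """Read 'name length' pairs into a dict of ints."""
    lengths = {}
    for line in handle:
        parts = line.split()
        if len(parts) == 2:
            lengths[parts[0]] = int(parts[1])
    return lengths

def add_coverage(blast_handle, lengths):
    """Yield each BLAST row with query length and percent coverage appended."""
    for row in csv.reader(blast_handle, delimiter='\t'):
        if not row or row[0] not in lengths:
            continue  # skip blank lines and contigs missing from file1
        qlen = lengths[row[0]]
        # Assumed column positions: qstart is column 7, qend is column 8.
        qstart, qend = int(row[6]), int(row[7])
        coverage = (abs(qend - qstart) + 1) / qlen * 100
        yield row + [str(qlen), f"{coverage:.1f}"]

# In-memory stand-ins for file1 and file2; the BLAST row is invented.
file1 = io.StringIO("contig1 134\ncontig2 354\ncontig3 345\n")
file2 = io.StringIO(
    "contig1\tsubjA\t98.5\t67\t1\t0\t1\t67\t10\t76\t1e-30\t120\n"
)
rows = list(add_coverage(file2, load_lengths(file1)))
for row in rows:
    print('\t'.join(row))
```

With real files, pass `open('file1.txt')` and `open('file2.txt')` in place of the `StringIO` objects. The coverage here is per alignment row; if a contig has several overlapping hits, the aligned intervals would need to be merged before summing.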
answered 2014-01-23T16:30:02.633