If you have a list of names...
query = ['link','zelda','saria','ganon','volvagia']
and a list of lines read from a file
data = ['>link is the first','OIGFHFH','AGIUUIIUFG','>peach is the second',
'AGFDA','AFGDSGGGH','>luigi is the third','SAGSGFFG','AFGDFGDFG',
'DSGSFGAAA','>ganon is the fourth','ADGGHHHHHH','>volvagia is the last',
'AFGDAAFGDA','ADFGAFD','ADFDFFDDFG','AHUUERR','>ness is another','ADFGGGGH',
'HHHDFDA']
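How would you go through all the lines that start with ">" and, whenever one of them contains a name from the list (query), collect that ">" line and the sequence that follows it (the sequence is always uppercase) into two separate lists?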
#example output file
name_list = ['>link is the first','>ganon is the fourth','>volvagia is the last']
seq_list = ['OIGFHFHAGIUUIIUFG','ADGGHHHHHH','AFGDAAFGDAADFGAFDADFDFFDDFGAHUUERR']
I would rather not use a dictionary for this, because that is what I have been told to do in similar situations.
So what I have so far is:
import re

for line, name in zip(data, query):
    if line[0] == '>' and re.search(name, line):
        pass  # but then I'm stuck, because len(query) and len(data) are not equal
Any help would be greatly appreciated.
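
One possible approach, offered as a minimal sketch rather than the definitive answer: `zip` pairs the two lists element by element and stops at the shorter one, so it cannot compare every header against every name. Walking through `data` once and keeping a flag that says whether the current record is wanted avoids that and needs no dictionary. The function name `extract_records` is hypothetical, and the sketch assumes a name counts as present only when it appears as a whole word in the ">" line; switch to a substring test if partial matches should count too.

```python
def extract_records(data, query):
    """Return (name_list, seq_list) for '>' lines that mention a queried name."""
    wanted = set(query)        # a set, not a dict: only used for membership tests
    name_list, seq_list = [], []
    keep = False               # True while we are inside a record we want

    for line in data:
        if line.startswith('>'):
            # Header line: assumption is that a name must match a whole word.
            words = line[1:].split()
            keep = any(word in wanted for word in words)
            if keep:
                name_list.append(line)
                seq_list.append('')   # start an empty sequence for this record
        elif keep:
            # Sequence line belonging to the most recent kept header.
            seq_list[-1] += line

    return name_list, seq_list


query = ['link', 'zelda', 'saria', 'ganon', 'volvagia']
data = ['>link is the first', 'OIGFHFH', 'AGIUUIIUFG', '>peach is the second',
        'AGFDA', 'AFGDSGGGH', '>luigi is the third', 'SAGSGFFG', 'AFGDFGDFG',
        'DSGSFGAAA', '>ganon is the fourth', 'ADGGHHHHHH', '>volvagia is the last',
        'AFGDAAFGDA', 'ADFGAFD', 'ADFDFFDDFG', 'AHUUERR', '>ness is another',
        'ADFGGGGH', 'HHHDFDA']

name_list, seq_list = extract_records(data, query)
print(name_list)
print(seq_list)
```

Each entry in `seq_list` lines up with the entry at the same index in `name_list`, which matches the example output in the question. Names in `query` that never appear in `data` (here `zelda` and `saria`) are simply ignored.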