I'm new to Perl. I am trying to extract FASTA sequences from one file whose headers match lines in another file. The two sample files are as follows:

file1.fasta:

>gene_44|105_nt|+|47540|47644
GTGCGCCGGCGCGTCGCGATCGCGAACCGGCCCGTGCGAATCCTGCCGCATGCGCGCCGCATCTCGCCACGCCGCGCATTTCATTTCGACATCCATAACGTCTGA

>gene_69|111_nt|+|75846|75956
ATGCCGTTGCCGTCGCGCATCGCGGCGGCCGTGCGCGGCGCGCATGCATACGCCGGCACGGCCGATGCGCGCGCGACGCGCAAACTGCACGCGGCGCGGGATTTGTGTTGA

>gene_88|177_nt|-|97993|98169
ATGCGCCAGCCGACGCACGCCCATTCCGGGCGAAACGTTCCCCTTATCCATTCGATCATCCGTGCCGCACTGCGCGAAGCGGCCACCGCCGACACGTACCAAACCGCGCTCGATGCGACCGGCGCGGCACTCGTCGCCATCGCGGCGCTCGTGCGCGCGGAGGTGCGGCATGGCTGA

>gene_90|141_nt|-|99016|99156
TTGGAAGGGCGCTTTCCGCGTGCGAGTCGTCTGACGCAGCGTTGCACGGTCTGGTCGAATCGCGAGCTTCATCGCTGGATGGCCGATCCGTTGAACTATCGCGCTGTCGACGCGGCGAACCAGACGACGGAGGGCGCGTAA

file2.list:

somewordsinfront, >gene_44|somewordsattheback

blablabla,>gene_88|blablablablabla

My expected output is as follows:

>gene_44|105_nt|+|47540|47644
GTGCGCCGGCGCGTCGCGATCGCGAACCGGCCCGTGCGAATCCTGCCGCATGCGCGCCGCATCTCGCCACGCCGCGCATTTCATTTCGACATCCATAACGTCTGA

>gene_88|177_nt|-|97993|98169
ATGCGCCAGCCGACGCACGCCCATTCCGGGCGAAACGTTCCCCTTATCCATTCGATCATCCGTGCCGCACTGCGCGAAGCGGCCACCGCCGACACGTACCAAACCGCGCTCGATGCGACCGGCGCGGCACTCGTCGCCATCGCGGCGCTCGTGCGCGCGGAGGTGCGGCATGGCTGA

How can I do this? Thanks in advance! :)

1 Answer

Next time you ask a question, please show your own code. For example:

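use strict;
use warnings;

# Collect the gene IDs listed in file2.list (the text between '>' and '|').
my %wanted;
open my $list, '<', 'file2.list' or die "Cannot open file2.list: $!";
while (my $line = <$list>) {
    $wanted{$1} = 1 if $line =~ />([^|\s]+)/;
}
close $list;

# Slurp the whole FASTA file into a single string.
my $input;
{
    local $/ = undef;
    open my $fasta, '<', 'file1.fasta' or die "Cannot open file1.fasta: $!";
    $input = <$fasta>;
    close $fasta;
}

# Split into records on a '>' at the start of a line and print each record
# whose ID matches exactly, so that gene_4 does not also select gene_44.
foreach my $record (split /^>/m, $input) {
    next unless $record =~ /^([^|\s]+)/;    # skips the empty chunk before the first '>'
    print ">$record" if $wanted{$1};
}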
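
If the FASTA file is too large to slurp comfortably, the same idea can be done one line at a time. The following is only a sketch under the assumptions visible in the sample files: every header starts with `>` at the beginning of a line, and the ID is the text up to the first `|`. The sub names `read_wanted_ids` and `filter_fasta` are made up for this illustration.

```perl
use strict;
use warnings;

# Read gene IDs (the text between '>' and the first '|') from a list filehandle.
sub read_wanted_ids {
    my ($list_fh) = @_;
    my %wanted;
    while (my $line = <$list_fh>) {
        $wanted{$1} = 1 if $line =~ />([^|\s]+)/;
    }
    return \%wanted;
}

# Copy FASTA records whose header ID is in %$wanted from $in_fh to $out_fh,
# reading one line at a time instead of loading the whole file.
sub filter_fasta {
    my ($in_fh, $out_fh, $wanted) = @_;
    my $keep = 0;
    while (my $line = <$in_fh>) {
        # A header line decides whether the lines that follow are kept.
        if ($line =~ /^>([^|\s]+)/) {
            $keep = exists $wanted->{$1};
        }
        print {$out_fh} $line if $keep;
    }
    return;
}

# Run only when both file names are given on the command line.
if (@ARGV == 2) {
    my ($list_file, $fasta_file) = @ARGV;
    open my $list_fh, '<', $list_file or die "Cannot open $list_file: $!";
    my $wanted = read_wanted_ids($list_fh);
    close $list_fh;

    open my $fasta_fh, '<', $fasta_file or die "Cannot open $fasta_file: $!";
    filter_fasta($fasta_fh, \*STDOUT, $wanted);
    close $fasta_fh;
}
```

Saved under a hypothetical name such as `extract.pl`, it could be run as `perl extract.pl file2.list file1.fasta > matched.fasta`. Because the IDs are looked up in a hash rather than matched as regular expressions, each record is printed at most once and partial ID matches are ignored.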
answered 2013-04-06T14:02:06.647