-3

给定 perl 脚本在“E”处剪切输入序列并跳过@nobreak 中提到的“E”的那些特定位置,并生成一个片段数组作为输出。但是我想要一个脚本,它为每个已跳过的位置在输出中生成一组这样的数组,并考虑到@nobreak 的所有位置。假设第 1 组包含在“E”37 处跳过后产生的片段,第 2 组在“E”45 处跳过后产生,依此类推。下面提到的我编写的脚本无法正常工作。我想在输出中生成 4 个不同的数组,一次占用一个 @nobreak 的位置。请帮忙!

my $s = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN';

print "Results of 1-Missed Cleavage:\n\n";

my @nobreak = (37, 45, 57, 59);
{
    @nobreak = map { $_ - 1 } @nobreak;

    foreach (@nobreak) {

        substr($s, $_, 1) = "\0";
    } 
    my @a   = split /E(?!P)/, $s;
    $_      =~ s/\0/E/g foreach (@a);
    $result = join "E,", @a; 
    @final  = split /,/, $result;
    print "@final\n";
}
4

3 回答 3

2

要在每个 'E' 处拆分字符串而不在过程中使用它,请使用后视:

my @final = split /(?<=E)/, $str;

为了更好地控制要拆分的“E”(您未指定),将对正则表达式进行更改。

如果需要变量后视,可以使用\K...

于 2012-09-11T14:47:38.750 回答
1

循环@nobreak?

my $s = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN';
print "Results of 1-Missed Cleavage:\n\n";
my @nobreak = (37,45,57,59);
for my $nobreak (@nobreak) {
    substr($s, $nobreak-1, 1) = "\0";
    my @a = split(/E(?!P)/, $s);
    substr($s, $nobreak-1, 1) = 'E';
    $_ =~ s/\0/E/g foreach (@a);
    $result = join ("E,", @a); 
    @final = split(/,/, $result);
    print "@final\n";
}
于 2012-09-11T08:43:20.157 回答
0

看起来您想在所有字符之后E拆分字符串,但不是在任何P字符之前

这段代码会做你想做的事。它的工作原理是将Eat 每个偏移量更改@nobreak为 an e(比"\0"调试要好得多)并拆分/(?<=E)(?!P)/- 即在 an 之后E但不在 a 之前Pe改回E之后使用tr/e/E/

use strict;
use warnings;

my $s = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN';

print "Results of 1-Missed Cleavage:\n\n";

my @nobreak = (37, 45, 57, 59);

for my $index (@nobreak) {
  my $ss = $s;
  substr($ss, $index-1, 1) = 'e';
  my @final = split /(?<=E)(?!P)/, $ss;
  tr/e/E/ for @final;
  print "$_\n" for @final;
  print "\n";
}

输出

Results of 1-Missed Cleavage:

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGE
RGFFYTPKTRRE
AE
DLQVGQVE
LGGGPGAGSLQPLALE
GSLQKRGIVE
QCCTSICSLYQLE
NYCN

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVE
ALYLVCGERGFFYTPKTRRE
AE
DLQVGQVE
LGGGPGAGSLQPLALE
GSLQKRGIVE
QCCTSICSLYQLE
NYCN

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVE
ALYLVCGE
RGFFYTPKTRREAE
DLQVGQVE
LGGGPGAGSLQPLALE
GSLQKRGIVE
QCCTSICSLYQLE
NYCN

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVE
ALYLVCGE
RGFFYTPKTRRE
AEDLQVGQVE
LGGGPGAGSLQPLALE
GSLQKRGIVE
QCCTSICSLYQLE
NYCN
于 2012-09-12T16:06:01.513 回答