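So I am trying to find the longest repeat of a given pattern. My code so far looks like this; it is fairly close, but it does not quite give the desired result: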
use warnings;
use strict;

my $DNA;
$DNA = "ATATCCCACTGTAGATAGATAGAATATATATATATCCCAGCT";
print "$DNA\n";
print "The longest AT repeat is " . longestRepeat($DNA, "AT") . "\n";
print "The longest TAGA repeat is " . longestRepeat($DNA, "TAGA") . "\n";
print "The longest C repeat is " . longestRepeat($DNA, "C") . "\n";

sub longestRepeat {
    my $someSequence = shift(@_);   # shift off the first argument from the list
    my $whatBP = shift(@_);         # shift off the second argument from the list
    my $match = 0;
    if ($whatBP eq "AT") {
        while ($someSequence =~ m/$whatBP/g) {
            $match = $match + 1;
        }
        return $match;
    }
    if ($whatBP eq "TAGA") {
        while ($someSequence =~ m/$whatBP/g) {
            $match = $match + 1;
        }
        return $match;
    }
    if ($whatBP eq "C") {
        while ($someSequence =~ m/$whatBP/g) {
            $match = $match + 1;
        }
        return $match;
    }
}
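All it does right now is count the TOTAL number of AT, TAGA, and C occurrences in the sequence. Instead of giving me only the length of the longest run, it adds them all up and gives me the total. I think something is wrong in the while loop, but I am not sure. Any help would be appreciated.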
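One possible direction, sketched under the assumption that "longest repeat" means the longest run of back-to-back copies of the pattern: rather than incrementing a counter once per match, let the regex consume a whole run at a time and keep the largest run seen. The name `longest_run_count` is hypothetical and not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not the original longestRepeat):
# returns how many times $unit repeats back to back in its longest run.
sub longest_run_count {
    my ($sequence, $unit) = @_;
    return 0 if !defined $unit || $unit eq '';
    my $best = 0;
    # (?:\Q$unit\E)+ matches one maximal run of consecutive copies;
    # \Q...\E keeps any regex metacharacters in $unit literal.
    while ($sequence =~ /((?:\Q$unit\E)+)/g) {
        my $copies = length($1) / length($unit);
        $best = $copies if $copies > $best;
    }
    return $best;
}

print longest_run_count("CCACCCAC", "C"), "\n";   # prints 3
```

Because the pattern is passed in as an argument, this version would not need a separate `if` branch per pattern. Runs are found left to right without overlap, which may matter for units that can overlap themselves.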
P.S. It should also display the longest repeat as a string rather than as a number (perhaps using substr here).
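For the string form, a minimal sketch along the lines of the `substr` idea: record where the longest run starts and how long it is, then cut it out of the sequence. The name `longest_run_string` is hypothetical, and the sketch assumes that ties go to the first run found.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original post):
# returns the longest run of consecutive copies of $unit as a string.
sub longest_run_string {
    my ($sequence, $unit) = @_;
    return '' if !defined $unit || $unit eq '';
    my ($best_start, $best_len) = (0, 0);
    while ($sequence =~ /(?:\Q$unit\E)+/g) {
        # $-[0] and $+[0] hold the start and end offsets of the last match.
        my $len = $+[0] - $-[0];
        ($best_start, $best_len) = ($-[0], $len) if $len > $best_len;
    }
    # With no match, $best_len is still 0 and this yields an empty string.
    return substr($sequence, $best_start, $best_len);
}

print longest_run_string("GTAGATAGATAGAAT", "TAGA"), "\n";   # prints TAGATAGATAGA
```

Dividing the length of the returned string by the length of the pattern gives the repeat count, so a single function could serve both needs.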