我开始深入研究 Perl,但是在编写“Perl-ly”代码而不是用 Perl 编写 C 时遇到了麻烦。如何更改以下代码以使用更多 Perl 习语,我应该如何学习这些习语?
只是对它在做什么的解释:此例程是对齐 DNA 或氨基酸序列的模块的一部分(如果您关心此类事情,请使用 Needelman-Wunch)。它创建了两个二维数组,一个用于存储两个序列中每个位置的分数,另一个用于跟踪路径,以便以后可以重新创建得分最高的对齐方式。它工作得很好,但我知道我做的事情不是很简洁明了。
编辑:这是为了分配。我完成了它,但想稍微清理一下我的代码。如果您有兴趣,可以在课程网站上找到有关实现该算法的详细信息。
sub create_matrix {
my $self = shift;
#empty array reference
my $matrix = $self->{score_matrix};
#empty array ref
my $path_matrix = $self->{path_matrix};
#$seq1 and $seq2 are strings set previously
my $num_of_rows = length($self->{seq1}) + 1;
my $num_of_columns = length($self->{seq2}) + 1;
#create the 2d array of scores
for (my $i = 0; $i < $num_of_rows; $i++) {
push(@$matrix, []);
push(@$path_matrix, []);
$$matrix[$i][0] = $i * $self->{gap_cost};
$$path_matrix[$i][0] = 1;
}
#fill out the first row
for (my $i = 0; $i < $num_of_columns; $i++) {
$$matrix[0][$i] = $i * $self->{gap_cost};
$$path_matrix[0][$i] = -1;
}
#flag to signal end of traceback
$$path_matrix[0][0] = 2;
#double for loop to fill out each row
for (my $row = 1; $row < $num_of_rows; $row++) {
for (my $column = 1; $column < $num_of_columns; $column++) {
my $seq1_gap = $$matrix[$row-1][$column] + $self->{gap_cost};
my $seq2_gap = $$matrix[$row][$column-1] + $self->{gap_cost};
my $match_mismatch = $$matrix[$row-1][$column-1] + $self->get_match_score(substr($self->{seq1}, $row-1, 1), substr($self->{seq2}, $column-1, 1));
$$matrix[$row][$column] = max($seq1_gap, $seq2_gap, $match_mismatch);
#set the path matrix
#if it was a gap in seq1, -1, if was a (mis)match 0 if was a gap in seq2 1
if ($$matrix[$row][$column] == $seq1_gap) {
$$path_matrix[$row][$column] = -1;
}
elsif ($$matrix[$row][$column] == $match_mismatch) {
$$path_matrix[$row][$column] = 0;
}
elsif ($$matrix[$row][$column] == $seq2_gap) {
$$path_matrix[$row][$column] = 1;
}
}
}
}