I am trying to write a program, but I cannot work out what is wrong with my code.
I have two files, shown below:
File 1
Z 712 8571 + A
X 712 714 + A
Y 8569 8571 + A
Z 24137 24264 + B
X 24137 24139 + B
Z 24322 24391 + B
Z 24490 26064 + B
Y 26062 26064 + B
Z 26704 26740 + C
X 26704 26706 + C
Z 26814 27170 + C
Z 27257 27978 + C
Y 27976 27978 + C
Z 30488 32170 + D
X 30488 30490 + D
Y 32168 32170 + D
Z 32689 32811 + E
X 32689 32691 + E
Z 33038 33259 + E
Z 33309 35147 + E
Y 35145 35147 + E
and
File 2
1 A
2 B
3 C
4 D
5 E
Here is my code. I cannot tell what is wrong with it.
use strict;
if (@ARGV != 2) {
    print "Invalid arguments";
    print "Usage: perl code.pl [file1][file2]";
    exit(0);
}
my $FILE1 = $ARGV[0];
my %data1 = ();
my $xyz = "";
my $z_id = 0;
my $start = 0;
my $end = 0;
my $positive = "";
my $letter = "";
my $FILE2 = $ARGV[1];
my %data2 = ();
my $alphabet_id = 0;
my $alphabet = "";
open (FILE1DATA, $FILE1);
open (FILE2DATA, $FILE2);
while (my $fileline1 = <FILE1DATA>) {
    chomp $fileline1;
    my @line1 = split /\t/, $fileline1;
    $xyz = $line1[0];
    if ($xyz eq "Z") {$z_id++;}
    $start = $line1[1];
    $end = $line1[2];
    $positive = $line1[3];
    $letter = $line1[4];
    $data1{$letter}{ZID} = $z_id;
    $data1{$letter}{XYZ} = $xyz;
    $data1{$letter}{START} = $start;
    $data1{$letter}{ENDD} = $end;
    $data1{$letter}{POSTIVE} = $positive;
    $data1{$letter}{LETTER} = $letter;
    while (my $fileline2 = <FILE2DATA>) {
        chomp $fileline2;
        my @line2 = split /\t/, $fileline2;
        $alphabet_id = $line2[0];
        $alphabet = $line2[1];
        $data2{$alphabet}{ID} = $alphabet_id;
        $data2{$alphabet}{ALPHA} = $alphabet;
        foreach (%data2) {
            foreach ($data1{$letter}{LETTER}) {
                if ($data1{$letter}{LETTER} eq $data2{$alphabet}{ALPHA}){
                    $data1{$letter}{XYZ} = $data2{$alphabet}{ID};
                }
            }
        }
    }
    print $data1{$letter}{ZID},"\t",$data1{$letter}{XYZ},"\t",$data1{$letter}{START},"\t",$data1{$letter}{ENDD},"\t",$data1{$letter}{POSTIVE},"\n";
}
close (FILE1DATA);
close (FILE2DATA);
exit;
If I run this, the result looks like this:
1 1 712 8571 +
1 X 712 714 +
1 Y 8569 8571 +
2 Z 24137 24264 +
2 X 24137 24139 +
3 Z 24322 24391 +
4 Z 24490 26064 +
4 Y 26062 26064 +
5 Z 26704 26740 +
5 X 26704 26706 +
6 Z 26814 27170 +
7 Z 27257 27978 +
7 Y 27976 27978 +
8 Z 30488 32170 +
8 X 30488 30490 +
8 Y 32168 32170 +
9 Z 32689 32811 +
9 X 32689 32691 +
10 Z 33038 33259 +
11 Z 33309 35147 +
11 Y 35145 35147 +
But it should look like this:
1 1 712 8571 +
1 X 712 714 +
1 Y 8569 8571 +
2 2 24137 24264 +
2 X 24137 24139 +
3 2 24322 24391 +
4 2 24490 26064 +
4 Y 26062 26064 +
5 3 26704 26740 +
5 X 26704 26706 +
6 3 26814 27170 +
7 3 27257 27978 +
7 Y 27976 27978 +
8 4 30488 32170 +
8 X 30488 30490 +
8 Y 32168 32170 +
9 5 32689 32811 +
9 X 32689 32691 +
10 5 33038 33259 +
11 5 33309 35147 +
11 Y 35145 35147 +
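In other words, every time a "Z" line is encountered it should get a new id (X and Y lines are skipped and keep the current id), and when the letter in the last column of file 1 matches a letter in file 2, the alphabet_id from file 2 should replace the "Z" in file 1.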
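
The rule described above can be sketched as follows. One likely cause of the wrong output is that the inner `while` loop over FILE2DATA reads file 2 to the end while the first line of file 1 is being processed, so no later line of file 1 ever sees file 2 again. This is a minimal sketch, not a definitive fix. It assumes both files are tab-separated (as the `split /\t/` in the code implies) and loads file 2 into a lookup hash before reading file 1. The helper names `load_ids` and `relabel` are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read "id<TAB>letter" lines from a filehandle
# and return a hash reference mapping letter => id.
sub load_ids {
    my ($fh) = @_;
    my %id_for;
    while (my $line = <$fh>) {
        chomp $line;
        my ($id, $letter) = split /\t/, $line;
        next unless defined $letter;    # skip blank or malformed lines
        $id_for{$letter} = $id;
    }
    return \%id_for;
}

# Hypothetical helper: turn one line of file 1 into one output line.
# $z_id_ref is a reference to the running Z counter, which is
# incremented only on "Z" lines; X and Y lines reuse the current value.
sub relabel {
    my ($line, $id_for, $z_id_ref) = @_;
    my ($xyz, $start, $end, $strand, $letter) = split /\t/, $line;
    if ($xyz eq 'Z') {
        $$z_id_ref++;
        # Replace "Z" with the id from file 2 when the letter is known.
        $xyz = $id_for->{$letter} if exists $id_for->{$letter};
    }
    return join "\t", $$z_id_ref, $xyz, $start, $end, $strand;
}

# Run as: perl code.pl file1 file2
if (@ARGV == 2) {
    my ($file1, $file2) = @ARGV;

    # Read file 2 completely first, so every line of file 1 can use it.
    open my $fh2, '<', $file2 or die "Cannot open $file2: $!";
    my $id_for = load_ids($fh2);
    close $fh2;

    open my $fh1, '<', $file1 or die "Cannot open $file1: $!";
    my $z_id = 0;
    while (my $line = <$fh1>) {
        chomp $line;
        next unless length $line;
        print relabel($line, $id_for, \$z_id), "\n";
    }
    close $fh1;
}
```

Reading file 2 once into a hash avoids re-reading it for each line of file 1, and it turns the letter comparison into a single hash lookup. Only "Z" lines are rewritten, which matches the expected output shown above.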