I am trying to write a program, but I cannot work out what is wrong with my code.

I have two files, shown below:

File 1

Z   712 8571    +   A
X   712 714 +   A
Y   8569    8571    +   A
Z   24137   24264   +   B
X   24137   24139   +   B
Z   24322   24391   +   B
Z   24490   26064   +   B
Y   26062   26064   +   B
Z   26704   26740   +   C
X   26704   26706   +   C
Z   26814   27170   +   C
Z   27257   27978   +   C
Y   27976   27978   +   C
Z   30488   32170   +   D
X   30488   30490   +   D
Y   32168   32170   +   D
Z   32689   32811   +   E
X   32689   32691   +   E
Z   33038   33259   +   E
Z   33309   35147   +   E
Y   35145   35147   +   E

File 2

1 A
2 B
3 C
4 D
5 E

Here is my code. I cannot tell what is wrong with it or where the problem lies.

use strict;

if (@ARGV != 2) {
    print "Invalid arguments";
    print "Usage: perl code.pl [file1][file2]";
    exit(0);
}

my $FILE1 = $ARGV[0];
    my %data1 = ();
    my $xyz = "";
    my $z_id = 0;
    my $start = 0;
    my $end = 0;
    my $positive = "";
    my $letter = "";

my $FILE2 = $ARGV[1];
    my %data2 = ();
    my $alphabet_id = 0;
    my $alphabet = "";

open (FILE1DATA, $FILE1);
open (FILE2DATA, $FILE2);

while (my $fileline1 = <FILE1DATA>) {
    chomp $fileline1;

    my @line1 = split /\t/, $fileline1;
    $xyz = $line1[0];
    if ($xyz eq "Z") {$z_id++;}
    $start = $line1[1];
    $end = $line1[2];
    $positive = $line1[3];
    $letter = $line1[4];

    $data1{$letter}{ZID} = $z_id;
    $data1{$letter}{XYZ} = $xyz;
    $data1{$letter}{START} = $start;
    $data1{$letter}{ENDD} = $end;
    $data1{$letter}{POSTIVE} = $positive;
    $data1{$letter}{LETTER} = $letter;

    while (my $fileline2 = <FILE2DATA>) {
        chomp $fileline2;

        my @line2 = split /\t/, $fileline2;
        $alphabet_id    = $line2[0];
        $alphabet = $line2[1];
        $data2{$alphabet}{ID} = $alphabet_id;
        $data2{$alphabet}{ALPHA} = $alphabet;
        foreach (%data2) {
            foreach ($data1{$letter}{LETTER}) {
                if ($data1{$letter}{LETTER} eq $data2{$alphabet}{ALPHA}){
                    $data1{$letter}{XYZ} = $data2{$alphabet}{ID};
                }
            }   
        }
    }
    print $data1{$letter}{ZID},"\t",$data1{$letter}{XYZ},"\t",$data1{$letter}{START},"\t",$data1{$letter}{ENDD},"\t",$data1{$letter}{POSTIVE},"\n";
}

close (FILE1DATA);
close (FILE2DATA);

exit;

When I run it, the result looks like this:

1   1   712 8571    +
1   X   712 714 +
1   Y   8569    8571    +
2   Z   24137   24264   +
2   X   24137   24139   +
3   Z   24322   24391   +
4   Z   24490   26064   +
4   Y   26062   26064   +
5   Z   26704   26740   +
5   X   26704   26706   +
6   Z   26814   27170   +
7   Z   27257   27978   +
7   Y   27976   27978   +
8   Z   30488   32170   +
8   X   30488   30490   +
8   Y   32168   32170   +
9   Z   32689   32811   +
9   X   32689   32691   +
10  Z   33038   33259   +
11  Z   33309   35147   +
11  Y   35145   35147   +

But it should look like this:

1   1   712 8571    +   
1   X   712 714 +
1   Y   8569    8571    +   
2   2   24137   24264   +   
2   X   24137   24139   +   
3   2   24322   24391   +   
4   2   24490   26064   +   
4   Y   26062   26064   +
5   3   26704   26740   +
5   X   26704   26706   +
6   3   26814   27170   +   
7   3   27257   27978   +
7   Y   27976   27978   +   
8   4   30488   32170   +   
8   X   30488   30490   +   
8   Y   32168   32170   +   
9   5   32689   32811   +   
9   X   32689   32691   +   
10  5   33038   33259   +   
11  5   33309   35147   +   
11  Y   35145   35147   +

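In other words, every "Z" line should receive a new, incrementing id (X and Y lines do not advance the counter and keep the id of the preceding Z), and when the letter in the last column of file 1 matches a letter in file 2, the "Z" in file 1 should be replaced with the corresponding alphabet_id from file 2.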

2 Answers

Here is a different approach that produces the result you want from your data set. Some of the lines are commented:

use strict;
use warnings;

@ARGV == 1 or die "Invalid usage. Usage: perl $0 [file]";

my ( $z_id, $val ) = ( 0, 0 );

while (<>) {
    # Capture then delete last column letter and calc Z replacement val
    s/\s+(\w)$/$val = ( ord $1 ) - 64; ''/e;

    # Replace Z w/val and inc Z count
    $z_id++ if s/^Z/$val/;

    print "$z_id\t$_";
}

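The diamond operator (<>) reads the files whose names are passed in via @ARGV. Note that "Z" is replaced by a value from 1 to 5 as part of a substitution. The expression ( ord $1 ) - 64 evaluates to 1 when $1 is "A", to 2 when it is "B", and so on, so only the main data file (your file 1) is needed. If the Z substitution s/^Z/$val/ succeeds, $z_id is incremented.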
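To make the two substitutions easier to follow, here is a minimal sketch that applies them to single lines held in memory. The helper names letter_to_id and rewrite_line are hypothetical and not part of the answer above. It assumes the last column is a single uppercase ASCII letter and that file 2 numbers the letters consecutively starting at A = 1, which is true of the sample data but is not guaranteed in general.

```perl
use strict;
use warnings;

# Hypothetical helper: map 'A' => 1, 'B' => 2, and so on.
# Assumes a single uppercase ASCII letter.
sub letter_to_id {
    my ($letter) = @_;
    return ord($letter) - ord('A') + 1;    # ord('A') is 65, so this is ord - 64
}

# Hypothetical helper: apply both substitutions to one line (no newline).
# Returns the rewritten line and the updated Z counter.
sub rewrite_line {
    my ( $line, $z_id ) = @_;
    my $val;

    # Capture the trailing letter, compute its id, and drop that column.
    $line =~ s/\s+([A-Z])$/$val = letter_to_id($1); ''/e;

    # Advance the counter only when a leading Z was actually replaced.
    $z_id++ if defined $val && $line =~ s/^Z/$val/;

    return ( $line, $z_id );
}

my $z_id = 0;
for my $line ( "Z\t24137\t24264\t+\tB", "X\t24137\t24139\t+\tB" ) {
    ( my $out, $z_id ) = rewrite_line( $line, $z_id );
    print "$z_id\t$out\n";
}
```

If the letters in file 2 were not numbered in alphabetical order, the ord arithmetic would give the wrong ids and a lookup table built from file 2 would be needed instead.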

Hope this helps!

answered 2012-12-22T04:57:05.853

This may do what you want:

use strict;
use warnings;

if (@ARGV != 2) {
    print "Invalid arguments\n";
    print "Usage: perl code.pl [file1] [file2]\n";
    exit(1);
}

my $FILE1 = $ARGV[0];
    my %data1 = ();
    my $xyz = "";
    my $z_id = 0;
    my $start = 0;
    my $end = 0;
    my $positive = "";
    my $letter = "";

my $FILE2 = $ARGV[1];
    my %data2 = ();
    my $alphabet_id = 0;
    my $alphabet = "";

open (FILE1DATA, $FILE1) or die "Cannot open $FILE1: $!";
open (FILE2DATA, $FILE2) or die "Cannot open $FILE2: $!";

# Read all of file 2 into a lookup table before processing file 1
while (my $fileline2 = <FILE2DATA>) {
  chomp $fileline2;

  my @line2 = split /\t/, $fileline2;
  $alphabet_id    = $line2[0];
  $alphabet = $line2[1];
  $data2{$alphabet}{ID} = $alphabet_id;
  $data2{$alphabet}{ALPHA} = $alphabet;
}


while (my $fileline1 = <FILE1DATA>) {
  chomp $fileline1;

  my @line1 = split /\t/, $fileline1;
  $xyz = $line1[0];
  $start = $line1[1];
  $end = $line1[2];
  $positive = $line1[3];
  $letter = $line1[4];

  if ($xyz eq 'Z') {
    $xyz = $data2{$letter}{ID};
    $z_id++;
  }
  print "$z_id\t$xyz\t$start\t$end\t$positive\n";
}


close (FILE1DATA);
close (FILE2DATA);

exit;
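
The reason the code in the question only replaced the first "Z" is that its inner while loop over FILE2DATA runs to end-of-file during the first pass of the outer loop, so for every later line of file 1 there is nothing left to read. Reading file 2 into a hash once, before file 1 is processed, avoids that. The following is a compact sketch of the same idea, not a drop-in replacement: the names load_ids, number_records and %id_for are hypothetical, and in-memory filehandles stand in for the two real files so that the example runs on its own. It assumes both inputs are tab-separated.

```perl
use strict;
use warnings;

# Read "id<TAB>letter" lines into a hash keyed by letter.
sub load_ids {
    my ($fh) = @_;
    my %id_for;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $id, $letter ) = split /\t/, $line;
        $id_for{$letter} = $id;
    }
    return \%id_for;
}

# Stream the records: number each Z line and replace Z with the looked-up id.
sub number_records {
    my ( $fh, $id_for ) = @_;
    my $z_id = 0;
    my @out;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $type, $start, $end, $strand, $letter ) = split /\t/, $line;
        if ( $type eq 'Z' ) {
            $z_id++;
            # Assumption: leave Z unchanged when the letter has no entry.
            $type = $id_for->{$letter} if exists $id_for->{$letter};
        }
        push @out, join "\t", $z_id, $type, $start, $end, $strand;
    }
    return @out;
}

# In-memory stand-ins for file 2 and file 1.
my $file2 = "1\tA\n2\tB\n";
my $file1 = "Z\t712\t8571\t+\tA\nX\t712\t714\t+\tA\nZ\t24137\t24264\t+\tB\n";

open my $fh2, '<', \$file2 or die "Cannot open in-memory file 2: $!";
my $id_for = load_ids($fh2);
close $fh2;

open my $fh1, '<', \$file1 or die "Cannot open in-memory file 1: $!";
print "$_\n" for number_records( $fh1, $id_for );
close $fh1;
```

With real files, the two in-memory opens would be replaced by three-argument opens on the names taken from @ARGV.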
answered 2012-12-22T05:36:40.637