perl - Nested while loops comparing two files: the outer loop stops after one iteration, but every line needs to be compared

Question

First of all, I apologise if my formatting is incorrect. I am very new to writing scripts (3 days), and this is my first post on this site.

I have two tab-separated files: File a contains 14 columns and File b contains 8 columns.

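One of the columns in File b holds a numeric value that relates to a range of numbers generated from two numeric fields in File a. For every line in File a, I need to search through File b and print a combination of data from fields in both files. Because a range of numbers is accepted, there will be multiple matches for each line of File a.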

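The code I have created does exactly what I want, but only for the first line of File a, and it doesn't continue the loop. I have looked all over the internet, and I believe it may have something to do with the fact that both files are read from standard input. I have tried to correct this, but I can't seem to get anything to work.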

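My current understanding is that by changing one file to be read from a different file descriptor, my loop might work, with something like >$3, but I don't really understand this despite my research. Or possibly by using the grep function, which I am also struggling with.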

Here is an outline of the code I am using now:

use strict;  
use warnings;

print "which file read from?\n";
my $filea = <STDIN>;  
chomp $filea;  
{
  unless (open ( FILEA, $filea)) {
      print "cannot open, do you want to try again? y/n?\n";
      my $again = <STDIN>;
      chomp $again;
      if ($again =~ 'n') {
          exit;  
      } else {
          print "\n";   
          $filea = <STDIN>;  
          chomp $filea;  
          redo;  
      }
   }
}

#I also open fileb the same way, but won't write it all out to save space and your time.

my $output = 'output.txt';
open (OUTPUT, ">>$output");

while (my $loop1 = <FILEA>) {  
    chomp $loop1;
    ( my $var1, my $var2, my $var3, my $var4, my $var5, my $var6, 
      my $var7, my $var8, my $var9, my $var10, my $var11, my $var12, 
      my $var13, my $var14 ) = split ( "\t", $loop1);

  #create the range of number which needs to be matched from file b.
  my $length = length ($var4);  
  my $range = ($var2 + $length);

  #perform the search loop through fileb
  while (my $loop2 = <FILEB>) {
      chomp $loop2;
    ( my $vala, my $valb, my $valc, my $vald, my $vale, my $valf, 
      my $valg) = split ( "\t", $loop2 );

    #there are then several functions and additions of the data, which all work basically, so I'll just use a quick example.

    if ($vald >= $var3 && $vald <= $range) {
        print OUTPUT "$var1, $vald, $var11, $valf, $vala, $var5 \n";
    }
  }
}

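I hope this all makes sense; I have tried to make everything as clear as I can. I would be grateful if anyone could help me edit the code so that the loop continues through all of filea.

If possible, please explain what you have done. Ideally, I would love it if this result could be obtained without changing the code too much.

Thank you all!!!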

score 2 · Accepted Answer

Avoid bareword filehandles wherever possible; use a lexical $fh (filehandle) instead of FH.

You can use until instead of unless, and skip the redo:

print "Enter the file name\n";
my $file_a = <STDIN>;
chomp $file_a;
my $fh_a;
until(open $fh_a, '<', $file_a) {
    print "Re-enter the file name or 'n' to cancel\n";
    $file_a = <STDIN>;
    chomp $file_a;
    if($file_a eq 'n') {
        exit;
    }
}
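
Since file B is opened the same way, the prompt loop could be wrapped in a small subroutine and called once per file. This is only a minimal sketch: the name prompt_open and its parameters are hypothetical, not from the original code, and the input and output handles are passed in as arguments purely so the sub can be tried without a terminal.

```perl
use strict;
use warnings;

# Hypothetical helper: keep asking for a file name on $in until one opens.
# Returns the opened filehandle, or nothing if the user types 'n'
# (or the input runs out).
sub prompt_open {
    my ($in, $out, $label) = @_;
    print {$out} "Enter the name of $label\n";
    while (defined(my $name = <$in>)) {
        chomp $name;
        return if $name eq 'n';
        # Three-argument open with a lexical filehandle.
        if (open my $fh, '<', $name) {
            return $fh;
        }
        print {$out} "Cannot open '$name': $!\n";
        print {$out} "Re-enter the file name or 'n' to cancel\n";
    }
    return;
}
```

It would then be called as, for example, my $fh_a = prompt_open(\*STDIN, \*STDOUT, 'file A') or exit; and again for file B.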

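You can (and should) use an array instead of all those individual column variables: my @cols_a = split /\t/, $line;
You should read file B into an array once, then search that array each time you need it: my @file_b = <$fh_b>; This is what fixes the original problem: the inner while loop reads file B to end-of-file during the first pass, so for every later line of file A there is nothing left to read.

The result will look like this: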
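
To see why only the first line of file A produced output, note that a filehandle remembers its position: once file B has been read to end-of-file, later reads return nothing until the handle is rewound. The sketch below demonstrates this with an in-memory filehandle; the sample data and the helper name count_remaining are made up for illustration. Rewinding with seek is an alternative to reading into an array, not the approach recommended in this answer. It stays closest to the original code, but rereads file B for every line of file A.

```perl
use strict;
use warnings;

# Made-up stand-in for file B, opened as an in-memory filehandle.
my $data_b = "b1\nb2\nb3\n";
open my $fh_b, '<', \$data_b or die "Cannot open in-memory file: $!";

# Count how many lines are still readable from the handle's current position.
sub count_remaining {
    my ($fh) = @_;
    my $n = 0;
    while (defined(my $line = <$fh>)) {
        $n++;
    }
    return $n;
}

my $first_pass  = count_remaining($fh_b);    # reads every line; handle is now at EOF
my $second_pass = count_remaining($fh_b);    # nothing left to read

seek $fh_b, 0, 0 or die "Cannot rewind: $!"; # back to the start of the file
my $after_seek  = count_remaining($fh_b);    # the lines are readable again

print "first: $first_pass, second: $second_pass, after seek: $after_seek\n";
# prints "first: 3, second: 0, after seek: 3"
```

In the question's code, the equivalent minimal change would be a seek FILEB, 0, 0; immediately before the inner while loop.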

#Assume we have opened both files already . . .
my @file_b = <$fh_b>;
chomp @file_b;
while(my $line = <$fh_a>) {
    chomp $line;
    my @cols_a = split /\t/, $line;
    #Remember, arrays in most languages (Perl included) are zero-indexed,
    #so $cols_a[1] is actually the SECOND column.
    my $range = ($cols_a[1] + length $cols_a[3]);

    foreach my $line_b (@file_b) {
        #This loop will run once for every single line of file A.
        #Not efficient, but it will work.
        #There are, of course, lots of optimisations you can make
        #(starting with, for example, storing file B as an array of array
        #references so you don't have to split each line every time)
        my @cols_b = split /\t/, $line_b;
        #Same test as the question's example; $range is already the upper bound.
        if($cols_b[3] >= $cols_a[2] && $cols_b[3] <= $range) {
            #Do whatever here
        }
    }
}
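
The optimisation mentioned in the comments, splitting file B once and keeping each row as an array reference, might look like the sketch below. The sub names load_rows and find_matches, the sample rows, and the choice of printed columns are illustrative assumptions. The range test follows the question's example: column 4 of file B must lie between column 3 of file A and column 2 plus the length of column 4.

```perl
use strict;
use warnings;

# Split every line of file B once, up front, into an array of array references.
sub load_rows {
    my ($fh) = @_;
    my @rows;
    while (my $line = <$fh>) {
        chomp $line;
        push @rows, [ split /\t/, $line ];
    }
    return \@rows;
}

# Return the rows of file B whose 4th column falls inside the range
# derived from one already-split line of file A.
sub find_matches {
    my ($cols_a, $rows_b) = @_;
    my $upper = $cols_a->[1] + length $cols_a->[3];
    return grep { $_->[3] >= $cols_a->[2] && $_->[3] <= $upper } @$rows_b;
}

# Made-up sample data standing in for file B (4 tab-separated columns).
my $data_b = "b1\tx\ty\t105\nb2\tx\ty\t250\nb3\tx\ty\t110\n";
open my $fh_b, '<', \$data_b or die "Cannot open in-memory file: $!";
my $rows_b = load_rows($fh_b);

# One made-up line of file A, already split: upper bound is 100 + 10 = 110.
my @cols_a = ('a1', 100, 102, 'ACGTACGTAC');
for my $row (find_matches(\@cols_a, $rows_b)) {
    print join(', ', $cols_a[0], $row->[3], $row->[0]), "\n";
}
```

With real files, load_rows would be called once on $fh_b before the loop over file A, and find_matches once per line of file A.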
