regex - Using two arrays, I need to check whether elements of one appear in the other array and print the matched elements separately

Question

F1.txt

bob
tom
harry

F2.txt

bob
a=1   b=2   c=3
bob
d=4   e=5   f=6
tom
a1=34  b1=32  c1=3443
tom
a2=534  b2=732  c2=673443

Result:

A1.txt

bob
a=1   b=2   c=3
bob
d=4   e=5   f=6

A2.txt

tom
a1=34  b1=32  c1=3443
tom
a2=534  b2=732  c2=673443

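I am new to Perl; can you help me with my problem? I have two files, F1.txt and F2.txt, shown above. My task is to search F2.txt for each element of F1.txt and print the matching line together with the line that follows it. When an element is found, the result must be saved in a new file: in the example, A1.txt stores all the information about bob and A2.txt stores all the information about tom. So far I have tried this code, but it does not work properly: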

use strict; 
use warnings;  

my $line1; 
my $line2; 
my $fh; 
my $fh1; 
my $counter;  

open  $fh, "<", "F1.txt" or die $!; 
open  $fh1, "<", "F2.txt" or die $!;  

my @b = <$fh>; 
my @a = <$fh1>;  

for (@b) 
{     
  $line1 = $_;     

  for (@a)     
  {         
    $line2 = $_;         
    if ($line1 =~ /^$line2$/)         
    { 
      $counter++;             
      open my $outfile, ">>", "A_${counter}.txt";             
      print $outfile $line2;             
      close $outfile;         
    }
  }
}

score 5 · Accepted Answer

Whenever you are checking for duplicate elements, think hash. For example, suppose you have two files:

 File #1      File #2
 Bob          Tom
 Ted          Dick
 Alice        Harry
 Carol        Ted

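If your job is to find the names in file #2 that are also in file #1, you can store the names from file #1 in a hash, and then, as you go through file #2, check whether each name matches a key in your hash.

First, let's read in file #1: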

 use strict;
 use warnings;
 use autodie;  #This way, I don't have to check open statements

 open my $file_1, "<", "file_1";
 my %first_file_name_hash;
 while ( my $name = <$file_1> ) {
    chomp $name;
    $first_file_name_hash{$name} = 1;
 }
 close $file_1;

Now %first_file_name_hash contains all the names in file #1.

Next, let's open file #2 and check it:

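open my $file_2, "<", "file_2";
while ( my $name = <$file_2> ) {
   chomp $name;                              # Strip the newline so the key matches
   if ( $first_file_name_hash{$name} ) {     # True only for names seen in file #1
       print "User is in file #1 and file #2\n";
   }
}
close $file_2;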

Yes, this isn't what you want, but it gives you an idea of how to store values in a hash.
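The same idea can be tried without any files. Below is a minimal sketch using the names from the two-column example above; the helper name `names_in_both` and the array names are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

# Names taken from the File #1 / File #2 example table.
my @file_1_names = qw(Bob Ted Alice Carol);
my @file_2_names = qw(Tom Dick Harry Ted);

# Hypothetical helper: return the names from the second list
# that also appear in the first list.
sub names_in_both {
    my ( $first, $second ) = @_;
    my %seen = map { $_ => 1 } @$first;    # One hash entry per name in the first list
    return grep { $seen{$_} } @$second;    # Keep only names found in the hash
}

print "$_ is in file #1 and file #2\n"
    for names_in_both( \@file_1_names, \@file_2_names );
```

With this sample data only Ted appears in both lists, so a single line is printed.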

A hash has keys that are associated with values. Every entry must have a unique key, but different entries may hold the same value. Here's a simple hash:

 $hash{BOB} = "New York";
 $hash{CAROL} = "New York";
 $hash{TED} = "Los Angeles";
 $hash{ALICE} = "Chicago";

In the above, both $hash{BOB} and $hash{CAROL} have the same value (New York). However, there can be only one BOB key and only one CAROL key in the hash.

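The big advantage of a hash is that it is very easy to access an element by its key. If you know the key, you can pull up the element directly.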
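As a small illustration of lookup by key, the sketch below rebuilds the city hash from above and checks for keys with `exists`. The helper name `city_of` is hypothetical and is not part of the program discussed in this answer.

```perl
use strict;
use warnings;

my %hash = (
    BOB   => "New York",
    CAROL => "New York",
    TED   => "Los Angeles",
    ALICE => "Chicago",
);

# Hypothetical helper: return the city stored for a name,
# or undef when the name is not a key in the hash.
sub city_of {
    my ($name) = @_;
    return exists $hash{$name} ? $hash{$name} : undef;
}

for my $name (qw(BOB HARRY)) {
    my $city = city_of($name);
    print defined $city
        ? "$name lives in $city\n"
        : "$name is not in the hash\n";
}
```

The lookup needs no loop over the entries: the key alone is enough to find the value or to learn that it is missing.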

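In your case, I would use two hashes. In the first, %HASH_1, I would store the name of each person in the first file. In the second, %HASH_2, I would store the name of each person in the second file, and as the value of each %HASH_2 entry I would store the line that follows the name in the file.

That would give you: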

$HASH_1{bob} = 1;
$HASH_1{tom} = 1;
$HASH_1{harry} = 1; 

$HASH_2{bob} = a=1  b=2  c=3
               d=4  e=5  f=6
$HASH_2{tom} = a1=34  b1=32  c1=3443
               a2=534   b2=732   c2=673443

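Note: each entry in a hash is limited to a single value, so when two or more lines share the key bob, you have to work out how to handle them. In this case, if the key already exists in %HASH_2, I simply append the new value to the existing one after a newline (NL).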

In modern Perl you can store arrays in a hash, but you are a beginner Perl programmer, so we will stick with simpler techniques.
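For reference only, the array-in-a-hash approach mentioned above might look roughly like the sketch below. It uses in-memory sample data in place of F2.txt, and the helper name `group_by_name` is hypothetical.

```perl
use strict;
use warnings;

# Sample data standing in for F2.txt: a name line followed by a data line.
my @lines = (
    "bob", "a=1   b=2   c=3",
    "bob", "d=4   e=5   f=6",
    "tom", "a1=34  b1=32  c1=3443",
);

# Hypothetical helper: group each data line under the name that precedes it.
sub group_by_name {
    my @input = @_;
    my %grouped;
    while ( @input >= 2 ) {
        my ( $name, $value ) = splice @input, 0, 2;    # Take one name/data pair
        push @{ $grouped{$name} }, $value;             # Array reference is created on first use
    }
    return %grouped;
}

my %hash_2 = group_by_name(@lines);
for my $name ( sort keys %hash_2 ) {
    print "$name has ", scalar @{ $hash_2{$name} }, " line(s)\n";
}
```

Each hash value here is a reference to an array of lines, so nothing has to be joined and split again. The rest of this answer uses the simpler newline-joined string instead.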

Here's a completely untested program:

use strict;
use warnings;
use autodie;
use feature qw(say);   #Better print than print

# Read in File #1
open my $file_1, "<", "F1.txt";
my %hash_1;
while ( my $name = <$file_1> ) {
   chomp $name;
   $hash_1{$name} = 1;
}
close $file_1;

# Read in File #2 -- a bit trickier

open my $file_2, "<", "F2.txt";
my %hash_2;
while ( my $name = <$file_2> ) {
    chomp $name;
    my $value = <$file_2>;              #The next line
    last if not defined $value;         #Name with no data line after it
    chomp $value;
    next if not exists $hash_1{$name};  #Not interested if it's not in File #1
    if (exists $hash_2{$name}) {        #We've seen this before!
        $hash_2{$name} = $hash_2{$name} . "\n" . $value; #Appending value
    }
    else {
        $hash_2{$name} = $value;
    }
 }
 close $file_2;

Now we have the data we want. %hash_2 contains everything you need. I'm just not sure how you want to print it out, but it would be something like this:

 my $counter = 1;    #Used for file numbering..
 foreach my $name (sort keys %hash_2) {
    open my $file, ">", "A" . $counter . ".txt";
    say $file "$name";     #Name of person
    my @lines = split /\n/, $hash_2{$name};  #Lines in our hash value
    foreach my $line (@lines) {
      say $file "$line";
    }
    close $file;
    $counter++;
 }

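Note that by using hashes I avoided the double for loop, which ends up taking a lot of time on larger files. I only go through three loops: the first two read in each file, and the last one goes through the hash for the second file and prints it out.

Using a hash is a good way of keeping track of data you have already read.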
