perl - Check whether a pattern exists in a file

Question

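I have a very simple Perl question about pattern matching. I am reading a file (fileA) that contains a list of names. I want to check whether any of these names exists in another file (fileB).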

if ($name -e $fileB){
    do something
}else{
    do something else
}

Is there a way to check whether a pattern exists in a file? I tried:

open(IN, $controls) or die "Can't open the control file\n";
    while(my $line = <IN>){
            if ($name =~ $line ){
                    print "$name\tfound\n";
            }else{
                    print "$name\tnotFound\n";
            }
    }

This keeps repeating, because it checks and prints for every line instead of checking whether the name exists at all.

score 1 · Accepted Answer

To check whether a pattern exists in a file, you have to open the file and read its content. The fastest way to check which entries of one list appear in another is to store the content of one list in a hash:

#!/usr/bin/perl
use strict;
use warnings;

open my $LST, '<', 'fileA' or die "fileA: $!\n";
open my $FB,  '<', 'fileB' or die "fileB: $!\n";

my %hash;
while (<$FB>) {
    chomp;
    undef $hash{$_};
}

while (<$LST>) {
    chomp;
    if (exists $hash{$_}) {
        print "$_ exists in fileB.\n";
    }
}
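
The question also asked for a "notFound" branch. A minimal sketch of how the hash lookup above could cover both cases follows. The helpers `lines_as_keys` and `status_of` are hypothetical names introduced here, and the two in-memory strings are stand-ins for fileA and fileB so that the sketch runs on its own:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): returns a hash whose
# keys are the chomped lines read from the given filehandle.
sub lines_as_keys {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        chomp $line;
        $seen{$line} = 1;
    }
    return %seen;
}

# Hypothetical helper: reports 'found' or 'notFound' for a single name.
sub status_of {
    my ($name, $seen) = @_;
    return exists $seen->{$name} ? 'found' : 'notFound';
}

# In-memory stand-ins for fileB and fileA (assumed sample data).
my $file_b = "alice\nbob\ncarol\n";
my $file_a = "bob\ndave\n";

open my $FB, '<', \$file_b or die "fileB: $!\n";
my %hash = lines_as_keys($FB);
close $FB;

open my $LST, '<', \$file_a or die "fileA: $!\n";
while (my $name = <$LST>) {
    chomp $name;
    # Exactly one line of output per name, whether or not it is in fileB.
    print "$name\t", status_of($name, \%hash), "\n";
}
close $LST;
```

Because fileB is read once into the hash, each name is reported exactly once, which avoids the repeated output described in the question.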

score 1 · Accepted Answer

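When you are comparing one list against another, you're interested in a hash. A hash is a keyed array whose entries have no inherent order. A hash can hold only a single instance of a given key (but different keys can have the same data).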
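
As a small illustration of those properties, here is a sketch using made-up names (the data is only an assumption for the example):

```perl
use strict;
use warnings;

# Illustrative data only: the same name appears twice in the list.
my @names = ('bob', 'alice', 'bob');

my %seen;
$seen{$_} = 1 for @names;

# Duplicate keys collapse into a single entry.
my $count = keys %seen;
print "$count distinct names\n";

# Different keys may hold the same value (both are 1 here).
print "same value\n" if $seen{bob} == $seen{alice};

# keys() returns the keys in no particular order, so sort when order matters.
print join(',', sort keys %seen), "\n";
```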

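What you can do is go through the first file and create a hash keyed by each line. Then you go through the second file and check whether any of its lines match a key in the hash: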

#! /usr/bin/env perl

use strict;
use warnings;
use feature qw(say);
use autodie;  #You don't have to check if "open" fails.

use constant {
    FIRST_FILE   => 'file1.txt',
    SECOND_FILE  => 'file2.txt',
};
open my $first_fh, "<", FIRST_FILE;

# Get each line as a hash key
my %line_hash;
while ( my $line = <$first_fh> ) {
    chomp $line;
    $line_hash{$line} = 1;
}
close $first_fh;

Now each line is a key in your hash %line_hash. The data really doesn't matter. The important part is the value of the key itself.

Now that I have a hash of the lines in the first file, I can read the second file and see whether each line exists in my hash:

open my $second_fh, "<", SECOND_FILE;
while ( my $line = <$second_fh> ) {
    chomp $line;
    if ( exists $line_hash{$line} ) {
        say qq(I found "$line" in both files);
    }
}
close $second_fh;

You can also use the map function:

#! /usr/bin/env perl

use strict;
use warnings;
use feature qw(say);
use autodie;  #You don't have to check if "open" fails.

use constant {
    FIRST_FILE   => 'file1.txt',
    SECOND_FILE  => 'file2.txt',
};
open my $first_fh, "<", FIRST_FILE;
chomp ( my @lines = <$first_fh> );

# Get each line as a hash key
my %line_hash = map { $_ => 1 } @lines;
close $first_fh;

open my $second_fh, "<", SECOND_FILE;
while ( my $line = <$second_fh> ) {
    chomp $line;
    if ( exists $line_hash{$line} ) {
        say qq(I found "$line" in both files);
    }
}
close $second_fh;

I'm not a big fan of map, because I don't find it that much more efficient, and it's harder to understand what's going on.

score 0 · Accepted Answer

I've just given untested code for the algorithm. But I think this will work for you.

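my @a;
my $matched;
my $line;
open(A, "<", "fileA") or die "fileA: $!\n";
open(B, "<", "fileB") or die "fileB: $!\n";
while(<A>)
{
    chomp;
    push @a,$_;
}
while(<B>)
{
    chomp;
    $line=$_;
    $matched=0;
    # Set the flag before leaving the loop; code after "last" never runs.
    for(@a){if($line=~/$_/){$matched=1;last}}
    if($matched)
    {
        # do something
    }
    else
    {
        # do something else
    }
}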
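
Note that the loop above treats each name as a regular expression, so it also matches substrings, and characters such as "." act as wildcards. If the names should be taken literally, one option is to quote them with \Q...\E. A minimal sketch, where `line_matches_any` is a hypothetical helper and the names are made-up sample data:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $line contains any of the names literally.
sub line_matches_any {
    my ($line, @names) = @_;
    for my $name (@names) {
        # \Q...\E quotes regex metacharacters, so "a.b" matches only a literal "a.b".
        return 1 if $line =~ /\Q$name\E/;
    }
    return 0;
}

# Assumed sample names, one of them containing a regex metacharacter.
my @names = ('a.b', 'bob');

# "bob" is found as a substring of "bobby".
print line_matches_any('bobby tables', @names) ? "match\n" : "no match\n";

# Without \Q...\E the pattern a.b would match "axb"; with quoting it does not.
print line_matches_any('axb', @names) ? "match\n" : "no match\n";
```

This is still a substring match. For whole-line equality, the hash lookup shown in the other answers is the simpler choice.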
