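I am writing a Perl script to analyze error messages and determine whether they are unique. Whether an error is unique depends on the line it occurs on. A typical error message might be: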

RT Warning: No condition matches in 'unique case' statement.
    "/user/foo/project", line 218, for ..

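Many of these error messages have several numbers in the string I grab. So what I want to do is capture the first number that appears after the word "line", and add it to an array only if that value is not already in the array. Here is what I have so far: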

my $path = RT Warning: No condition matches in 'unique case' statement.
    "/user/foo/project", line 218
$path =~ m/(\d+)/;
print("Error occurs on line $1\n"); 
if(grep(/^$1$/, @RTarray))
{
    print("Not unique.\n");
}
else
{
    push(@RTarray, $1); 
    print("Found a unique error!\n");
}

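Obviously I am not checking that the number comes after the keyword "line", because with the way I am handling the regex at the moment I am not sure how to do that. I also do not think I am adding elements to my array correctly. Please help!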

1 Answer

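You should use a hash for this. Hash keys are unique by construction, so you do not even have to check for duplicates yourself.

Here is an example: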

my %seen;

while (my $line = <$fh>) {

  if ($line =~ m/line (\d+)/) {
    my $ln = $1;
    if ( ! $seen{$ln}++ ) { 
      # this will check first and then increment. If it was encountered before,
      # it will already contain a true value, and thus the block will be skipped.
      # if it has not been encountered before, it will go into the block and...

      # do various operations on the line number
    }
  }

}

%seen now contains every line number that has an error, along with how many errors each one has:

use Data::Dumper;
print Dumper \%seen;

$VAR1 = {
  10 => 1,
  255 => 5,
  1337 => 1,
}

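This tells us that line 10 has one error and line 1337 has one error. By the definition in your code, those are unique. The five errors on line 255 are not unique, because that line shows up five times in the log.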
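To make the counting concrete, here is a minimal self-contained sketch of the same idea. The sub name count_error_lines and the sample log are made up for illustration; the log lines are modelled on the message in the question. It returns the counts and also the line numbers in the order they were first seen.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each line number appears in a log.
# Returns a hashref of counts and an arrayref of line numbers in first-seen order.
sub count_error_lines {
    my @log = @_;
    my (%seen, @first_seen);
    for my $entry (@log) {
        next unless $entry =~ m/line (\d+)/;       # skip entries without "line N"
        my $ln = $1;
        push @first_seen, $ln unless $seen{$ln}++; # record only the first sighting
    }
    return (\%seen, \@first_seen);
}

# Hypothetical sample log: line 255 appears three times, 10 and 1337 once each.
my @log = map {
    qq{RT Warning: No condition matches in 'unique case' statement. "/user/foo/project", line $_, for}
} (10, 255, 255, 1337, 255);

my ($seen, $order) = count_error_lines(@log);
print "line $_: $seen->{$_} occurrence(s)\n" for @$order;
```

A line number whose count is 1 is unique in the sense of the question; anything higher was reported more than once.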


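If you want to get rid of some of them, use delete to remove the whole key/value pair, $foo{$1}-- to decrement the count, or something like delete $foo{$1} unless --$foo{$1} to decrement and delete in a single line.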
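As a rough sketch of that decrement-and-delete idiom: the sub forget_one and the starting counts below are hypothetical. The exists check is my own addition, on the assumption that you do not want to touch keys that were never counted; without it, decrementing a missing key would create it with the value -1, which is true, so it would not be deleted.

```perl
use strict;
use warnings;

# Hypothetical counts, as %seen might look after scanning a log.
my %seen = (10 => 1, 255 => 5);

# Decrement the count for a line number and drop the key once it reaches zero.
sub forget_one {
    my ($counts, $ln) = @_;
    return unless exists $counts->{$ln};            # nothing recorded for this line
    delete $counts->{$ln} unless --$counts->{$ln};  # delete only when the count hits 0
    return;
}

forget_one(\%seen, 255);   # 255 goes from 5 to 4
forget_one(\%seen, 10);    # 10 goes from 1 to 0, so the key is removed

print join(", ", map { "$_ => $seen{$_}" } sort { $a <=> $b } keys %seen), "\n";
```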


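Edit: I had a look at your code. The only things actually missing are the word line in the regex and the quotes around the string. Did you actually try it? It works. :)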

my @RTarray;

while (my $line = <DATA>) {
  next unless $line =~ m/line (\d+)/; # skip lines that do not match, so $1 is never stale
  print("Error occurs on line $1\n"); 
  if( grep { $_ eq $1 } @RTarray ) { # this eq is the same as your regex, just faster
    print("Not unique.\n");
  } else {
    print "Found a unique error in line $1!\n";
    push @RTarray, $1; 
  }
}

__DATA__
RT Warning: No condition matches in 'unique case' statement. "/user/foo/project", line 218, for
RT Warning: No condition matches in 'unique case' statement. "/user/foo/project", line 3, for
RT Warning: No condition matches in 'unique case' statement. "/user/foo/project", line 44, for
RT Warning: No condition matches in 'unique case' statement. "/user/foo/project", line 218, for
RT Warning: No condition matches in 'unique case' statement. "/user/foo/project", line 7, for
RT Warning: No condition matches in 'unique case' statement. "/user/foo/project", line 7, for
RT Warning: No condition matches in 'unique case' statement. "/user/foo/project", line 7, for

This will print:

Error occurs on line 218
Found a unique error in line 218!
Error occurs on line 3
Found a unique error in line 3!
Error occurs on line 44
Found a unique error in line 44!
Error occurs on line 218
Not unique.
Error occurs on line 7
Found a unique error in line 7!
Error occurs on line 7
Not unique.
Error occurs on line 7
Not unique.

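I think that is correct. The data has 218 twice and 7 three times, and both were found.

All I did was replace your string, which was missing its quotes, with a filehandle loop so that I could test it on several lines. I also fixed the regex, which was missing the word line, although this particular error message does not even need it.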
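The word line starts to matter as soon as another number comes earlier in the message. Here is a small sketch with a made-up path that contains a digit (project2 is hypothetical, not from your log):

```perl
use strict;
use warnings;

# Hypothetical message whose path happens to contain a digit.
my $msg = q{RT Warning: No condition matches in 'unique case' statement. "/user/foo/project2", line 218, for};

my ($first_number) = $msg =~ m/(\d+)/;       # first run of digits anywhere in the string
my ($after_line)   = $msg =~ m/line (\d+)/;  # digits directly after the word "line"

print "unanchored: $first_number\n";   # picks up the 2 from the path
print "after line: $after_line\n";     # picks up 218
```

Capturing in list context like this also avoids relying on $1 after a match that may have failed.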

Answered 2013-05-22T16:23:59.097