
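I'm trying to extract specific pieces of information from a text file and write them to another file. Below is a firewall log; the only information that matters to me is the IP address and port after "inside/" and the IP address and port after "outside/".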

May 24 10:21:53 10.110.9.18 v3306 %FWSM-4-106100: access-list inside permitted tcp inside/10.110.27.5(53264) -> outside/172.23.240.2(1984) hit-cnt 1 (1-second interval) [0xee13216c, 0x0] 

May 24 10:21:53 10.110.9.18 v3306 %FWSM-4-106100: access-list inside permitted tcp inside/10.110.27.5(53265) -> outside/10.110.2.5(1984) hit-cnt 1 (1-second interval) [0xee13216c, 0x0] 

I basically want the output to end up looking like this:

10.110.27.5(53264) -> 172.23.240.2(1984)

It would also be great if there were a way to remove duplicates.


2 Answers

perl -nE'@r= /(?:inside|outside)\/(\S+)/g and say join" -> ", @r' file

Without duplicates:

perl -nE'@r= /(?:inside|outside)\/(\S+)/g and !$s{"@r"}++ and say join" -> ", @r' file

Or, written out over several lines:

perl -nE'
  @r= /(?:inside|outside)\/(\S+)/g;
  if (@r and !$s{"@r"}++) { say join" -> ", @r }
' file
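
For readers who find the one-liners dense, the same approach might be written as a standalone script along these lines. This is only a sketch: the helper name `extract_pairs` and the in-memory sample lines are assumptions made for illustration, and it deduplicates on the joined "src -> dst" string, which should be equivalent to keying on "@r" as above.

```perl
use strict;
use warnings;

# Hypothetical helper: takes log lines and returns the unique
# "src -> dst" strings in first-seen order.
sub extract_pairs {
    my @lines = @_;
    my (%seen, @pairs);
    for my $line (@lines) {
        # In list context, /g returns every capture on the line: the token
        # after "inside/" and the token after "outside/".
        my @r = $line =~ /(?:inside|outside)\/(\S+)/g;
        next unless @r;
        my $pair = join ' -> ', @r;
        # Post-increment is false only the first time a pair is seen.
        push @pairs, $pair unless $seen{$pair}++;
    }
    return @pairs;
}

# Sample data taken from the question; the first line is repeated on
# purpose to show the duplicate filter.
my @sample = (
    'May 24 10:21:53 10.110.9.18 v3306 %FWSM-4-106100: access-list inside permitted tcp inside/10.110.27.5(53264) -> outside/172.23.240.2(1984) hit-cnt 1 (1-second interval) [0xee13216c, 0x0]',
    'May 24 10:21:53 10.110.9.18 v3306 %FWSM-4-106100: access-list inside permitted tcp inside/10.110.27.5(53264) -> outside/172.23.240.2(1984) hit-cnt 1 (1-second interval) [0xee13216c, 0x0]',
    'May 24 10:21:53 10.110.9.18 v3306 %FWSM-4-106100: access-list inside permitted tcp inside/10.110.27.5(53265) -> outside/10.110.2.5(1984) hit-cnt 1 (1-second interval) [0xee13216c, 0x0]',
);

print "$_\n" for extract_pairs(@sample);
```

The bare word "inside" in "access-list inside" is skipped because the pattern requires the trailing slash.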
answered 2013-11-05T07:05:27.210

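I'm going to assume that inside and outside are both on the same line. You should be able to scan the file with a loop like this and pick out the matches: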

open my $fh, "<", $logfile or die "can't open $logfile for reading: $!\n";

my %seen;  # used for filtering dupes.

while (<$fh>)
{
    my $line = $_;

    if ($line =~ /inside\/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\([0-9]+\)).*outside\/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\([0-9]+\))/)
    {
        my $hit = "$1 -> $2";
        print $hit, "\n" if (++$seen{$hit} == 1);
    }
}
close $fh;

I think this should work.

It's entirely possible that the regex above is too specific. The code below is a bit more relaxed:

open my $fh, "<", $logfile or die "can't open $logfile for reading: $!\n";

my %seen;  # used for filtering dupes.

while (<$fh>)
{
    my $line = $_;

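    # Match 'inside/' with the slash, so the bare word 'inside' in
    # 'access-list inside' does not start the capture too early.
    if ($line =~ /(inside\/.*outside\/[^)]*\))/)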
    {
        my $hit = $1;
        $hit =~ s/(inside|outside)\///g;  # remove 'inside/' and 'outside/' from string.
        print $hit, "\n" if (++$seen{$hit} == 1);
    }
}
close $fh;
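
The question also asks for the results to be written to another file, which neither loop above does. A minimal sketch of that step could look like the following; the helper name `write_unique_pairs` and both filenames are hypothetical, and the pattern assumes the two endpoints are always separated by a literal " -> " as in the sample log lines.

```perl
use strict;
use warnings;

# Hypothetical helper: copies the unique "src -> dst" pairs found in
# $in_path to $out_path and returns how many pairs were written.
sub write_unique_pairs {
    my ($in_path, $out_path) = @_;

    open my $in,  '<', $in_path  or die "can't open $in_path for reading: $!\n";
    open my $out, '>', $out_path or die "can't open $out_path for writing: $!\n";

    my %seen;    # used for filtering dupes
    my $count = 0;
    while (my $line = <$in>) {
        # Assumes "inside/<addr>(<port>) -> outside/<addr>(<port>)" on one line.
        if ($line =~ /inside\/(\S+) -> outside\/(\S+)/) {
            my $hit = "$1 -> $2";
            next if $seen{$hit}++;
            print {$out} $hit, "\n";
            $count++;
        }
    }

    close $in;
    close $out or die "can't close $out_path: $!\n";
    return $count;
}

# Example call; both filenames are placeholders.
# write_unique_pairs('firewall.log', 'pairs.txt');
```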
answered 2013-11-05T07:09:59.900