Update, May 10, 2013
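OK, I can now filter out the IP addresses without any problem. Now come the next three things I want to do. I thought they could easily be done with sort($keys), but I was wrong, and the slightly more complex approach I tried below doesn't seem to be the solution either. The next thing I need to accomplish is collecting the dates and browser versions. Below is a sample of my log file format, followed by my current code.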
Apache log
24.235.131.196 - - [10/Mar/2004:00:57:48 -0500] "GET http://www.google.com/iframe.php HTTP/1.0" 500 414 "http://www.google.com/iframe.php" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
My code
#!/usr/bin/perl -w
use strict;
my %seen = ();
open(FILE, "< access_log") or die "unable to open file $!";
while( my $line = <FILE>) {
    chomp $line;
    # regex for ip address.
    if( $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/ ) {
        $seen{$1}++;
    }
    #regex for date an example is [09\Mar\2009:05:30:23]
    if( $line =~ /\[[\d]{2}\\.*[\d]{4}\:[\d]{2}\:[\d]{2}\]*/) {
        print "\n\n $line matched : $_\n";
    }
}
close FILE;
my $i = 0;
# program bugs out if I uncomment the below line,
# but to my understanding this is essentially what I'm trying to do.
# for my $key ( keys %seen ) (keys %date) {
for my $key ( keys %seen ) {
    my ($ip) = sort {$a cmp $b}($key);
    # also I'd like to be able to sort the IP addresses and if
    # I do it the proper numeric way it generates errors saying contents are not numeric.
    print @$ip->[$i] . "\n";
    # print "The IPv4 address is : $key and has accessed the server $seen{$key} times. \n";
    $i++;
}
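
For reference, here is a minimal sketch of one way the three goals (sorted IPs, dates, browser strings) might be approached. It is not a definitive fix. A few observations it builds on: the date pattern above looks for backslashes while the sample log line uses forward slashes; `$_` is never set because each line is read into `$line`; calling `sort` on a single `$key` inside the loop has nothing to order; and `<=>` on a whole dotted address such as `24.235.131.196` triggers the "isn't numeric" warning, so the sketch compares octet by octet instead. The helper names `parse_line`, `by_ip`, `date_key`, and `summarize` are hypothetical, the second and third sample lines are invented for illustration, and the full user-agent string stands in for "browser version".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Month abbreviations as they appear in Apache timestamps, mapped to numbers.
my %month_num;
@month_num{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (1 .. 12);

# Hypothetical helper: pull the IP, date and user agent out of one line.
# Returns an empty list when the line does not look like combined format.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ m{
        ^(\d{1,3}(?:\.\d{1,3}){3})       # client IPv4 address
        \s+\S+\s+\S+\s+                  # identity and user fields
        \[(\d{2}/\w{3}/\d{4})            # date part, such as 10/Mar/2004
        :\d{2}:\d{2}:\d{2}\s[+-]\d{4}\]  # time of day and zone offset
        .*                               # request, status, size, referer
        "([^"]*)"\s*\z                   # last quoted field is the user agent
    }x;
    return ($1, $2, $3);
}

# Hypothetical helper: numeric, octet-by-octet comparison for sort().
sub by_ip {
    my @x = split /\./, $a;
    my @y = split /\./, $b;
    return $x[0] <=> $y[0] || $x[1] <=> $y[1]
        || $x[2] <=> $y[2] || $x[3] <=> $y[3];
}

# Hypothetical helper: turn 10/Mar/2004 into 20040310 so that a plain
# string comparison orders dates chronologically.
sub date_key {
    my ($day, $mon, $year) = split m{/}, $_[0];
    return sprintf '%04d%02d%02d', $year, $month_num{$mon} || 0, $day;
}

# Tally a list of log lines; returns three hash references of counts.
sub summarize {
    my (%seen, %date, %agent);
    for my $line (@_) {
        my ($ip, $day, $ua) = parse_line($line);
        next unless defined $ip;    # skip lines that did not match
        $seen{$ip}++;
        $date{$day}++;
        $agent{$ua}++;
    }
    return (\%seen, \%date, \%agent);
}

# The first line is the sample from the post; the other two are invented
# for illustration only.
my @lines = (
    q{24.235.131.196 - - [10/Mar/2004:00:57:48 -0500] "GET http://www.google.com/iframe.php HTTP/1.0" 500 414 "http://www.google.com/iframe.php" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"},
    q{9.10.11.12 - - [09/Mar/2004:23:10:02 -0500] "GET /index.html HTTP/1.0" 200 1024 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"},
    q{24.235.131.196 - - [10/Mar/2004:01:02:03 -0500] "GET /index.html HTTP/1.0" 200 1024 "-" "Mozilla/5.0 (X11; Linux i686)"},
);

my ($seen, $date, $agent) = summarize(@lines);

# IP addresses in numeric order, with hit counts.
for my $ip (sort by_ip keys %$seen) {
    print "$ip accessed the server $seen->{$ip} times\n";
}

# Dates in chronological order.
for my $day (sort { date_key($a) cmp date_key($b) } keys %$date) {
    print "$day: $date->{$day} requests\n";
}

# User-agent strings in plain alphabetical order.
for my $ua (sort keys %$agent) {
    print "$ua: $agent->{$ua} requests\n";
}
```

To run this against a real file, the `@lines` array could presumably be replaced by the lines read from `access_log`. Keeping one hash per field also avoids the `for my $key ( keys %seen ) (keys %date)` construct, which is not valid Perl; each hash gets its own loop.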