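I have access logs like the ones below. I want to grab the IP address from every line, then sort them to find the one that appears most often.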

173.192.238.41 - - [28/Feb/2013:07:06:09 -0500] "GET / HTTP/1.1" 200 20644 "-" "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.19; aggregator:Spinn3r (Spinn3r 3.1); http://spinn3r.com/robot) Gecko/2010040121 Firefox/3.0.19"
208.115.113.84 - - [28/Feb/2013:07:06:19 -0500] "GET /tag/bright HTTP/1.1" 404 327 "-" "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
94.228.34.214 - - [28/Feb/2013:07:10:16 -0500] "GET /alli-comes-home-12-10-09-day-224-2264/feed HTTP/1.1" 404 359 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
209.171.42.71 - - [28/Feb/2013:07:11:19 -0500] "GET /feed/atom HTTP/1.1" 404 326 "-" "Mozilla/5.0 (compatible; BlogScope/1.0; +http://www.blogscope.net/; U of Toronto)"
94.228.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"

Can I sort them like this?

cat /path/to/access.log | awk '{print $1}' | sort | uniq -c

3 Answers

You're close. After counting, you still have to sort by the count:

awk '{print $1}' /path/to/access.log | sort | uniq -c | sort -n
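
Since the question asks for the single most frequent address, the same pipeline can be reversed and cut down to its first line. This is only a minimal sketch: it assumes the IP address is the first whitespace-separated field, and the function name `top_ip` is made up for illustration.

```shell
# Hypothetical helper: print the count and address of the busiest client.
# Assumes the IP address is the first whitespace-separated field of each line.
top_ip() {
    awk '{print $1}' "$1" |   # keep only the address column
        sort |                # group identical addresses together
        uniq -c |             # prefix each address with its count
        sort -rn |            # highest count first
        head -n 1             # keep only the top line
}
```

Call it as `top_ip /path/to/access.log`. The count is padded with leading blanks by `uniq -c`, and if several addresses share the highest count only one of them is shown.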

You can also do the counting in awk itself instead of using sort and uniq:

awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' /path/to/access.log | sort -n
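
If only the top address is needed, the final sort can probably be skipped by tracking the maximum inside the END block. A rough sketch along those lines, where the function name `top_ip_awk` and the variables `max` and `top` are my own and not part of the answer above:

```shell
# Hypothetical helper: find the most frequent first field using awk alone.
# If several addresses tie for the top count, which one is printed is unspecified.
top_ip_awk() {
    awk '
        { count[$1]++ }                 # tally each address
        END {
            for (ip in count)           # scan the tallies once
                if (count[ip] > max) {
                    max = count[ip]
                    top = ip
                }
            if (max) print max, top     # print nothing for an empty log
        }
    ' "$1"
}
```

Used as `top_ip_awk /path/to/access.log`, this prints one line of the form `count address`.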
Answered 2013-03-01T03:38:14.247

Here is one way to sort IPv4 addresses by number of occurrences and then by address:

# cut takes only the first column from access.log
<access.log cut -d' ' -f1 |

# Presort the IP addresses so uniq can count them 
sort    |
uniq -c |

# Strip leading blanks and replace the space after the count with `.' so every field is dot-delimited
sed 's/^ *//; s/ /./' |

# Now sort numerically based on each consecutive dot delimited column
sort -t. -k1,1n -k2,2n -k3,3n -k4,4n -k5,5n | 

# Reset the first delimiter (between count and address) to a space
sed 's/\./ /'

Test input:

cat << EOF > access.log
173.192.238.41 - - [28/Feb/2013:07:06:09 -0500] "GET / HTTP/1.1" 200 20644 "-" "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.19; aggregator:Spinn3r (Spinn3r 3.1); http://spinn3r.com/robot) Gecko/2010040121 Firefox/3.0.19"
208.115.113.84 - - [28/Feb/2013:07:06:19 -0500] "GET /tag/bright HTTP/1.1" 404 327 "-" "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
94.228.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
94.228.34.214 - - [28/Feb/2013:07:10:16 -0500] "GET /alli-comes-home-12-10-09-day-224-2264/feed HTTP/1.1" 404 359 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
209.171.42.71 - - [28/Feb/2013:07:11:19 -0500] "GET /feed/atom HTTP/1.1" 404 326 "-" "Mozilla/5.0 (compatible; BlogScope/1.0; +http://www.blogscope.net/; U of Toronto)"
209.71.42.71 - - [28/Feb/2013:07:11:19 -0500] "GET /feed/atom HTTP/1.1" 404 326 "-" "Mozilla/5.0 (compatible; BlogScope/1.0; +http://www.blogscope.net/; U of Toronto)"
94.228.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
94.229.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
94.227.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
EOF

Output:

1 94.227.34.229
1 94.228.34.214
1 94.229.34.229
1 173.192.238.41
1 208.115.113.84
1 209.71.42.71
1 209.171.42.71
2 94.228.34.229
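
As a possible shortcut, the two sed steps can likely be dropped where sort supports the `-V` (version sort) key option, which compares runs of digits numerically. The sketch below assumes GNU sort, since `-V` is an extension rather than POSIX, and the function name `count_by_address` is hypothetical:

```shell
# Hypothetical helper: count addresses, then order by count and by address.
# Relies on the -V (version sort) key option, an extension found in GNU sort.
count_by_address() {
    cut -d' ' -f1 "$1" |
        sort |
        uniq -c |
        awk '{print $1, $2}' |      # normalize to "count address"
        sort -t' ' -k1,1n -k2,2V    # count first, then dotted numbers
}
```

Run as `count_by_address access.log`. On systems whose sort lacks `-V`, the dot-delimited approach above remains the portable choice.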
Answered 2013-03-01T08:01:01.087
awk '{a[$1]++}END{for(i in a)print a[i],i}' your_log|sort -rn

Or:

perl -lane '$x{$F[0]}++;END{for(keys %x){print $x{$_}." ".$_;}}' your_log|sort -rn
Answered 2013-03-01T07:11:49.687