I'm trying to grep the lines for a specific domain from the Apache2 access.log. My access.log contains entries for all my virtual hosts and their different domains.

cat /var/log/access.log:

www.something-else-domain.si:80 193.77.xxx. xxx - - [06/Nov/2013:12:21:45 +0100] "GET /path/to/dir/image.jpg HTTP/1.1" 304 - "www.something-else-domain.si/index.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0"

www.domain.si:80 193.77.xxx. xxx - - [06/Nov/2013:12:21:45 +0100] "GET /path/to/dir/image. jpg HTTP/1.1" 304 - "www.domain.si/index.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0"

domain.si:80 193.77.xxx. xxx - - [06/Nov/2013:12:21:45 +0100] "GET /path/to/dir/image. jpg HTTP/1.1" 304 - "www.domain.si/index.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0"

I want to match only domain.si, www.domain.si and whatever.domain.si, but not something-else-domain.si. How can I do that? Thanks for the help.

2 Answers

2
egrep '^([^ ]*\.)?domain\.si' /var/log/access.log

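Breaking this down:

  • ^ is the start of the line.
  • (xxx)? means "match xxx or nothing"; in this case it matches either:
    • nothing, which is the bare-domain case (domain.si), or
    • [^ ]*\., any run of non-space characters followed by a dot. This matches the optional www. or whatever. part.
  • domain\.si simply matches the domain.si part.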
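The breakdown above can be checked against a few shortened sample lines. This is only a sketch: the lines are trimmed versions of the question's log entries, the variable names sample and matched are made up for illustration, and grep -E is used as the equivalent of egrep.

```shell
# Shortened sample lines modelled on the question's log (only the host field matters here).
sample='www.something-else-domain.si:80 GET /a.jpg
www.domain.si:80 GET /a.jpg
domain.si:80 GET /a.jpg
whatever.domain.si:80 GET /a.jpg'

# grep -E is equivalent to egrep; the pattern is the one from the answer.
matched=$(printf '%s\n' "$sample" | grep -E '^([^ ]*\.)?domain\.si')

# Prints the www., bare and whatever. lines, but not something-else-domain.si.
printf '%s\n' "$matched"
```

The something-else-domain.si line is skipped because the optional prefix has to end in a dot, and "something-else-" does not.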

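The anchoring with ^, together with the "no spaces" bit, ensures that you only match things at the beginning of the line (and not in the request, like GET /domain.si).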
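To see what the anchor buys you, here is a small sketch that counts matches with and without ^. The log line is hypothetical (the host other.example and the address 192.0.2.1 are placeholders, not taken from the question), chosen so that domain.si appears only in the request path.

```shell
# Hypothetical line: a different virtual host whose request path mentions domain.si.
line='other.example:80 192.0.2.1 - - "GET /domain.si HTTP/1.1" 200 -'

# Without ^ the pattern also matches inside the request path.
# "|| true" keeps the script going when grep finds nothing (exit status 1).
unanchored=$(printf '%s\n' "$line" | grep -cE '([^ ]*\.)?domain\.si' || true)

# With ^ only the virtual-host field at the start of the line can match.
anchored=$(printf '%s\n' "$line" | grep -cE '^([^ ]*\.)?domain\.si' || true)

echo "unanchored=$unanchored anchored=$anchored"
```

Because [^ ]* cannot cross a space, the anchored pattern never reaches past the first field of the line.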

answered 2013-11-07T19:40:58.363
0

A GNU awk solution

awk  '/www.domain$|domanin$/ {print $NF RS}' RS=".si"
www.domain.si
"www.domain.si
"www.domain.si

There is a problem with your example: spaces are not allowed in a URL.

answered 2013-11-07T19:25:06.323