I have Tomcat access logs with entries like this:

50.47.142.25 - - [07/May/2012:00:00:14 +0000] 0 "GET /mywebpage/blah.jsp " 200 123 "-" "-"

I want to load all of the entries into a SQL table and then run SQL queries against it.

I am thinking of using GAWK (GNU AWK) to turn every line into CSV. Something like:

gawk '{print $1 ", " $2 ", " $3 ", " $4 ", " $5 ", " $6 ", " $7 ", " $8 ", " $9 ", " $9}'

which gives me

50.47.142.25, -, -, [11/May/2012:08:51:02, 0, "GET /mywebpage/blah.jsp" 200, 123, -, -

That gets me close to a SQL INSERT statement, except that I need the datetime in this format:

2012-05-11 08:51:02

i.e. without the leading square bracket and in the format SQL Server expects. Any hints?

1 Answer

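#!/usr/bin/awk -f
BEGIN {
    # Map month abbreviations to month numbers (Jan -> 1 ... Dec -> 12).
    monthlist = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"
    c = split(monthlist, monthsarr)
    for (i = 1; i <= c; i++) {
        months[monthsarr[i]] = i
    }
    # Indices of the pieces to keep after each line has been split.
    fieldlist = "1 2 3 5 8 10 11 14 15 17 20"
    fieldcount = split(fieldlist, fields)
    OFS = ","
}

{
    delim = ""
    # Split on spaces, square brackets and double quotes.
    c = split($0, logarr, /[ \[\]"]/)
    # logarr[5] looks like 07/May/2012:00:00:14; break it into its parts.
    split(logarr[5], datearr, /[/:]/)
    # mktime() takes "YYYY MM DD HH MM SS" and returns a timestamp.
    ts = mktime(datearr[3] " " months[datearr[2]] " " datearr[1] " " datearr[4] " " datearr[5] " " datearr[6])
    logarr[5] = strftime("%F %T", ts)
    # Print only the selected pieces, separated by OFS.
    for (f = 1; f <= fieldcount; f++) {
        printf "%s%s", delim, logarr[fields[f]]
        delim = OFS
    }
    printf "\n"
}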

Given your sample log entry, the output looks like this:

50.47.142.25,-,-,2012-05-07 00:00:14,0,GET,/mywebpage/blah.jsp,200,123,-,-

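The quotes and square brackets are discarded because they are used, along with the space, as field separators. Splitting this way also produces a lot of empty, spurious fields, which is why I iterate over an explicit list of field indices.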
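To see where the indices in the field list come from, it can help to print every piece that the split produces. The following is only a sketch: `show_pieces` is a hypothetical helper name, and the separator is written as the string "[][ \"]", which I am assuming is an equivalent, more portable spelling of the same character class (`]`, `[`, space, double quote).

```shell
# Hypothetical helper: print each piece of the split as "index=value".
show_pieces() {
    printf '%s\n' "$1" | awk '{
        # Separator class: ], [, space, or double quote (one character each).
        n = split($0, piece, "[][ \"]")
        for (i = 1; i <= n; i++) print i "=" piece[i]
    }'
}

show_pieces '50.47.142.25 - - [07/May/2012:00:00:14 +0000] 0 "GET /mywebpage/blah.jsp " 200 123 "-" "-"'
```

Wherever two separators sit next to each other (for example the space followed by `[`), the piece between them is empty, so the useful values end up at scattered indices rather than consecutive ones.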

Note that the mktime() and strftime() functions are specific to GNU AWK (gawk).
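If gawk is not available, the date can probably be reformatted without those two functions, simply by rearranging the pieces of the timestamp. This is a minimal sketch, not part of the script above: `to_sql_datetime` is a hypothetical helper name, and like the script it ignores the `+0000` offset, so no timezone conversion takes place.

```shell
# Hypothetical helper: "[07/May/2012:00:00:14" -> "2012-05-07 00:00:14"
to_sql_datetime() {
    printf '%s\n' "$1" | awk '
    BEGIN {
        # Month abbreviation -> month number lookup table.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = i
    }
    {
        sub(/^\[/, "")              # drop the leading square bracket, if any
        split($0, d, "[/:]")        # d[1]=day d[2]=month d[3]=year d[4..6]=h,m,s
        printf "%s-%02d-%s %s:%s:%s\n", d[3], month[d[2]], d[1], d[4], d[5], d[6]
    }'
}

to_sql_datetime '[07/May/2012:00:00:14'
```

Because this only moves text around, it never consults the local timezone, which may or may not be what you want depending on how the SQL column is defined.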

Answered 2012-05-15T13:53:32.593