1

TL:DR - 我的有效 JSON 日志被 Logstash 拒绝,并抱怨 JSON 由于某些转义字符而无效。不幸的是,我无法弄清楚问题所在。

完整版本:

我的 Logstash(+Logstash-Forwarder) 从配置如下的自定义日志格式中获取 Apache 日志:

LogFormat "{ \
    \"@timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
    \"@version\": \"1\", \
    \"vhost\":\"%V\", \
    \"tags\":[\"apache-json\"], \
    \"message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", \
    \"clientip\": \"%a\", \
    \"duration\": %D, \
    \"status\": %>s, \
    \"request\": \"%U%q\", \
    \"urlpath\": \"%U\", \
    \"urlquery\": \"%q\", \
    \"bytes\": %B, \
    \"method\": \"%m\", \
    \"referer\": \"%{Referer}i\", \
    \"useragent\": \"%{User-agent}i\" \
}" ls_apache_json

相关的 Logstash 输入/过滤器配置非常简单:

input {
    lumberjack {
        port => 5000
        type => "logs"
    }
}
filter {
    if [type] =~ /-json$/ {
        json {
            source => "message"
        }
    }
}

而且似乎有些日志无法解析为 JSON - 我收到很多这样的错误:

{
   :timestamp=>"2015-08-27T12:47:05.165000+0200",
   :message=>"Trouble parsing json",
   :source=>"message",
   :raw=>"{ \t\"@timestamp\": \"2015-08-27T12:47:02+0200\", \t\"@version\": \"1\", \t\"vhost\":\"www.example.org\", \t\"tags\":[\"apache-json\"], \"clientip\": \"127.0.0.1\", \t\"duration\": 1280, \t\"status\": 200, \t\"request\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlpath\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlquery\": \"\", \t\"bytes\": 2913, \t\"method\": \"GET\", \t\"referer\": \"http://www.example.org/file.html\", \t\"useragent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36\"    }",
   :exception=>#<LogStash::Json::ParserError: Unrecognized character escape 'x' (code 120) at [Source: [B@29a15d70; line: 1, column: 231]>
   :level=>:warn
}

我用一个简单的 Ruby 代码片段仔细检查了原始消息,它能够解析 JSON:

# irb
$ require 'json'
$ s= "{ \t\"@timestamp\": \"2015-08-27T12:47:02+0200\", \t\"@version\": \"1\", \t\"vhost\":\"www.example.org\", \t\"tags\":[\"apache-json\"], \"clientip\": \"127.0.0.1\", \t\"duration\": 1280, \t\"status\": 200, \t\"request\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlpath\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlquery\": \"\", \t\"bytes\": 2913, \t\"method\": \"GET\", \t\"referer\": \"http://www.example.org/file.html\", \t\"useragent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36\"    }"
$ JSON.parse(s)
=> {"@timestamp"=>"2015-08-27T12:47:02+0200", "@version"=>"1", "vhost"=>"www.example.org", "tags"=>["apache-json"], "clientip"=>"127.0.0.1", "duration"=>1280, "status"=>200, "request"=>"/uploads/_processed_/csm_Dxc3xa4mpfungswanne_3_01_0c75c517e4.jpg", "urlpath"=>"/uploads/_processed_/csm_Dxc3xa4mpfungswanne_3_01_0c75c517e4.jpg", "urlquery"=>"", "bytes"=>2913, "method"=>"GET", "referer"=>"http://www.example.org/file.html", "useragent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"}

我正在运行 Logstash 1.5.2。我认为这与编解码器有关,因为我尝试codec为 json 解析器设置参数并没有阻止问题。除了确保它使用正确的编解码器之外,Apache 还需要什么 - 我似乎找不到任何配置选项:(

欢迎任何帮助。

4

1 回答 1

0
#replace " and \
ruby {
 code => 'str=event["request_body"];   str=str.gsub("\\x22","\"").gsub("\\x5C", "\\"); event["request_body"]=str;'
}
于 2016-01-18T05:54:14.700 回答