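TL;DR: Logstash rejects my valid JSON logs and complains that the JSON is invalid because of some escape character. Unfortunately, I can't figure out what the problem is.
Full version:
My Logstash (+ Logstash-Forwarder) setup picks up Apache logs that are written in a custom log format, configured as follows: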
LogFormat "{ \
\"@timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
\"@version\": \"1\", \
\"vhost\":\"%V\", \
\"tags\":[\"apache-json\"], \
\"message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", \
\"clientip\": \"%a\", \
\"duration\": %D, \
\"status\": %>s, \
\"request\": \"%U%q\", \
\"urlpath\": \"%U\", \
\"urlquery\": \"%q\", \
\"bytes\": %B, \
\"method\": \"%m\", \
\"referer\": \"%{Referer}i\", \
\"useragent\": \"%{User-agent}i\" \
}" ls_apache_json
The relevant Logstash input/filter configuration is very simple:
input {
  lumberjack {
    port => 5000
    type => "logs"
  }
}
filter {
  if [type] =~ /-json$/ {
    json {
      source => "message"
    }
  }
}
It seems that some log lines cannot be parsed as JSON. I get a lot of errors like this:
{
:timestamp=>"2015-08-27T12:47:05.165000+0200",
:message=>"Trouble parsing json",
:source=>"message",
:raw=>"{ \t\"@timestamp\": \"2015-08-27T12:47:02+0200\", \t\"@version\": \"1\", \t\"vhost\":\"www.example.org\", \t\"tags\":[\"apache-json\"], \"clientip\": \"127.0.0.1\", \t\"duration\": 1280, \t\"status\": 200, \t\"request\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlpath\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlquery\": \"\", \t\"bytes\": 2913, \t\"method\": \"GET\", \t\"referer\": \"http://www.example.org/file.html\", \t\"useragent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36\" }",
:exception=>#<LogStash::Json::ParserError: Unrecognized character escape 'x' (code 120) at [Source: [B@29a15d70; line: 1, column: 231]>
:level=>:warn
}
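The exception complains about an escape character, so it may help to list which escapes a raw line actually contains. JSON (RFC 8259) only allows \", \\, \/, \b, \f, \n, \r, \t and \uXXXX inside strings. Below is a minimal Ruby sketch that flags anything else; the helper name invalid_json_escapes and the shortened sample line are made up for illustration and are not part of my setup.

```ruby
# Hypothetical helper: returns every backslash escape in `line` that the JSON
# grammar does not allow. Escapes are scanned left to right, so an escaped
# backslash (\\) is consumed as one unit and never misread.
def invalid_json_escapes(line)
  line.scan(/\\(?:u\h{4}|.)/m).reject do |esc|
    esc.match?(/\A\\(?:["\\\/bfnrt]|u\h{4})\z/)
  end
end

# Shortened sample modelled on the :raw value in the error above.
# Single quotes keep the backslashes literal.
raw = '{ "request": "/uploads/csm_D\xc3\xa4mpfungswanne.jpg", "status": 200 }'

p invalid_json_escapes(raw)
```

For the sample line this reports the two \x escapes, which matches the 'x' (code 120) named in the exception.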
I double-checked the raw message with a simple Ruby snippet, and it parses the JSON without complaint:
# irb
$ require 'json'
$ s= "{ \t\"@timestamp\": \"2015-08-27T12:47:02+0200\", \t\"@version\": \"1\", \t\"vhost\":\"www.example.org\", \t\"tags\":[\"apache-json\"], \"clientip\": \"127.0.0.1\", \t\"duration\": 1280, \t\"status\": 200, \t\"request\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlpath\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlquery\": \"\", \t\"bytes\": 2913, \t\"method\": \"GET\", \t\"referer\": \"http://www.example.org/file.html\", \t\"useragent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36\" }"
$ JSON.parse(s)
=> {"@timestamp"=>"2015-08-27T12:47:02+0200", "@version"=>"1", "vhost"=>"www.example.org", "tags"=>["apache-json"], "clientip"=>"127.0.0.1", "duration"=>1280, "status"=>200, "request"=>"/uploads/_processed_/csm_Dxc3xa4mpfungswanne_3_01_0c75c517e4.jpg", "urlpath"=>"/uploads/_processed_/csm_Dxc3xa4mpfungswanne_3_01_0c75c517e4.jpg", "urlquery"=>"", "bytes"=>2913, "method"=>"GET", "referer"=>"http://www.example.org/file.html", "useragent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"}
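Note that in the output above the backslashes have silently disappeared ("Dxc3xa4mpfungswanne"), so this parser accepted the line without actually decoding the escapes. One possible workaround, sketched here under the assumption that the \xHH sequences are hex-escaped UTF-8 bytes written by Apache, is to turn them back into bytes before parsing. The helper name decode_hex_escapes and the shortened sample are hypothetical; this is an illustration, not a tested Logstash fix.

```ruby
require 'json'

# Hypothetical helper: rewrites each \xHH sequence so the line becomes valid
# JSON. Bytes >= 0x80 are emitted raw (assumed to be UTF-8), ASCII bytes are
# emitted as \u00HH so quotes and control characters stay escaped.
# Escaped backslashes (\\) are matched first and left untouched.
def decode_hex_escapes(line)
  line.b.gsub(/\\\\|\\x(\h\h)/) do
    next $& unless $1
    byte = $1.hex
    byte < 0x80 ? format('\u%04x', byte) : byte.chr
  end.force_encoding(Encoding::UTF_8)
end

# Shortened sample modelled on the raw message; single quotes keep the
# backslashes literal.
raw = '{ "request": "/uploads/csm_D\xc3\xa4mpfungswanne.jpg", "status": 200 }'

event = JSON.parse(decode_hex_escapes(raw))
puts event['request']
```

With the sample line, the request path should come out with the ä restored. If the bytes are not valid UTF-8, the parse may still fail, so this only covers the case shown here.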
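I'm running Logstash 1.5.2. I suspect this is codec-related, although setting the codec parameter for the json parser did not make the problem go away. What else is needed on the Apache side, apart from making sure it uses the right codec? I can't seem to find any configuration option for that :(
Any help is appreciated.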