I have JSON of the following form:

[
    {
        "foo":"bar"
    }
]

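I am trying to parse it with the json filter in Logstash, but it doesn't seem to work. As far as I can tell, the json filter cannot parse JSON whose top level is a list. Can someone suggest a workaround?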

Update

My log:

IP - - 0.000 0.000 [24/May/2015:06:51:13 +0000] *"POST /c.gif HTTP/1.1"* 200 4 * user_id=UserID&package_name=SomePackageName&model=Titanium+S202&country_code=in&android_id=AndroidID&eT=1432450271859&eTz=GMT%2B05%3A30&events=%5B%7B%22eV%22%3A%22com.olx.southasia%22%2C%22eC%22%3A%22appUpdate%22%2C%22eA%22%3A%22app_activated%22%2C%22eTz%22%3A%22GMT%2B05%3A30%22%2C%22eT%22%3A%221432386324909%22%2C%22eL%22%3A%22packageName%22%7D%5D * "-" "-" "-"

The URL-decoded version of the above log is:

IP - - 0.000 0.000 [24/May/2015:06:51:13  0000] *"POST /c.gif HTTP/1.1"* 200 4 * user_id=UserID&package_name=SomePackageName&model=Titanium S202&country_code=in&android_id=AndroidID&eT=1432450271859&eTz=GMT+05:30&events=[{"eV":"com.olx.southasia","eC":"appUpdate","eA":"app_activated","eTz":"GMT+05:30","eT":"1432386324909","eL":"packageName"}] * "-" "-" "-"

Please find my config file for the above log below:

filter {

urldecode{
    field => "message"
}
 grok {
  match => ["message",'%{IP:clientip}%{GREEDYDATA} \[%{GREEDYDATA:timestamp}\] \*"%{WORD:method}%{GREEDYDATA}']
}

kv {
    field_split => "&? "
}
json{
    source=> "events"
}
geoip {
    source => "clientip"
}

}

I need to parse the events field, i.e. events=[{"eV":"com.olx.southasia","eC":"appUpdate","eA":"app_activated","eTz":"GMT+05:30","eT":"1432386324909","eL":"packageName"}]

1 Answer

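I assume you have your JSON in a file. You are right, you cannot use the json filter directly. You have to use the multiline codec first and then the json filter.

The following config works for your given input. However, you may have to change it to separate your events properly. That depends on your needs and on the JSON format of your file.

Logstash config: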

input     {   
    file     {
        codec => multiline
        {
            pattern => "^\]" # Change to separate events
            negate => true
            what => "previous"
        }
        path => ["/absolute/path/to/your/json/file"]
        start_position => "beginning"
        sincedb_path => "/dev/null" # This is just for testing
    }
}

filter     {
    mutate   {
            gsub => [ "message","\[",""]
            gsub => [ "message","\n",""]
        }
    json { source => "message" }
}

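Update

After your update I think I've found the problem. Apparently you get a jsonparsefailure because of the square brackets. As a workaround you can remove them manually. Add the following mutate filter after your kv filter and before your json filter: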

mutate  {
    gsub => [ "events","\]",""]
    gsub => [ "events","\[",""]
}
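To see what this workaround does to the field value, here is a minimal plain-Ruby sketch run outside Logstash. The shortened events string and the variable names are my own illustration, and a Ruby regex stands in for the two gsub rules.

```ruby
require 'json'

# Value of the "events" field after urldecode + kv, shortened for readability.
events = '[{"eC":"appUpdate","eA":"app_activated"}]'

# Stand-in for the two gsub rules: drop every "[" and "]".
stripped = events.gsub(/[\[\]]/, '')
single = JSON.parse(stripped)
puts single.inspect

# With two objects in the list, the stripped string is no longer valid JSON.
two = '[{"foo":"bar"},{"foo":"bar1"}]'.gsub(/[\[\]]/, '')
begin
  JSON.parse(two)
  two_ok = true
rescue JSON::ParserError
  two_ok = false
end
puts two_ok   # false
```

So the workaround only fits lists with exactly one object. Note also that this gsub removes brackets inside string values as well.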

Update 2

Alright, suppose your input looks like this:

[{"foo":"bar"},{"foo":"bar1"}]

Here are five options:

Option a) ugly gsub

An ugly workaround would be another gsub:

gsub => [ "event","\},\{",","]

But this would remove the inner relations (which values belong to which object), so I guess you don't want to do that.
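A small plain-Ruby sketch of that effect, assuming the square brackets have already been stripped (the variable name is mine):

```ruby
# The list from above with the square brackets already removed.
stripped = '{"foo":"bar"},{"foo":"bar1"}'

# Same pattern as the gsub rule: join neighbouring objects into one.
merged = stripped.gsub(/\},\{/, ',')
puts merged   # {"foo":"bar","foo":"bar1"}
```

The result is a single object in which "foo" appears twice. JSON parsers typically keep only one value per key, so one of the two entries would be lost.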

Option b) split

A better approach might be to use the split filter:

split {
    field => "event"
    terminator => ","
}
mutate  {
    gsub => [ "event","\]",""]
    gsub => [ "event","\[",""]
   }
json{
    source=> "event"
}

This will produce multiple events. (The first with foo = bar and the second with foo = bar1.)
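One caveat, sketched in plain Ruby as an analogy rather than the split filter itself: splitting on "," only works while every object has a single key. The helper name split_and_parse and the shortened events string are my own.

```ruby
require 'json'

# Hypothetical helper: split on ",", strip the brackets, parse each piece.
def split_and_parse(raw)
  raw.split(',').map do |piece|
    JSON.parse(piece.gsub(/[\[\]]/, ''))
  end
end

simple = split_and_parse('[{"foo":"bar"},{"foo":"bar1"}]')
puts simple.inspect

# An object with more than one key is cut in the middle by the split.
begin
  split_and_parse('[{"eC":"appUpdate","eA":"app_activated"}]')
  multi_key_ok = true
rescue JSON::ParserError
  multi_key_ok = false
end
puts multi_key_ok   # false
```

The events field in the question has several keys per object, so in that case a comma-based split would probably need a more specific terminator, or option e below.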

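Option c) mutate split

You might want to have all the values in one Logstash event. You could use the mutate filter's split option to generate an array and then parse the JSON of each entry if it exists. Unfortunately, you have to write a conditional for every entry, because Logstash doesn't support loops in its config.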

mutate  {
    gsub => [ "event","\]",""]
    gsub => [ "event","\[",""]
    split => [ "event", "," ]
   }

json{
    source=> "event[0]"
    target => "result[0]"
}

if 'event[1]' {
    json{
        source=> "event[1]"
        target => "result[1]"
    }
    if 'event[2]' {
        json{
            source=> "event[2]"
            target => "result[2]"
        }
    }
    # You would have to specify more conditionals if you expect even more dictionaries
}

Option d) Ruby

Following your comment, I tried to find a Ruby way. The following works (after your kv filter):

mutate  {
    gsub => [ "event","\]",""]
    gsub => [ "event","\[",""]
}

ruby  {
    init => "require 'json'"
    code => "
        e = event['event'].split(',')
        ary = Array.new
        e.each do |x|
            hash = JSON.parse(x)
            hash.each do |key, value|
                ary.push( { key =>  value } )
            end
        end
        event['result'] = ary
    "
}

Option e) Ruby

Use this approach after your kv filter (without the mutate filter):

ruby  {
    init => "require 'json'"
    code => "
            event['result'] = JSON.parse(event['event'])
    "
}
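The same call can be tried outside Logstash. This is only a sketch: parse_list is a hypothetical helper of mine, and wrapping a single object in an array and returning nil on bad input are my own choices, not part of the filter above.

```ruby
require 'json'

# Hypothetical helper: parse a JSON string that is expected to hold a list.
def parse_list(raw)
  parsed = JSON.parse(raw)
  # A lone object is wrapped so callers always get an array.
  parsed.is_a?(Array) ? parsed : [parsed]
rescue JSON::ParserError
  nil   # signal invalid JSON instead of raising
end

raw = '[{"name":"Alex","address":"NewYork"},{"name":"David","address":"NewJersey"}]'
result = parse_list(raw)
puts result.length        # 2
puts result[1]['name']    # David
```

Note that the event['field'] syntax belongs to the Logstash versions of that time. As far as I know, Logstash 5 and later use event.get('field') and event.set('field', value) instead.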

This filter parses an event field such as event=[{"name":"Alex","address":"NewYork"},{"name":"David","address":"NewJersey"}]

into:

"result" => [
    [0] {
           "name" => "Alex",
        "address" => "NewYork"
    },
    [1] {
           "name" => "David",
        "address" => "NewJersey"
    }
]

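Due to the behaviour of the kv filter, this does not support whitespace in the values. I hope you don't have any in your real inputs, do you?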
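If whitespace does turn up, one possible way around it is to split the pairs on "&" before URL-decoding, so that decoded spaces can no longer act as separators. The sketch below shows the idea in plain Ruby with URI.decode_www_form from the standard library. It is not a Logstash config, and the raw string is just the tail of the log line from the question with a few parameters left out.

```ruby
require 'uri'
require 'json'

# Still URL-encoded request parameters, taken from the log line in the question.
raw = 'user_id=UserID&model=Titanium+S202&events=' \
      '%5B%7B%22eV%22%3A%22com.olx.southasia%22%2C%22eC%22%3A%22appUpdate%22' \
      '%2C%22eA%22%3A%22app_activated%22%2C%22eTz%22%3A%22GMT%2B05%3A30%22' \
      '%2C%22eT%22%3A%221432386324909%22%2C%22eL%22%3A%22packageName%22%7D%5D'

# Split on "&" and "=" first, then decode each key and value.
params = URI.decode_www_form(raw).to_h

events = JSON.parse(params['events'])
puts params['model']        # Titanium S202
puts events.first['eTz']    # GMT+05:30
```

Because the split happens before decoding, the space in the model value and the "+" in the time zone both survive.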

answered 2015-08-04T07:51:36.373