Parsing a JSON list in Logstash

I have JSON of the form:

[
    {
        "foo":"bar"
    }
]


I am trying to parse it with the json filter in Logstash, but it doesn't seem to work: apparently the json filter cannot parse a top-level JSON list. Can someone please suggest a workaround?

UPDATE

My logs

IP - - 0.000 0.000 [24/May/2015:06:51:13 +0000] *"POST /c.gif HTTP/1.1"* 200 4 * user_id=UserID&package_name=SomePackageName&model=Titanium+S202&country_code=in&android_id=AndroidID&eT=1432450271859&eTz=GMT%2B05%3A30&events=%5B%7B%22eV%22%3A%22com.olx.southasia%22%2C%22eC%22%3A%22appUpdate%22%2C%22eA%22%3A%22app_activated%22%2C%22eTz%22%3A%22GMT%2B05%3A30%22%2C%22eT%22%3A%221432386324909%22%2C%22eL%22%3A%22packageName%22%7D%5D * "-" "-" "-"


URL-decoded version of the above log entry

IP - - 0.000 0.000 [24/May/2015:06:51:13  0000] *"POST /c.gif HTTP/1.1"* 200 4 * user_id=UserID&package_name=SomePackageName&model=Titanium S202&country_code=in&android_id=AndroidID&eT=1432450271859&eTz=GMT+05:30&events=[{"eV":"com.olx.southasia","eC":"appUpdate","eA":"app_activated","eTz":"GMT+05:30","eT":"1432386324909","eL":"packageName"}] * "-" "-" "-"


Below is my config file for the specified logs.

filter {
    urldecode {
        field => "message"
    }
    grok {
        match => ["message", '%{IP:clientip}%{GREEDYDATA} \[%{GREEDYDATA:timestamp}\] \*"%{WORD:method}%{GREEDYDATA}']
    }
    kv {
        field_split => "&? "
    }
    json {
        source => "events"
    }
    geoip {
        source => "clientip"
    }
}

I need to parse events i.e. events=[{"eV":"com.olx.southasia","eC":"appUpdate","eA":"app_activated","eTz":"GMT+05:30","eT":"1432386324909","eL":"packageName"}]
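For illustration, the decoding step that the urldecode filter performs can be reproduced in plain Ruby. The shortened payload below is a hypothetical excerpt of the real events parameter, not the full one from the log:

```ruby
require 'cgi'
require 'json'

# A shortened, url-encoded "events" parameter as it appears in the raw log
encoded = '%5B%7B%22eV%22%3A%22com.olx.southasia%22%2C%22eC%22%3A%22appUpdate%22%7D%5D'

# What the urldecode filter does to it
decoded = CGI.unescape(encoded)
# decoded == '[{"eV":"com.olx.southasia","eC":"appUpdate"}]'

# The decoded value is a JSON *array* — that top-level list is what
# the json filter chokes on here
parsed = JSON.parse(decoded)
```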



1 answer


I assume you have the JSON in a file. You are correct: you cannot use the json filter on it directly. You will have to use the multiline codec first and apply the json filter afterwards.

The following configuration works for this input. However, you might have to change it to properly separate your events. It depends on your needs and the json format of your file.

Logstash config:

input {
    file {
        codec => multiline {
            pattern => "^\]" # Change this to properly separate your events
            negate => true
            what => previous
        }
        path => ["/absolute/path/to/your/json/file"]
        start_position => "beginning"
        sincedb_path => "/dev/null" # This is just for testing
    }
}

filter {
    mutate {
        gsub => [ "message", "\[", "" ]
        gsub => [ "message", "\n", "" ]
    }
    json { source => "message" }
}
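Outside Logstash, the effect of the mutate and json filters on the multiline-joined message can be sketched in plain Ruby (the message string below mimics the sample input from the question):

```ruby
require 'json'

# A multiline-joined message: the opening bracket plus the object lines.
# (The closing "]" line matches the ^\] pattern and starts the next event.)
message = "[\n    {\n        \"foo\":\"bar\"\n    }\n"

# mutate gsub: drop the bracket and the newlines
cleaned = message.gsub('[', '').gsub("\n", '')

# json filter: parse what is left
parsed = JSON.parse(cleaned)
```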


UPDATE

After your update, I think I found the problem. Apparently you are getting a jsonparsefailure because of the square brackets. As a workaround, you can remove them manually. Add the following mutate filter after your kv and before your json filter:

mutate {
    gsub => [ "events", "\]", "" ]
    gsub => [ "events", "\[", "" ]
}


UPDATE 2

Fine, suppose your input looks like this:

[{"foo":"bar"},{"foo":"bar1"}]


Here are five options:

Option a) ugly gsub

An ugly workaround would be another gsub:

gsub => [ "event","\},\{",","]


But this destroys the boundaries between the inner objects, and with them the grouping of the key/value pairs, so I guess you don't want to do that.
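To see why, here is the effect in plain Ruby: merging the objects produces one object with duplicate keys, and JSON.parse silently keeps only the last value for each key:

```ruby
require 'json'

raw = '{"foo":"bar"},{"foo":"bar1"}'

# The gsub collapses the two objects into one
merged = raw.gsub('},{', ',')
# merged == '{"foo":"bar","foo":"bar1"}'

# Duplicate keys: the parser keeps only the last one
parsed = JSON.parse(merged)
```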



Option b) split

A better approach might be to use a split filter:

split {
    field => "event"
    terminator => ","
}
mutate {
    gsub => [ "event", "\]", "" ]
    gsub => [ "event", "\[", "" ]
}
json {
    source => "event"
}

This will create several events: the first with foo = bar and the second with foo = bar1.
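The combined effect of those three filters can be simulated in plain Ruby. (The simulation strips the brackets after splitting; it only works because the sample objects contain no internal commas.)

```ruby
require 'json'

raw = '[{"foo":"bar"},{"foo":"bar1"}]'

events = raw.split(',')                       # split filter, terminator ","
            .map { |e| e.gsub(/[\[\]]/, '') } # mutate gsub strips the brackets
            .map { |e| JSON.parse(e) }        # json filter parses each piece
```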

Option c) mutate split

You might want to have all the values in one Logstash event. You can use the mutate filter's split option to generate an array and then parse the JSON for each entry that exists. Unfortunately, you will need a conditional for each entry, because Logstash does not support loops in its configuration.

mutate {
    gsub => [ "event", "\]", "" ]
    gsub => [ "event", "\[", "" ]
    split => [ "event", "," ]
}

json {
    source => "[event][0]"
    target => "[result][0]"
}

if [event][1] {
    json {
        source => "[event][1]"
        target => "[result][1]"
    }
    if [event][2] {
        json {
            source => "[event][2]"
            target => "[result][2]"
        }
    }
    # Add more conditionals if you expect even more dictionaries
}

Option d) Ruby

As per your comment, I also tried the ruby route. The following works (place it after your kv filter):

mutate {
    gsub => [ "event", "\]", "" ]
    gsub => [ "event", "\[", "" ]
}

ruby  {
    init => "require 'json'"
    code => "
        e = event['event'].split(',')
        ary = Array.new
        e.each do |x|
            hash = JSON.parse(x)
            hash.each do |key, value|
                ary.push( { key =>  value } )
            end
        end
        event['result'] = ary
    "
}
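The body of that ruby filter can be exercised outside Logstash on a sample bracket-stripped value. Note that the split(',') only works when no object contains an internal comma, i.e. when every object has a single key:

```ruby
require 'json'

# Sample value of the "event" field after the gsub mutations
e = '{"foo":"bar"},{"foo":"bar1"}'.split(',')

ary = []
e.each do |x|
  hash = JSON.parse(x)
  hash.each do |key, value|
    ary.push(key => value)
  end
end
```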


Option e) Ruby

Use this approach after the kv filter (without the mutate filter above):

ruby  {
    init => "require 'json'"
    code => "
            event['result'] = JSON.parse(event['event'])
    "
}
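The parse itself is equivalent to this plain-Ruby one-liner:

```ruby
require 'json'

# The bracket-wrapped value of the "event" field, as produced by kv
event_field = '[{"name":"Alex","address":"NewYork"},{"name":"David","address":"NewJersey"}]'

# JSON.parse handles the top-level array directly
result = JSON.parse(event_field)
```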


It will parse an event like event=[{"name":"Alex","address":"NewYork"},{"name":"David","address":"NewJersey"}] into:

"result" => [
    [0] {
           "name" => "Alex",
        "address" => "NewYork"
    },
    [1] {
           "name" => "David",
        "address" => "NewJersey"
    }

      

One caveat: the kv filter does not support whitespace in values. I hope your real inputs don't contain any?
