How to subtract timestamps in elasticsearch?

I have server logs dumping to elasticsearch. The logs contain entries such as 'action_id':'AU11nP1mYXS3pt6INMtU','action':'start','time':'March 31st 2015, 19:42:07.121' and 'action_id':'AU11nP1mYXS3pt6INMtU','action':'complete','time':'March 31st 2015, 23:06:00.271'. Entries with an identical action_id refer to one action, and I'm wondering how long it took for the action to complete.

I don't really know the elasticsearch way to formulate my question, but I'll try my best: how do I aggregate on "action_id" using a custom metric defined as the time it takes to go from 'action':'start' to 'action':'complete'?

I use kibana for visualization, if it helps.



2 answers


I looked at the example provided for scripted metric aggregation and modified it for this problem:

{
   "aggs": {
      "actions": {
         "terms": {
            "field": "action_id"
         },
         "aggs": {
            "duration": {
               "scripted_metric": {
                  "init_script": "_agg['delta'] = 0",
                  "map_script": "if (doc['action'].value == \"complete\"){ _agg.delta += doc['time'].value } else {_agg.delta -= doc['time'].value}",
                  "combine_script": "return _agg.delta",
                  "reduce_script": "duration = 0; for (d in _aggs) { duration += d }; return duration"
               }
            }
         }
      }
   }
}

      

First, it creates a bucket for each action_id with a terms aggregation.

A scripted metric is then computed for each bucket.

In the map step, it takes the "complete" timestamps as positive values and the others (i.e. "start") as negative values on each shard. Then, in the combine step, it just returns the per-shard total. Finally, in the reduce step, it adds up the totals from all shards (since the "start" and "complete" events can be on different shards) to get the actual duration. Assuming "time" is mapped as a date field, the result is in milliseconds.
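The sign trick behind the map, combine and reduce steps can be checked outside elasticsearch. The following is only a rough Python sketch of the same arithmetic, not elasticsearch code: the function names (epoch_millis, map_and_combine, reduce_shards) and the two-shard layout are made up for illustration, and the timestamps are assumed to be stored as UTC epoch milliseconds.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_millis(text):
    """Parse 'YYYY-MM-DD HH:MM:SS.fff' (assumed UTC) into epoch milliseconds."""
    dt = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=timezone.utc)
    return (dt - EPOCH) // timedelta(milliseconds=1)

def map_and_combine(shard_docs):
    """Per shard: add 'complete' timestamps, subtract all others, return the total."""
    delta = 0
    for doc in shard_docs:
        if doc["action"] == "complete":
            delta += doc["time"]
        else:
            delta -= doc["time"]
    return delta

def reduce_shards(shard_totals):
    """Across shards: sum the per-shard totals to get the duration."""
    return sum(shard_totals)

# Hypothetical layout: the two events of one action_id live on different shards.
shard_0 = [{"action": "start", "time": epoch_millis("2015-03-31 19:42:07.121")}]
shard_1 = [{"action": "complete", "time": epoch_millis("2015-03-31 23:06:00.271")}]

duration_ms = reduce_shards([map_and_combine(s) for s in (shard_0, shard_1)])
print(duration_ms)  # duration in milliseconds
```

Note that the sum only equals the duration when a bucket holds exactly one "start" and one "complete" event; a missing or duplicated event leaves a raw epoch value in the result.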

I'm not sure about the performance of this aggregation, but you can try it on your dataset. And note that this is still experimental functionality.



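It looks like elasticsearch is not designed for this kind of timing calculation. Tasks like this seem to be handled in logstash instead, using the elasticsearch filter plugin to look up the matching "start" event when the "complete" event is processed: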

https://www.elastic.co/guide/en/logstash/current/plugins-filters-elasticsearch.html



if [action] == "complete" {
   elasticsearch {
      hosts => ["es-server"]
      query => "action:start AND action_id:%{[action_id]}"
      fields => ["time", "started"]
   }

  date {
     match => ["[started]", "ISO8601"]
     target => "[started]"
  }

  ruby {
     code => "event['duration_hrs'] = (event['@timestamp'] - event['started']) / 3600 rescue nil"   
  }
}
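The arithmetic done by the ruby filter above can be sketched in Python as follows. This is only an illustration of the calculation, not logstash code: the function name duration_hrs and its arguments are hypothetical, and returning None for a missing start time is meant to mirror the "rescue nil" fallback.

```python
from datetime import datetime, timezone

def duration_hrs(completed, started):
    """Return the elapsed time in hours, or None if the start time is missing."""
    if completed is None or started is None:
        return None  # mirrors the "rescue nil" fallback in the ruby filter
    return (completed - started).total_seconds() / 3600

# The two timestamps from the question, assumed to be UTC.
started = datetime(2015, 3, 31, 19, 42, 7, 121000, tzinfo=timezone.utc)
completed = datetime(2015, 3, 31, 23, 6, 0, 271000, tzinfo=timezone.utc)

print(duration_hrs(completed, started))  # roughly 3.4 hours
print(duration_hrs(completed, None))     # no matching "start" event found
```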

      
