How to subtract timestamps in elasticsearch?
I have server logs dumping to elasticsearch. The logs contain entries such as 'action_id':'AU11nP1mYXS3pt6INMtU','action':'start','time':'March 31st 2015, 19:42:07.121' and 'action_id':'AU11nP1mYXS3pt6INMtU','action':'complete','time':'March 31st 2015, 23:06:00.271'. Entries with the same action_id refer to one action, and I'm wondering how long it took for the action to complete.
I don't really know the elasticsearch way to formulate my question, but I'll try my best: how do I aggregate on "action_id" using a custom metric defined as the time it takes to go from 'action':'start' to 'action':'complete'?
I use Kibana for visualization, if it helps.
I looked at the example provided for scripted metric aggregation and modified it for this problem:
{
  "aggs": {
    "actions": {
      "terms": {
        "field": "action_id"
      },
      "aggs": {
        "duration": {
          "scripted_metric": {
            "init_script": "_agg['delta'] = 0",
            "map_script": "if (doc['action'].value == \"complete\"){ _agg.delta += doc['time'].value } else {_agg.delta -= doc['time'].value}",
            "combine_script": "return _agg.delta",
            "reduce_script": "duration = 0; for (d in _aggs) { duration += d }; return duration"
          }
        }
      }
    }
  }
}
First, it creates a bucket for each action_id with a terms aggregation.
A scripted metric is then computed for each bucket.
In the map step, it takes the "complete" timestamps as positive values and the others (i.e. "start") as negative values, on each shard. Then, in the combine step, it just returns the per-shard result. Finally, in the reduce step, it adds up the results from all shards (since the "start" and "complete" events can be on different shards) to get the actual duration of the action.
I'm not sure about the performance of this aggregation, but you can try it on your dataset. And note that this is still experimental functionality.
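To make the signed-sum idea easier to follow, here is a rough Python simulation of the init/map/combine/reduce steps for a single action_id bucket. It is only a sketch and not how Elasticsearch runs the scripts internally. It assumes the time field is mapped as a date, so that its value is epoch milliseconds, and it uses ISO-8601 strings instead of the Kibana display format from the question. The names to_millis, map_combine, reduce_step and the two-shard layout are hypothetical.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def to_millis(ts):
    """Parse an ISO-8601 timestamp (assumed UTC) into epoch milliseconds."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f")
    return (parsed - EPOCH) // timedelta(milliseconds=1)

# Hypothetical layout: the two events of one action live on different shards.
shards = [
    [{"action": "start", "time": "2015-03-31T19:42:07.121"}],
    [{"action": "complete", "time": "2015-03-31T23:06:00.271"}],
]

def map_combine(shard_docs):
    """Run the init, map and combine steps for one shard."""
    delta = 0  # init step: start from zero
    for doc in shard_docs:  # map step: "complete" counts as +, anything else as -
        millis = to_millis(doc["time"])
        if doc["action"] == "complete":
            delta += millis
        else:
            delta -= millis
    return delta  # combine step: hand back the per-shard value

def reduce_step(shard_results):
    """Reduce step: add up the per-shard values."""
    return sum(shard_results)

duration_ms = reduce_step(map_combine(docs) for docs in shards)
print(duration_ms)  # duration in milliseconds
```

The result is only meaningful when a bucket contains exactly one "start" and one "complete" event. With a missing or duplicated event, the sum is dominated by a raw epoch value rather than a duration.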
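It looks like elasticsearch is not designed for computing durations like this at query time. Instead, logstash can do it at ingest time: when the "complete" event arrives, look up the matching "start" event and store the duration on the event.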
https://www.elastic.co/guide/en/logstash/current/plugins-filters-elasticsearch.html
if [action] == "complete" {
  elasticsearch {
    hosts => ["es-server"]
    query => "action:start AND action_id:%{[action_id]}"
    fields => ["time", "started"]
  }
  date {
    match => ["[started]", "ISO8601"]
    target => "[started]"
  }
  ruby {
    code => "event['duration_hrs'] = (event['@timestamp'] - event['started']) / 3600 rescue nil"
  }
}
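As a rough illustration of what this pipeline computes, here is a small Python sketch of the same lookup-and-subtract logic. It is not Logstash code. The start_times dictionary is a hypothetical stand-in for the elasticsearch lookup, the enrich function name is made up, and the sketch assumes the event's own time field plays the role of @timestamp and is already in ISO-8601 form.

```python
from datetime import datetime

# Hypothetical stand-in for querying "action:start AND action_id:<id>".
start_times = {"AU11nP1mYXS3pt6INMtU": "2015-03-31T19:42:07.121"}

def parse(ts):
    """Parse an ISO-8601 timestamp with fractional seconds."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f")

def enrich(event):
    """Add 'started' and 'duration_hrs' to a "complete" event."""
    if event.get("action") != "complete":
        return event  # only "complete" events are enriched
    started = start_times.get(event["action_id"])
    if started is None:
        event["duration_hrs"] = None  # roughly what `rescue nil` does
        return event
    event["started"] = started
    seconds = (parse(event["time"]) - parse(started)).total_seconds()
    event["duration_hrs"] = seconds / 3600  # seconds to hours
    return event

complete = enrich({
    "action_id": "AU11nP1mYXS3pt6INMtU",
    "action": "complete",
    "time": "2015-03-31T23:06:00.271",
})
print(complete["duration_hrs"])
```

Because the duration is stored as a plain numeric field on the "complete" event, it can be aggregated or charted in Kibana like any other number.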