Distinct count is larger than doc_count in Elasticsearch aggs
I wrote an aggs query to get the total and the unique count, but the result is a little confusing.
The unique value is greater than doc_count.
Is that possible?
I know that the cardinality aggregation is experimental and only returns an approximate count of distinct values.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html
But this result is too far off. As you can see, there are many buckets whose unique count is greater than doc_count.
Is there a problem with the format of my request? Or is this a limitation of the cardinality aggregation?
About half a million documents are indexed,
and there are 15 distinct eventID values.
I am using ES 1.4.
request
{
  "size": 0,
  "_source": false,
  "aggs": {
    "eventIds": {
      "terms": {
        "field": "_EventID_",
        "size": 0
      },
      "aggs": {
        "unique": {
          "cardinality": {
            "field": "UUID"
          }
        }
      }
    }
  }
}
response
{
  "took": 383,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 550971,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "eventIds": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 165110,
          "unique": {
            "value": 27423
          }
        },
        {
          "key": "blue",
          "doc_count": 108376,
          "unique": {
            "value": 94775
          }
        },
        {
          "key": "yellow",
          "doc_count": 78919,
          "unique": {
            "value": 70094
          }
        },
        {
          "key": "green",
          "doc_count": 60580,
          "unique": {
            "value": 78945
          }
        },
        {
          "key": "black",
          "doc_count": 49923,
          "unique": {
            "value": 56200
          }
        },
        {
          "key": "white",
          "doc_count": 38744,
          "unique": {
            "value": 45229
          }
        },
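The inconsistency can be checked mechanically. The following is a minimal Python sketch, not part of the original query: it copies the bucket values shown in the response above into a dict and lists the buckets whose distinct-count estimate exceeds doc_count. It assumes each document carries exactly one UUID value, in which case a true distinct count can never exceed doc_count. The function name `impossible_buckets` is made up for illustration.

```python
# Bucket values copied from the response above:
# key -> (doc_count, cardinality estimate reported under "unique")
buckets = {
    "red": (165110, 27423),
    "blue": (108376, 94775),
    "yellow": (78919, 70094),
    "green": (60580, 78945),
    "black": (49923, 56200),
    "white": (38744, 45229),
}


def impossible_buckets(buckets):
    """Return the keys whose distinct-count estimate exceeds doc_count.

    Assumption: one UUID value per document, so the true number of
    distinct UUIDs in a bucket is at most its doc_count.
    """
    return [
        key
        for key, (doc_count, unique) in buckets.items()
        if unique > doc_count
    ]


print(impossible_buckets(buckets))  # ['green', 'black', 'white']
```

Three of the six buckets shown report more unique values than documents.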
EDIT: more tests
I tried again with a precision_threshold of 10000 and a query that filters on a single eventId,
but the error is the same. I expected a cardinality below 30,000, but got over 66,000 (which is more than the total number of documents).
doc_count: 65,672 (no problem). cardinality: 66,037 (more than doc_count). actual cardinality: about 23,000 (calculated with RDBMS scripts ...)
request
{
  "size": 0,
  "_source": false,
  "query": {
    "term": {
      "_EventID_": "packdownload"
    }
  },
  "aggs": {
    "unique": {
      "cardinality": {
        "field": "UUID",
        "precision_threshold": 10000
      }
    }
  }
}
response
{
  "took": 28,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 65672,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "unique": {
      "value": 66037
    }
  }
}
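To put a number on how far off this is, here is a small Python sketch of the arithmetic using the figures from the edit above. The actual cardinality is only known as "about 23,000", so the resulting error is a rough figure, and `relative_error` is a helper name chosen for illustration.

```python
def relative_error(estimate, actual):
    """Relative error of an estimate against a reference value."""
    return (estimate - actual) / actual


doc_count = 65672      # hits.total in the response above
estimate = 66037       # value returned by the cardinality aggregation
approx_actual = 23000  # "about 23,000", as computed outside Elasticsearch

# The estimate exceeds the number of matching documents.
print(estimate > doc_count)

# Rough relative error against the approximate reference value.
print(round(relative_error(estimate, approx_actual), 2))
```

Under that approximate reference value, the estimate is off by well over 100 percent, which is far more than the small error one would expect from an approximate distinct count.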