Elasticsearch: how to return unique values of two fields

I have an index with 20 different fields. I need to be able to pull out unique documents where the combination of "cat" and "sub" fields is unique. In SQL it would look like this: SELECT DISTINCT cat, sub FROM A; I can do it for one field this way:

{
"size": 0,
"aggs" : {
    "unique_set" : {
        "terms" : { "field" : "cat" }
    }
}}

      

But how do I add a second field, so that uniqueness is checked across both fields?

Thanks,



2 answers


Probably the only way to solve this is with nested aggregations:



{
    "size": 0,
    "aggs" : {
        "unique_set_1" : {
            "terms" : { "field" : "cat" },
            "aggs" : {
                "unique_set_2" : {
                    "terms" : { "field" : "sub" }
                }
            }
        }
    }
}
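
To get from that response to something like the SQL result, the nested buckets still have to be flattened on the client side. Here is a minimal sketch in Python. The function name `unique_pairs` and the sample response are made up for illustration; the response is hand-written and trimmed to the fields the code reads, and the aggregation names are the ones from the query above.

```python
def unique_pairs(response, outer="unique_set_1", inner="unique_set_2"):
    """Flatten a nested terms-aggregation response into (cat, sub, doc_count) rows."""
    rows = []
    for cat_bucket in response["aggregations"][outer]["buckets"]:
        # Each outer bucket carries the inner aggregation under its own name.
        for sub_bucket in cat_bucket[inner]["buckets"]:
            rows.append((cat_bucket["key"], sub_bucket["key"], sub_bucket["doc_count"]))
    return rows


# Hypothetical, trimmed response body (not real data).
sample_response = {
    "aggregations": {
        "unique_set_1": {
            "buckets": [
                {"key": "books", "doc_count": 5, "unique_set_2": {"buckets": [
                    {"key": "fiction", "doc_count": 3},
                    {"key": "history", "doc_count": 2},
                ]}},
                {"key": "music", "doc_count": 1, "unique_set_2": {"buckets": [
                    {"key": "jazz", "doc_count": 1},
                ]}},
            ]
        }
    }
}

pairs = unique_pairs(sample_response)
for cat, sub, count in pairs:
    print(cat, sub, count)
```

Keep in mind that a terms aggregation only returns the top 10 buckets unless you set "size" on it, so with many categories or subcategories you would have to raise "size" on both levels to see every pair.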

      



Quote:

I need to be able to pull out unique documents where the combination of "cat" and "sub" fields is unique.

This is ambiguous, so your question is unclear. You can have three unique pairs {cat, sub}, hundreds of unique triples {cat, sub, field_3}, and many more unique documents {cat, sub, field3, field4, ...}.

If you are interested in how many unique pairs such as {"Category X", "Subcategory Y"} exist, you can use a cardinality aggregation. Note that it returns an approximate count of distinct values, not the pairs themselves. For two or more fields you will need a script, which comes with a performance cost.



Example:

{
    "size": 0,
    "aggs" : {
        "multi_field_cardinality" : {
            "cardinality" : {
                "script": "doc['cat'].value + ' _my_custom_separator_ ' + doc['sub'].value"
            }
        }
    }
}
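
In case it is not obvious why the script puts a separator between the two values: without one, different pairs can collapse into the same string and be counted once. A small Python sketch of the idea follows. The helper `pair_key` and the sample documents are hypothetical, and the Python set counts exactly, whereas the cardinality aggregation is approximate.

```python
def pair_key(cat, sub, separator=" _my_custom_separator_ "):
    """Build the same kind of combined key the script above produces."""
    return cat + separator + sub


# Hypothetical documents: two distinct pairs, one of them repeated.
docs = [
    {"cat": "ab", "sub": "c"},
    {"cat": "a", "sub": "bc"},
    {"cat": "ab", "sub": "c"},
]

# Plain concatenation maps both distinct pairs to "abc".
without_separator = {pair_key(d["cat"], d["sub"], separator="") for d in docs}

# With a separator that cannot occur in the data, the pairs stay distinct.
with_separator = {pair_key(d["cat"], d["sub"]) for d in docs}

print(len(without_separator), len(with_separator))
```

So pick a separator that cannot appear in either field, otherwise the count can come out too low.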

      

Alternative solution: use nested terms aggregations.
