Elasticsearch completion - creating a list of inputs with analyzers

I took a look at this article: https://www.elastic.co/blog/you-complete-me. However, creating multiple "inputs" requires writing some logic on the client. Is there a way to define an analyzer (perhaps using shingle or ngram / edge-ngram) that will generate multiple terms for the input?

Here's what I tried (and it obviously doesn't work):

DELETE /products/
PUT /products/
{
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type":"shingle",
                    "max_shingle_size":5,
                    "min_shingle_size":2
                }
            },
            "analyzer": {
                "autocomplete": {
                    "filter": [
                        "lowercase",
                        "autocomplete_filter"
                    ],
                    "tokenizer": "standard"
                }
            }
        }
    }, 
    "mappings": {
        "product": {
            "properties": {
                "name": {
                    "type": "string",
                    "copy_to": ["name_suggest"]
                },
                "name_suggest": {
                    "type": "completion",
                    "payloads": false,
                    "analyzer": "autocomplete"
                }
            }
        }
    }
}

PUT /products/product/1
{
    "name": "Apple iPhone 5"
}

PUT /products/product/2
{
    "name": "iPhone 4 16GB"
}

PUT /products/product/3
{
    "name": "iPhone 3 GS 16GB black"
}

PUT /products/product/4
{
    "name": "Apple iPhone 4 S 16 GB white"
}

PUT /products/product/5
{
    "name": "Apple iPhone case"
}

POST /products/_suggest
{
    "suggestions": {
        "text": "i",
        "completion": {
            "field": "name_suggest"
        }
    }
}

      



1 answer


I don't think there is a direct way to achieve this. I'm also not sure why you would need to store n-grammed tokens, given that Elasticsearch already stores the input text in an FST structure. Newer releases also allow fuzziness in the query: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html#fuzzy
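To make the fuzzy option concrete, here is a minimal sketch that builds the body for the same `_suggest` request used in the question, with fuzzy matching switched on. The helper name `fuzzy_suggest_body` and the default fuzziness of 1 are made up for illustration; the `fuzzy` / `fuzziness` keys follow the linked documentation, and their exact shape may differ between releases.

```python
import json


def fuzzy_suggest_body(text, field="name_suggest", fuzziness=1):
    """Build a completion suggest body with fuzzy matching enabled.

    `fuzzy_suggest_body` is a hypothetical helper; only the JSON keys
    come from the completion suggester documentation.
    """
    return {
        "suggestions": {
            "text": text,
            "completion": {
                "field": field,
                # Allow up to `fuzziness` edits between the typed text
                # and the stored suggestion.
                "fuzzy": {"fuzziness": fuzziness},
            },
        }
    }


if __name__ == "__main__":
    # Print the body so it can be pasted after "POST /products/_suggest".
    print(json.dumps(fuzzy_suggest_body("iphnoe"), indent=4))
```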

I can see that something like a shingle analyzer could generate the inputs for you, but there seems to be no way to do that yet. The _analyze endpoint can be used to generate tokens from the analyzer of your choice, and those tokens can be passed to the input field (with or without any other logic added). That way, you do not have to duplicate the analyzer logic in application code. This is the only way I can think of to get the desired input field.
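A rough sketch of the _analyze approach, under several assumptions: the cluster is at `http://localhost:9200`, the index and analyzer are the `products` / `autocomplete` pair from the question, and `name_suggest` is filled explicitly instead of through `copy_to`. The function names are hypothetical. The JSON request body for `_analyze` is an assumption about the Elasticsearch version; older releases took the analyzer as a query parameter instead.

```python
import json
import urllib.request


def analyze(text, index="products", analyzer="autocomplete",
            host="http://localhost:9200"):
    """Run `analyzer` over `text` via the _analyze endpoint (needs a live cluster).

    Host, index and analyzer defaults are assumptions taken from the question.
    """
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def tokens_to_inputs(analyze_response):
    """Collect the unique token strings from an _analyze response, in order."""
    seen = set()
    inputs = []
    for tok in analyze_response.get("tokens", []):
        term = tok["token"]
        if term not in seen:
            seen.add(term)
            inputs.append(term)
    return inputs


def build_document(name, analyze_response):
    """Build a product document whose completion field lists the analyzed tokens."""
    return {
        "name": name,
        "name_suggest": {"input": tokens_to_inputs(analyze_response)},
    }
```

With a running cluster, `build_document(name, analyze(name))` would give the body to index with `PUT /products/product/<id>`; the client only forwards tokens and never reimplements the shingle logic.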



Hope it helps.
