In Elasticsearch, is there a way to reduce the importance of a set of search terms?

Ideally, I would like to reduce the importance of certain words like "shop", "store", "restaurant".

I would like "Jimmy Steak Restaurant" to score as highly as "Ralph Steak House" when a user searches for "Steak Restaurant". I hope to achieve this by drastically reducing the weight of the word "restaurant" (along with 20-50 other words).

Stop words work well for some words like "a", "the", "of", etc., but they are all or nothing.

Is there a way to specify a weight or boost value per word at the index or mapping level?

Perhaps I can accomplish this at the query level, but that could get unwieldy if I have 50 words whose impact I need to reduce.

This is a simplified example. In my real solution, I need to reduce the impact of quite a few search terms.

1 answer


I do not believe it is possible to specify term-level boosts at index time. In this thread, Shay mentions that it is possible in Lucene, but that it is a complex feature to expose through the API.

Another relevant thread suggests the same. There, Shay recommends trying to solve it with the custom_score query:

I think you should try and solve it on the search side first. If you know the search scale, you can either build a query that applies different boosts depending on the tag, or use the custom_score query.

The custom_score query is slower than other queries, but I suggest you run it and check whether it works for you (with actual data and the corresponding index size). The good thing is, if it's slow for you (and slow here means both latency and QPS under load), you can always add more replicas and more machines to share the load.
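To illustrate the first option from the quote (a query that applies different boosts per term), here is a minimal sketch in Python that only builds the request body. The field name `name`, the weights in `LOW_VALUE_TERMS`, and the helper `build_weighted_query` are all assumptions made up for illustration; they do not come from the threads, and the weights would need tuning against real data.

```python
import json

# Hypothetical weights: any term not listed here keeps the default boost.
LOW_VALUE_TERMS = {"restaurant": 0.1, "shop": 0.1, "store": 0.1}


def build_weighted_query(user_input, field="name", default_boost=1.0):
    """Build a bool query with one match clause per term, each with its own boost."""
    clauses = []
    for term in user_input.lower().split():
        boost = LOW_VALUE_TERMS.get(term, default_boost)
        clauses.append({"match": {field: {"query": term, "boost": boost}}})
    return {"query": {"bool": {"should": clauses}}}


if __name__ == "__main__":
    # Print the JSON body that would be sent to the search endpoint.
    print(json.dumps(build_weighted_query("Steak Restaurant"), indent=2))
```

Because the word list lives in one dictionary and the query is generated, having 50 down-weighted words does not make the query code itself any harder to maintain.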



There is an example of a custom_score query that does somewhat similar term-level boosting (although on a field that holds only a single term, so it may not apply here). It might be easier to move the script into a native script instead of using MVEL, since you will have a large word list.

Alternatively, perhaps add a synonym token filter that replaces words like "store", "restaurant", "shop", etc. with a single common token?
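A rough sketch of what such index settings might look like, again built in Python so the word list stays in one place. The filter name `generic_word_synonyms`, the analyzer name `name_analyzer`, and the replacement token `venue` are assumptions chosen for illustration; the rule uses the Solr-style `a, b => c` synonym format.

```python
import json

# Hypothetical list of low-value words to collapse into one token.
GENERIC_WORDS = ["store", "shop", "restaurant"]


def build_synonym_settings(words, replacement="venue"):
    """Return index settings whose analyzer maps every word in `words` to `replacement`."""
    # One Solr-style rule: "store, shop, restaurant => venue"
    rule = ", ".join(words) + " => " + replacement
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "generic_word_synonyms": {"type": "synonym", "synonyms": [rule]}
                },
                "analyzer": {
                    "name_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase first so the rule matches regardless of case.
                        "filter": ["lowercase", "generic_word_synonyms"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_synonym_settings(GENERIC_WORDS), indent=2))
```

Note that this only normalizes the listed words to one token; it does not lower their weight by itself, so it would likely need to be combined with a query-side boost on that token.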
