In Elasticsearch, how does sort interact with function_score?
My search request is below, followed by some specific questions.
search = {
    'query': {
        'function_score': {
            'score_mode': 'multiply',
            'functions': functions,
            'query': {
                'match_all': {}
            },
            'filter': {
                'bool': {
                    'must': filters_include,
                    'must_not': filters_exclude
                }
            }
        }
    },
    'sort': [{'_score': {'order': 'desc'}},
             {'time': {'order': 'desc'}}]
}
where functions looks like this:
[{'weight': 5.0, 'gauss': {'time': {'scale': '7d'}}},
{'weight': 3.0, 'script_score': {'script': "1+doc['scores.year'].value"}},
{'weight': 2.0, 'script_score': {'script': "1+doc['scores.month'].value"}}]
What happens when you run this query? Are documents scored by the function_score and then sorted after the fact with the sort array? What is _score now (note that the query is match_all), and does it do anything in the sort? If I reversed the order and put time before _score in the sort, what result should I expect?
Without function_score, a match_all query gives every document the same score: every doc gets 1.
With function_score, it will calculate all three function scores (all three functions apply to every document, because you don't have a filter on any individual function) and multiply them (because you have score_mode: multiply). So, approximately, the final score is function1_score * function2_score * function3_score. That score is what the sort uses as _score. If some _scores are equal, the time sort is used as the tie-breaker.
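To make the ordering concrete, here is a minimal Python sketch of that behaviour. It is only a model of the semantics, not how Elasticsearch implements it, and the documents and their per-function scores are made up for illustration. It also shows what changes if time is put before _score in the sort: time then decides the order and _score only breaks ties between documents with equal time.

```python
from math import prod

# Hypothetical documents; function_scores are invented values that already
# include each function's weight.
docs = [
    {"id": "a", "time": 3, "function_scores": [0.5, 4.0, 2.0]},
    {"id": "b", "time": 9, "function_scores": [1.0, 2.0, 2.0]},
    {"id": "c", "time": 5, "function_scores": [0.25, 2.0, 2.0]},
]

for doc in docs:
    # score_mode "multiply": the final _score is the product of the function scores
    doc["_score"] = prod(doc["function_scores"])

# sort: _score desc first, time desc only as a tie-breaker
by_score_then_time = sorted(docs, key=lambda d: (-d["_score"], -d["time"]))

# swapped sort: time desc first, _score desc only as a tie-breaker
by_time_then_score = sorted(docs, key=lambda d: (-d["time"], -d["_score"]))

print([d["id"] for d in by_score_then_time])  # ['b', 'a', 'c']
print([d["id"] for d in by_time_then_score])  # ['b', 'c', 'a']
```

Documents a and b both score 4.0, so in the first ordering the newer one (b) wins the tie. In the second ordering the score never matters because no two documents share the same time.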
The best thing to do is to take the request out of your app, paste it as JSON into, for example, the Marvel Sense panel, and test it with ?explain. This will give you a detailed explanation of each score calculation.
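One way to get that JSON out of the app is to dump the Python dict. The sketch below rests on a few assumptions: the two filter lists are hypothetical stand-ins (guessed from the ranges that appear in the explain output further down), the scripts use the plain year and month fields of the example document, and explain is set in the request body instead of the query string.

```python
import json

# Hypothetical stand-ins for the lists the application builds
filters_include = [{"range": {"year": {"gte": 1990}}}]
filters_exclude = [{"range": {"month": {"gte": 13}}}]

functions = [
    {"weight": 5.0, "gauss": {"time": {"scale": "7d"}}},
    {"weight": 3.0, "script_score": {"script": "1+doc['year'].value"}},
    {"weight": 2.0, "script_score": {"script": "1+doc['month'].value"}},
]

search = {
    "explain": True,  # ask for a score explanation with every hit
    "query": {
        "function_score": {
            "score_mode": "multiply",
            "functions": functions,
            "query": {"match_all": {}},
            "filter": {
                "bool": {
                    "must": filters_include,
                    "must_not": filters_exclude,
                }
            },
        }
    },
    "sort": [{"_score": {"order": "desc"}}, {"time": {"order": "desc"}}],
}

# Print JSON that can be pasted under a GET <index>/_search line in Sense
body = json.dumps(search, indent=2)
print(body)
```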
Let me give you an example: let's say we have a document containing "year":2015,"month":7,"time":"2015-07-06".
Running your query with _search?explain gives this very detailed explanation:
"hits": [
{
"_shard": 4,
"_node": "jt4AX7imTECLWH4Bofbk3g",
"_index": "test",
"_type": "test",
"_id": "3",
"_score": 26691.023,
"_source": {
"text": "whatever",
"year": 2015,
"month": 7,
"time": "2015-07-06"
},
"sort": [
26691.023,
1436140800000
],
"_explanation": {
"value": 26691.023,
"description": "function score, product of:",
"details": [
{
"value": 1,
"description": "ConstantScore(BooleanFilter(+cache(year:[1990 TO *]) -cache(month:[13 TO *]))), product of:",
"details": [
{
"value": 1,
"description": "boost"
},
{
"value": 1,
"description": "queryNorm"
}
]
},
{
"value": 26691.023,
"description": "Math.min of",
"details": [
{
"value": 26691.023,
"description": "function score, score mode [multiply]",
"details": [
{
"value": 0.2758249,
"description": "function score, product of:",
"details": [
{
"value": 1,
"description": "match filter: *:*"
},
{
"value": 0.2758249,
"description": "product of:",
"details": [
{
"value": 0.055164978,
"description": "Function for field time:",
"details": [
{
"value": 0.055164978,
"description": "exp(-0.5*pow(MIN[Math.max(Math.abs(1.4361408E12(=doc value) - 1.437377331833E12(=origin))) - 0.0(=offset), 0)],2.0)/2.63856688924644672E17)"
}
]
},
{
"value": 5,
"description": "weight"
}
]
}
]
},
{
"value": 6048,
"description": "function score, product of:",
"details": [
{
"value": 1,
"description": "match filter: *:*"
},
{
"value": 6048,
"description": "product of:",
"details": [
{
"value": 2016,
"description": "script score function, computed with script:\"1+doc['year'].value",
"details": [
{
"value": 1,
"description": "_score: ",
"details": [
{
"value": 1,
"description": "ConstantScore(BooleanFilter(+cache(year:[1990 TO *]) -cache(month:[13 TO *]))), product of:",
"details": [
{
"value": 1,
"description": "boost"
},
{
"value": 1,
"description": "queryNorm"
}
]
}
]
}
]
},
{
"value": 3,
"description": "weight"
}
]
}
]
},
{
"value": 16,
"description": "function score, product of:",
"details": [
{
"value": 1,
"description": "match filter: *:*"
},
{
"value": 16,
"description": "product of:",
"details": [
{
"value": 8,
"description": "script score function, computed with script:\"1+doc['month'].value",
"details": [
{
"value": 1,
"description": "_score: ",
"details": [
{
"value": 1,
"description": "ConstantScore(BooleanFilter(+cache(year:[1990 TO *]) -cache(month:[13 TO *]))), product of:",
"details": [
{
"value": 1,
"description": "boost"
},
{
"value": 1,
"description": "queryNorm"
}
]
}
]
}
]
},
{
"value": 2,
"description": "weight"
}
]
}
]
}
]
},
{
"value": 3.4028235e+38,
"description": "maxBoost"
}
]
},
{
"value": 1,
"description": "queryBoost"
}
]
}
}
So, for the gauss function the calculated result is 0.055164978. I don't know how relevant the details are to your question, so let's just assume the calculation is correct :-). Your gauss function's weight is 5, so its score becomes 5 * 0.055164978 = 0.27582489.
For the script function on year we have (1 + 2015) * 3 = 6048.
For the script function on month we have (1 + 7) * 2 = 16.
With score_mode multiply, the total score for this document is 0.27582489 * 6048 * 16 = 26691.023.
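The arithmetic can be checked with a few lines of Python. This is only a sketch built from the numbers in the explain output above. It assumes the gauss denominator is -scale^2 / (2 * ln(decay)) with the default decay of 0.5, which reproduces the 2.63856688924644672E17 shown in the explanation for a scale of 7d.

```python
import math

# Values copied from the explain output
doc_time_ms = 1436140800000   # "time": "2015-07-06"
origin_ms = 1437377331833     # origin reported by explain
offset_ms = 0.0

# Assumption: default decay of 0.5 at a distance of one scale (7 days)
scale_ms = 7 * 24 * 60 * 60 * 1000
decay = 0.5
sigma_sq = -scale_ms ** 2 / (2 * math.log(decay))

# Gaussian decay on the distance between the document's time and the origin
distance = max(abs(doc_time_ms - origin_ms) - offset_ms, 0)
gauss = math.exp(-0.5 * distance ** 2 / sigma_sq)

gauss_score = 5.0 * gauss        # weight 5
year_score = 3.0 * (1 + 2015)    # weight 3, script 1+doc['year'].value
month_score = 2.0 * (1 + 7)      # weight 2, script 1+doc['month'].value

# score_mode multiply
total = gauss_score * year_score * month_score
print(gauss, total)
```

The printed values should come out close to 0.0552 and 26691, with small differences from the explain output due to floating-point precision.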
There is also a section for each document that shows which values were used for sorting. In this case:
"sort": [
26691.023,
1436140800000
]
The first number is the _score calculated as shown above; the second is the millisecond representation of the date 2015-07-06.
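As a quick check of that second value, the following sketch converts the date to epoch milliseconds, assuming the date is interpreted as midnight UTC:

```python
from datetime import datetime, timezone

# Assumption: "2015-07-06" means 2015-07-06T00:00:00 in UTC
dt = datetime(2015, 7, 6, tzinfo=timezone.utc)
millis = int(dt.timestamp() * 1000)
print(millis)  # 1436140800000
```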