ElasticSearch - Statistical Facet by List Length
I have the following example mapping:
{
  "book": {
    "properties": {
      "author": {"type": "string"},
      "title": {"type": "string"},
      "reviews": {
        "properties": {
          "url": {"type": "string"},
          "score": {"type": "integer"}
        }
      },
      "chapters": {
        "include_in_root": 1,
        "type": "nested",
        "properties": {
          "name": {"type": "string"}
        }
      }
    }
  }
}
I would like to get a facet by the number of reviews, i.e. the length of the reviews array. For example, expressed in words, the results I need are: "100 documents with 10 reviews, 20 documents with 5 reviews, ..."
I am trying the following statistical facet:
{
  "query": {
    "match_all": {}
  },
  "facets": {
    "stat1": {
      "statistical": {"script": "doc['reviews.score'].values.size()"}
    }
  }
}
but it keeps failing:
{
  "error": "SearchPhaseExecutionException[Failed to execute phase [query_fetch], total failure; shardFailures {[mDsNfjLhRIyPObaOcxQo2w][facettest][0]: QueryPhaseExecutionException[[facettest][0]: query[ConstantScore(NotDeleted(cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter@a2a5984b)))],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: PropertyAccessException[[Error: could not access: reviews; in class: org.elasticsearch.search.lookup.DocLookup]
[Near : {... doc[reviews.score].values.size() ....}]
^
[Line: 1, Column: 5]]; }]",
  "status": 500
}
How can I achieve my goal?
ElasticSearch version is 0.19.9.
Here is my data:
{
  "author": "Mark Twain",
  "title": "The Adventures of Tom Sawyer",
  "reviews": [
    {
      "url": "amazon.com",
      "score": 10
    },
    {
      "url": "www.barnesandnoble.com",
      "score": 9
    }
  ],
  "chapters": [
    {"name": "Chapter 1"}, {"name": "Chapter 2"}
  ]
}
{
  "author": "Jack London",
  "title": "The Call of the Wild",
  "reviews": [
    {
      "url": "amazon.com",
      "score": 8
    },
    {
      "url": "www.barnesandnoble.com",
      "score": 9
    },
    {
      "url": "www.books.com",
      "score": 5
    }
  ],
  "chapters": [
    {"name": "Chapter 1"}, {"name": "Chapter 2"}
  ]
}
It looks like you are using curl to execute your request, and the curl statement looks something like this:
curl localhost:9200/my-index/book -d '{....}'
The problem is that, since you are using apostrophes to wrap the request body, you need to escape every apostrophe it contains. So your script should become:
{"script" : "doc['\''reviews.score'\''].values.size()"}
or
{"script" : "doc[\"reviews.score\"].values.size()"}
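To see why the original request fails, it can help to print what the shell actually hands to curl. The sketch below only illustrates the quoting rules (the variable names are made up and nothing is sent to ElasticSearch): the unescaped apostrophes disappear, leaving doc[reviews.score], which is exactly the expression quoted in the error message.

```shell
# Unescaped: each inner apostrophe closes or reopens the single-quoted string,
# so the shell drops them before curl ever sees the body.
broken='{"script": "doc['reviews.score'].values.size()"}'

# Escaped: '\'' closes the string, adds a literal apostrophe, and reopens it.
fixed='{"script": "doc['\''reviews.score'\''].values.size()"}'

printf '%s\n' "$broken"   # {"script": "doc[reviews.score].values.size()"}
printf '%s\n' "$fixed"    # {"script": "doc['reviews.score'].values.size()"}
```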
The second problem is that, from your description, it looks like you want a histogram facet or a range facet rather than a statistical facet. So I would suggest trying something like this:
curl "localhost:9200/test-idx/book/_search?search_type=count&pretty" -d '{
  "query" : {
    "match_all" : {}
  },
  "facets" : {
    "histo1" : {
      "histogram" : {
        "key_script" : "doc[\"reviews.score\"].values.size()",
        "value_script" : "doc[\"reviews.score\"].values.size()",
        "interval" : 1
      }
    }
  }
}'
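If fixed buckets are more useful than one bucket per count, a range facet might look roughly like the sketch below. The facet name range1 and the bucket boundaries are made-up examples, and it assumes that key_script and value_script are accepted by the range facet in your version the same way they are by the histogram facet. Keeping the body in a quoted here-document also sidesteps the apostrophe escaping, since the shell leaves everything between the EOF markers untouched.

```shell
# Build the request body; the quoted <<'EOF' disables all shell expansion,
# so the script can use plain apostrophes.
body=$(cat <<'EOF'
{
  "query": {"match_all": {}},
  "facets": {
    "range1": {
      "range": {
        "key_script": "doc['reviews.score'].values.size()",
        "value_script": "doc['reviews.score'].values.size()",
        "ranges": [
          {"to": 5},
          {"from": 5, "to": 10},
          {"from": 10}
        ]
      }
    }
  }
}
EOF
)
printf '%s\n' "$body"

# To run it (assumes a node on localhost:9200 and the test-idx index used above):
# curl "localhost:9200/test-idx/book/_search?search_type=count&pretty" -d "$body"
```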
The third problem is that the facet script will be called for every single record in the result set, and if you have a lot of results this can take a very long time. Therefore, I would suggest indexing an additional field called number_of_reviews, populated by your client with the number of reviews. Then your request simply becomes:
curl "localhost:9200/test-idx/book/_search?search_type=count&pretty" -d '{
  "query" : {
    "match_all" : {}
  },
  "facets" : {
    "histo1" : {
      "histogram" : {
        "field" : "number_of_reviews",
        "interval" : 1
      }
    }
  }
}'
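One way the client might fill in number_of_reviews before indexing is sketched below. The build_book_doc helper and the document id are hypothetical, the new field would presumably be mapped as an integer, and the naive printf templating assumes values that need no JSON escaping; a real client would more likely use a proper JSON library.

```shell
# Hypothetical helper: build_book_doc AUTHOR TITLE [REVIEW_JSON...]
# Counts the review objects it is given and stores that count alongside them.
build_book_doc() {
  author=$1
  title=$2
  shift 2
  count=$#                               # number of remaining arguments = number of reviews
  reviews=$(IFS=,; printf '%s' "$*")     # join the review objects with commas
  printf '{"author": "%s", "title": "%s", "reviews": [%s], "number_of_reviews": %d}\n' \
    "$author" "$title" "$reviews" "$count"
}

doc=$(build_book_doc "Mark Twain" "The Adventures of Tom Sawyer" \
  '{"url": "amazon.com", "score": 10}' \
  '{"url": "www.barnesandnoble.com", "score": 9}')
printf '%s\n' "$doc"

# To index it (assumes a node on localhost:9200; the id 1 is arbitrary):
# curl -XPUT "localhost:9200/test-idx/book/1" -d "$doc"
```

The count has to be recomputed by the client whenever the reviews of a document change, which is the price paid for avoiding the per-document script at query time.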