Elastic search query that returns incorrect results
I am doing a search query on the ElasticSearch index which gives me strange results. I want to find all documents where product.id = 209349
:
{
"index": "products",
"from": 0,
"size": 100,
"body": {
"query": {
"filtered": {
"filter": [
{
"term": {
"product.id": 209349
}
}
]
}
}
}
}
However, the results return me a document where product.id = 83875
. I noticed that it product.variant.id
is 209349
... What's going on here?
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "products",
"_type": "product",
"_id": "UPC-83875",
"_score": 1,
"_source": {
"mpn": "UPC-83875",
"product_count": 1,
"price": "448.00",
"price_discount_amount": null,
"product": [
{
"id": 83875,
"posted_on": "2014-07-23 22:08:36",
"status_id": 3,
"sku": "23469984",
"mpn": "UPC-83875",
"name": "Laser Toner Cartridge Set Black Cyan Yellow Magenta",
"description": "",
"has_image": true,
"currency_id": 1,
"price": "448.00",
"variant": [
{
"id": 209349,
"sku": "23469984",
"name": null,
"price": "448.00",
"discount_amount": null,
"price_total": "448.00",
"has_image": false
}
]
}
]
}
}
]
}
}
Here's the schema:
{
"dynamic": "strict",
"properties": {
"mpn": {
"type": "string",
"index": "not_analyzed"
},
"price": {
"type": "double"
},
"price_discount_amount": {
"type": "double"
},
"product": {
"properties": {
"id": {
"type": "long"
},
"posted_on": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"status_id": {
"type": "long"
},
"sku": {
"type": "string",
"index": "not_analyzed"
},
"mpn": {
"type": "string",
"index": "not_analyzed"
},
"name": {
"type": "string",
"analyzer": "english"
},
"description": {
"type": "string",
"analyzer": "english"
},
"has_image": {
"type": "boolean"
},
"price": {
"type": "double"
},
"price_discount_amount": {
"type": "double"
},
"currency_id": {
"type": "long"
},
"variant": {
"properties": {
"id": {
"type": "long"
},
"sku": {
"type": "string",
"index": "not_analyzed"
},
"name": {
"type": "string",
"analyzer": "english"
},
"discount_amount": {
"type": "double"
},
"price": {
"type": "double"
},
"price_total": {
"type": "double"
},
"has_image": {
"type": "boolean"
}
}
}
}
}
}
}
Change / solve. It looks like the problem was that my index _type
has a name product
as well as an internal object. So what was happening was mapped to ES <_type>.id
(sort of how to do *.id
) and it will match variant.id
because this is a field id
.
The correct name to use appears product.product.id
to end with the full path.
This behavior seems to have pleased some users for a long time: https://github.com/elastic/elasticsearch/issues/3005
source to share
I'm not sure why you got this result (it was not returned when I tried it), but you can try to match this using nested types :
PUT /test_index
{
"mappings": {
"doc": {
"properties": {
"mpn": {
"type": "string"
},
"price": {
"type": "string"
},
"product_count": {
"type": "long"
},
"product": {
"type": "nested",
"properties": {
"currency_id": {
"type": "long"
},
"description": {
"type": "string"
},
"has_image": {
"type": "boolean"
},
"id": {
"type": "long"
},
"mpn": {
"type": "string"
},
"name": {
"type": "string"
},
"posted_on": {
"type": "string"
},
"price": {
"type": "string"
},
"sku": {
"type": "string"
},
"status_id": {
"type": "long"
},
"variant": {
"type": "nested",
"properties": {
"has_image": {
"type": "boolean"
},
"id": {
"type": "long"
},
"price": {
"type": "string"
},
"price_total": {
"type": "string"
},
"sku": {
"type": "string"
}
}
}
}
}
}
}
}
}
When I created an index using this mapping and indexed your document:
PUT /test_index/doc/UPC-83875
{
"mpn": "UPC-83875",
"product_count": 1,
"price": "448.00",
"price_discount_amount": null,
"product": [
{
"id": 83875,
"posted_on": "2014-07-23 22:08:36",
"status_id": 3,
"sku": "23469984",
"mpn": "UPC-83875",
"name": "Laser Toner Cartridge Set Black Cyan Yellow Magenta",
"description": "",
"has_image": true,
"currency_id": 1,
"price": "448.00",
"variant": [
{
"id": 209349,
"sku": "23469984",
"name": null,
"price": "448.00",
"discount_amount": null,
"price_total": "448.00",
"has_image": false
}
]
}
]
}
Then a nested filter was used in my query , it was not returned:
POST /test_index/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "product",
"filter": {
"term": {
"product.id": 209349
}
}
}
}
}
}
}
...
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
However, it was returned by this query:
POST /test_index/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "product",
"filter": {
"term": {
"product.id": 83875
}
}
}
}
}
}
}
and this one:
POST /test_index/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "product.variant",
"filter": {
"term": {
"product.variant.id": 209349
}
}
}
}
}
}
}
Here's the code I used for testing:
http://sense.qbox.io/gist/8f5fa6f2ced088a42b92f495c0668024b9ef19c8
source to share