MongoDB query does not use the prefix of a compound index with a text field

I created the following index on my collection:

db.myCollection.createIndex({
  user_id: 1,
  name: 'text'
})

      

If I try to see the execution plan of a query containing both fields, for example:

db.getCollection('campaigns').find({ 
    user_id: ObjectId('xxx')
   ,$text: { $search: 'bla' } 
}).explain('executionStats')

      

I get the following results:

...
"winningPlan" : {
    "stage" : "TEXT",
    "indexPrefix" : {
        "user_id" : ObjectId("xxx")
    },
    "indexName" : "user_id_1_name_text",
    "parsedTextQuery" : {
        "terms" : [ 
            "e"
        ],
        "negatedTerms" : [],
        "phrases" : [],
        "negatedPhrases" : []
    },
    "inputStage" : {
        "stage" : "TEXT_MATCH",
        "inputStage" : {
            "stage" : "TEXT_OR",
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "user_id" : 1.0,
                    "_fts" : "text",
                    "_ftsx" : 1
                },
                "indexName" : "user_id_1_name_text",
                "isMultiKey" : true,
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 1,
                "direction" : "backward",
                "indexBounds" : {}
            }
        }
    }
}
...

      

As stated in the documentation, MongoDB can use index prefixes to perform indexed queries.

Since user_id is a prefix of the above index, I would expect a query on user_id alone to use the index, but if I try the following:

db.myCollection.find({ 
    user_id: ObjectId('xxx')
}).explain('executionStats')

      

I get:

...
"winningPlan" : {
    "stage" : "COLLSCAN",
    "filter" : {
        "user_id" : {
            "$eq" : ObjectId("xxx")
        }
    },
    "direction" : "forward"
},
...

      

So it doesn't use the index at all and does a full scan of the collection.
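
When comparing plans like the two above, it can help to flatten the nested winningPlan into a list of stage names. The following is a minimal sketch in plain JavaScript, not part of the original question: collectStages and usesCollectionScan are hypothetical helper names, and the two plan objects are trimmed copies of the explain output shown above.

```javascript
// Recursively collect the stage names found in an explain() plan tree.
// Child plans are read from `inputStage` (one child) or `inputStages` (several).
function collectStages(plan) {
  if (!plan || typeof plan !== 'object') return [];
  const children = [];
  if (plan.inputStage) children.push(plan.inputStage);
  if (Array.isArray(plan.inputStages)) children.push(...plan.inputStages);
  const stages = plan.stage ? [plan.stage] : [];
  for (const child of children) stages.push(...collectStages(child));
  return stages;
}

// True when the plan contains a full collection scan.
function usesCollectionScan(plan) {
  return collectStages(plan).includes('COLLSCAN');
}

// Trimmed copies of the two winning plans shown in the question.
const textPlan = {
  stage: 'TEXT',
  inputStage: {
    stage: 'TEXT_MATCH',
    inputStage: { stage: 'TEXT_OR', inputStage: { stage: 'IXSCAN' } }
  }
};
const prefixOnlyPlan = { stage: 'COLLSCAN', direction: 'forward' };

console.log(collectStages(textPlan).join(' > '));   // TEXT > TEXT_MATCH > TEXT_OR > IXSCAN
console.log(usesCollectionScan(prefixOnlyPlan));    // true
```

In a real session the plan object would come from the queryPlanner.winningPlan field of the explain() result rather than from a literal.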



2 answers


In general, MongoDB can use index prefixes to support queries. However, compound indexes that include geospatial or text fields are a special case: they are sparse compound indexes. If a document does not contain a value for any of the text field(s) in a compound index, it will not be included in the index.

To provide correct results for a prefix search, an alternate query plan will be selected over a sparse compound index:

If a sparse index would result in an incomplete result set for queries and sort operations, MongoDB will not use that index unless a hint() explicitly specifies the index.

Setting up some test data in MongoDB 3.4.5 to demonstrate a potential problem:

db.myCollection.createIndex({ user_id:1, name: 'text' }, { name: 'myIndex'})

// `name` is a string; this document will be included in a text index
db.myCollection.insert({ user_id:123, name:'Banana' })

// `name` is a number; this document will NOT be included in a text index
db.myCollection.insert({ user_id:123, name: 456 })

// `name` is missing; this document will NOT be included in a text index
db.myCollection.insert({ user_id:123 })

      



Then, forcing the use of a compound text index:

db.myCollection.find({user_id:123}).hint('myIndex')

      

The result only includes the one document with an indexed text field name, not the three documents expected:

{
  "_id": ObjectId("595ab19e799060aee88cb035"),
  "user_id": 123,
  "name": "Banana"
}
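
The missing documents can be reproduced with a small model in plain JavaScript. This is a sketch under a stated simplification, not MongoDB's actual implementation: hasTextIndexEntry is a hypothetical helper that treats a document as indexed only when the text field holds a string, and the three documents are the test data inserted above.

```javascript
// Simplified model: a document gets a text-index entry only when the text
// field holds a string (real text indexes also accept arrays of strings).
function hasTextIndexEntry(doc, textField) {
  return typeof doc[textField] === 'string';
}

// The three documents from the test data.
const docs = [
  { user_id: 123, name: 'Banana' },
  { user_id: 123, name: 456 },
  { user_id: 123 }
];

// Entries a { user_id: 1, name: 'text' } index would hold under this model.
const indexEntries = docs.filter((doc) => hasTextIndexEntry(doc, 'name'));

// Prefix lookup answered from the index alone vs. from a full collection scan.
const viaIndex = indexEntries.filter((doc) => doc.user_id === 123);
const viaCollectionScan = docs.filter((doc) => doc.user_id === 123);

console.log(viaIndex.length);          // 1
console.log(viaCollectionScan.length); // 3
```

Because the index-only lookup would silently drop two of the three matching documents, the planner falls back to a collection scan unless the index is forced with hint().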

      

This exception should be stated more clearly in the MongoDB documentation; watch or upvote DOCS-10322 in the MongoDB issue tracker for updates.



This is because text indexes are sparse by default:

For a compound index that includes a text index key along with keys of other types, only the text index field determines whether the index references a document. The other keys do not determine whether the index references the document.



The query filter does not reference the text index field, so the query planner will not consider that index: it cannot be sure that the index would return the full set of matching documents.
