Is there a way to optimize the keyword autocomplete query in mongo

I am creating an autocomplete service against my mongo db post collection where the user can start typing the post title. It should return all messages with this keyword in the header and sort by a field called "total".

I have a field called "lower" which is the lower case version of the header field we want to find and has an index on it. Since I am looking for any keyword match, I am doing a regular expression search against lowerCaseTitle for words that appear anywhere in the title, not just at the beginning.

I looked at the execution plan and it looks like it is looking at each item (the full message collection contains 10061 items). I've tried hinting at both index "lower_1" and index "total_-1" and they seem to be similar, but the overall index looks better with a lower nscanned number if I set the number to a limit of 50. Is there something what can i do to optimize? I can't think of anything simple off the top of my head for such a full text search.

"cursor" : "BtreeCursor lower_1",
    "nscanned" : 10061,
    "nscannedObjects" : 2,
    "n" : 2,
    "scanAndOrder" : true,
    "millis" : 154,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "lower" : [
            [
                "",
                {

                }
            ]
        ]
    }

      

+3


source to share


1 answer


I am doing a regex search against lowerCaseTitle for words that appear anywhere in the title, not just at the beginning.

From the documentation :

For simple prefixed queries (also called root regular expressions), such as /^prefix/,

, the database will use the index when available and appropriate (like most SQL databases that use indexes for expression LIKE 'prefix%'

). This only works if there are no flags i

(case insensitive).

In other words, no, MongoDB does not support search substrings in a swift manner.



However, to support words starting with a given string, you can do the following:

BlogPost { 
  Title : "This is furiously interesting post"
  TitleSearch : [ "this", "is", "furiously", "interesting", "post" ]
}

      

Now, with indexing TitleSearch

and using the root regexp, the search 'inter'

will return a pattern, also for 'furious'

, but not for 'eresting'

.

+6


source







All Articles