How do Lucene phrase queries work without indexed positions and offsets?

Lucene lets you index terms with positions and offsets, but phrase search seems to work even without them. How can Lucene determine word order from the index without this information?



1 answer

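You may be confusing term vector positions/offsets with the positions stored in the inverted index.

Term vectors are not used for searching. Phrase queries read term positions from the postings lists of the inverted index, and those positions are indexed for ordinary text fields unless you turn them off.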

To exclude position information from the actual postings lists, use IndexOptions.DOCS_ONLY or IndexOptions.DOCS_AND_FREQS. If you do this, PhraseQuery will not work.
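To see why phrase matching depends on stored positions, here is a toy model in plain Java. It is not Lucene code: the class PositionalIndexSketch and its methods are hypothetical names made up for illustration, and the whitespace tokenizer is an assumption. It only shows that an in-order, adjacent match needs per-document positions, while a docs-only structure can say no more than "both terms occur somewhere".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only, not Lucene. All names are hypothetical.
class PositionalIndexSketch {
    // term -> (docId -> positions of the term inside that document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    // Tokenize on whitespace and record the position of every token.
    void add(int docId, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new HashMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Two-term phrase match: "second" must occur exactly one position after "first".
    boolean matchesPhrase(int docId, String first, String second) {
        List<Integer> a = positions(first, docId);
        List<Integer> b = positions(second, docId);
        for (int p : a) {
            if (b.contains(p + 1)) {
                return true;
            }
        }
        return false;
    }

    // All that a docs-only index could answer: both terms occur in the document.
    boolean containsBoth(int docId, String first, String second) {
        return !positions(first, docId).isEmpty() && !positions(second, docId).isEmpty();
    }

    private List<Integer> positions(String term, int docId) {
        return postings.getOrDefault(term, Collections.emptyMap())
                       .getOrDefault(docId, Collections.emptyList());
    }

    public static void main(String[] args) {
        PositionalIndexSketch idx = new PositionalIndexSketch();
        idx.add(1, "quick brown fox");
        idx.add(2, "brown quick fox");
        System.out.println(idx.matchesPhrase(1, "quick", "brown")); // true
        System.out.println(idx.matchesPhrase(2, "quick", "brown")); // false
        System.out.println(idx.containsBoth(2, "quick", "brown"));  // true
    }
}
```

Document 2 contains both words, but only the position lists reveal that they appear in the wrong order. Dropping positions from the postings removes exactly the data that matchesPhrase relies on.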

If you are willing to accept some inaccuracy, these settings can still be useful in combination with word n-grams (ShingleFilter) for a fast phrase "approximation". They are also the right choice for fields where proximity simply does not apply, such as numeric fields, unique id fields, etc.
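The shingle idea can be sketched in plain Java as well. Again, this is not Lucene's ShingleFilter: ShingleSketch, bigrams and approxPhrase are hypothetical names, the shingle size is fixed at two words, and tokenization is a simple whitespace split. Each word pair is indexed as a single term with document ids only, and a phrase is approximated by requiring every word pair of the phrase to be present.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only, not Lucene. All names are hypothetical.
class ShingleSketch {
    // shingle (word bigram) -> ids of documents containing it; no positions stored
    private final Map<String, Set<Integer>> docsOnly = new HashMap<>();

    // Split on whitespace and emit every pair of adjacent words as one term.
    static List<String> bigrams(String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.length; i++) {
            out.add(tokens[i] + " " + tokens[i + 1]);
        }
        return out;
    }

    void add(int docId, String text) {
        for (String shingle : bigrams(text)) {
            docsOnly.computeIfAbsent(shingle, s -> new HashSet<>()).add(docId);
        }
    }

    // Approximate phrase search: documents that contain every bigram of the phrase.
    // A one-word phrase has no bigrams and returns an empty set in this sketch.
    Set<Integer> approxPhrase(String phrase) {
        Set<Integer> result = null;
        for (String shingle : bigrams(phrase)) {
            Set<Integer> docs = docsOnly.getOrDefault(shingle, Collections.emptySet());
            if (result == null) {
                result = new HashSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new HashSet<>() : result;
    }

    public static void main(String[] args) {
        ShingleSketch idx = new ShingleSketch();
        idx.add(1, "the quick brown fox");
        idx.add(2, "brown fox the quick dog");
        idx.add(3, "quick brown cat saw the quick dog");
        // Doc 1 is a true match. Doc 3 is a false positive: it contains
        // "the quick" and "quick brown", but never "the quick brown" in a row.
        System.out.println(idx.approxPhrase("the quick brown"));
    }
}
```

Two-word phrases are matched exactly by this scheme. The inaccuracy appears with longer phrases, where the individual word pairs can occur in different parts of a document, as document 3 shows.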


