Solr: how can I rank exact term matches ahead of stemmed variants?
I am trying to get exact keyword matches ranked first in Solr 5.0.0 results.
For example, given these titles:
- Meditation bowls
- Goddess Bowles
- Heavenly bowls
- Bowling green
- 33 bowls Tibetan chants
- Rebirth of a dusty bowl
- Bowl of stars
If I search for the word bowl, the expected results are:
- Rebirth of a dusty bowl
- Bowl of stars
- Meditation bowls
- Goddess Bowles
- Heavenly bowls
- Bowling green
- 33 bowls Tibetan chants
Results containing the exact word should come first. My schema is shown below:
<fieldType name="text_wslc" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1"
generateNumberParts="1"
catenateWords="1"
catenateNumbers="1"
catenateAll="1"
preserveOriginal="1"
/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordRepeatFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
<filter class="solr.KStemFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1"
generateNumberParts="1"
catenateWords="1"
catenateNumbers="1"
catenateAll="1"
preserveOriginal="1"
/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordRepeatFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
<filter class="solr.KStemFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
I've seen KeywordRepeatFilterFactory used to rank exact matches first and stemmed variants after them, but that doesn't work for me.
You can add another field to schema.xml. It will hold a copy of the original field:
<field name="title" type="text_wslc" indexed="true" stored="true"/>
<field name="titleExact" type="text_wslcExact" indexed="true" stored="true"/>
<copyField source="title" dest="titleExact"/>
where text_wslcExact is defined something like this:
<fieldType name="text_wslcExact" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="20"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="20"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
The next step is to add this new field to your query and boost it. In your solrconfig.xml file, with the DisMax or eDisMax query parser, try something like this:
<str name="qf">title titleExact^10</str>
<str name="pf">title^10 titleExact^100</str>
Here is my source where you can get all the explanations.
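For context, here is a minimal sketch of where the qf and pf lines might sit in solrconfig.xml. The handler name /select and the choice of edismax as defType are assumptions for illustration, not taken from the original configuration; qf and pf are only honored by the DisMax and eDisMax query parsers.
<!-- Assumption: the default /select handler is used; adjust the name to your setup -->
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<!-- Assumption: eDisMax is chosen so that qf and pf take effect -->
<str name="defType">edismax</str>
<!-- Match on both fields, weighting the unstemmed copy higher -->
<str name="qf">title titleExact^10</str>
<!-- Extra boost when the query matches as a phrase -->
<str name="pf">title^10 titleExact^100</str>
</lst>
</requestHandler>
Because titleExact is filled through copyField at index time, existing documents need to be reindexed after the schema change before the boost has any effect.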