Solr: how can I get exact term matches ranked ahead of stemmed matches?

I am trying to get exact keyword matches ranked first in Solr 5.0.0 results.

For example, given these indexed titles:

  • Meditation bowls
  • Goddess Bowles
  • Heavenly bowls
  • Bowling green
  • 33 bowls Tibetan chants
  • Rebirth of a dusty bowl
  • Bowl of stars

If I search for the word bowl, the expected results are:

  • Rebirth of a dusty bowl
  • Bowl of stars
  • Meditation bowls
  • Goddess Bowles
  • Heavenly bowls
  • Bowling green
  • 33 bowls Tibetan chants

Results containing the exact word should come first. My schema is shown below:

 <fieldType name="text_wslc" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
     <filter class="solr.WordDelimiterFilterFactory"
                             generateWordParts="1"
                             generateNumberParts="1"
                             catenateWords="1"
                             catenateNumbers="1"
                             catenateAll="1"
                             preserveOriginal="1"
                             />
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.KeywordRepeatFilterFactory"/>
     <filter class="solr.PorterStemFilterFactory"/>
     <filter class="solr.KStemFilterFactory"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
     <filter class="solr.WordDelimiterFilterFactory"
                             generateWordParts="1"
                             generateNumberParts="1"
                             catenateWords="1"
                             catenateNumbers="1"
                             catenateAll="1"
                             preserveOriginal="1"
                             />
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.KeywordRepeatFilterFactory"/>
     <filter class="solr.PorterStemFilterFactory"/>
     <filter class="solr.KStemFilterFactory"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
 </fieldType>

      

I've seen that using KeywordRepeatFilterFactory is supposed to give the exact match first and then the stemmed version, but that doesn't work for me.

java lucene solr solrj stemming




1 answer


You can add another field to schema.xml that holds a copy of the original field:

<field name="title" type="text_wslc" indexed="true" stored="true"/>
<field name="titleExact" type="text_wslcExact" indexed="true" stored="true"/>
<copyField source="title" dest="titleExact"/>

      

Where text_wslcExact is defined something like this:

<!-- No stemming here, so only the literal (lowercased) term matches. -->
<fieldType name="text_wslcExact" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="20"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="20"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

      



The next step is to add this new field to your query and boost it, so that documents matching the unstemmed term score higher. In your solrconfig.xml file, try something like this:

<str name="qf">title titleExact^10</str>
<str name="pf">title^10 titleExact^100</str>
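
For context, here is a minimal sketch of where those two parameters might sit. The qf and pf parameters are read by the dismax/edismax query parsers, so this assumes the query goes through a search handler that uses one of them. The handler name /select and the choice of edismax are assumptions for illustration, not something stated in the question:

```xml
<!-- Sketch only: handler name and defType are assumed, adjust to your setup. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- qf/pf only take effect with the dismax or edismax parser -->
    <str name="defType">edismax</str>
    <!-- Search both fields; a hit in the unstemmed copy is weighted 10x -->
    <str name="qf">title titleExact^10</str>
    <!-- Extra boost when the whole query appears as a phrase -->
    <str name="pf">title^10 titleExact^100</str>
  </lst>
</requestHandler>
```

With this in place, a query for bowl still matches "Meditation bowls" through the stemmed title field, but only titles containing the literal token bowl also match titleExact and pick up the larger boost. Documents have to be reindexed after the schema change so that the copyField populates titleExact.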

      

Here is my source where you can get all the explanations.
