Rails / Sunspot / Solr: repeated indexing on inherited classes

We are building a Ruby on Rails application that uses Solr as its search engine. The following version numbers may be relevant to the issue described below:

  • Ruby: 1.9.2
  • Rails: 3.2.6
  • Sunspot: 1.3.0.rc5

Background

We have a model Feedback that is inherited by various subclasses. The class hierarchy looks like this (unidirectional inheritance):

Feedback
  |- Problem
  |- Question
  |- Suggestion
  |- Announcement

      

In the Feedback model, indexing is enabled with the following code:

searchable :auto_index => true, :auto_remove => true do
  string :type
  text :title, :boost => 2
  text :content
  integer :user_id
  time :created_at
  ...
end

      

Problem

The problem is that when we create, for example, a new Problem with the title "problemtitle", Sunspot triggers automatic indexing both for Problem and for the base class Feedback. When we search for feedback with the title "problemtitle" with

search = Feedback.solr_search do
    with(:type, type.capitalize)
    fulltext("problemtitle") {minimum_match 1}
    paginate(page: options[:page], per_page: options[:per_page])
end

      

we get two results. One of the results is a Problem and the other is a Feedback. This indicates that the record is indexed both under its own class and under its superclass in the hierarchy, which should be correct as far as I know.

The strange thing is that after reindexing with the command bundle exec rake sunspot:solr:reindex, the same search on Feedback for the title "problemtitle" returns only a single result, the Problem.

We solved this by adding :unless => proc {|model| model.class == Feedback} to the searchable definition in the Feedback model. This ensures that only subclasses of Feedback are automatically indexed.

Question

My question is whether this is the intended behavior or not (is it a feature or a bug?). I don't understand why reindexing treats models differently from automatic indexing at create time. Could this be a problem with how we implemented the class hierarchy?

If more information is required to answer my question, I will try to give it.

Regards,

Sebastian


2 answers


We solved this problem by extending the searchable block with an :unless option:



# Skip indexing for instances of the base class itself; subclasses are still indexed.
searchable :auto_index => true, :auto_remove => true,
  :unless => proc {|model| model.class == Feedback} do
  string :type
  text :title, :boost => 2
  text :content
  integer :user_id
  time :created_at
  ...
end
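As a minimal sketch of what the :unless proc decides, the snippet below evaluates the same proc against an instance of the base class and an instance of a subclass. It uses plain Ruby stand-in classes rather than real models, and the UnlessSketch module and its indexed? helper are hypothetical names introduced only for illustration; they are not part of Sunspot.

```ruby
# Plain Ruby stand-ins for the models in the question (not ActiveRecord).
class Feedback; end
class Problem < Feedback; end

module UnlessSketch
  # The same proc that is passed as :unless to searchable.
  SKIP_BASE_CLASS = proc { |model| model.class == Feedback }

  # Hypothetical helper: true when the :unless proc lets the record through.
  def self.indexed?(record)
    !SKIP_BASE_CLASS.call(record)
  end
end

puts UnlessSketch.indexed?(Feedback.new) # instance of the base class itself
puts UnlessSketch.indexed?(Problem.new)  # instance of a subclass
```

Because the proc compares model.class with Feedback exactly, only instances whose class is Feedback itself are skipped; a Problem is still a Feedback by inheritance, but its class is Problem, so it passes.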

      



Sebastian, I believe the problem is that Sunspot builds the primary Solr document id from the fully qualified class name and the record id:

def index_id_for(class_name, id) #:nodoc:
  "#{class_name} #{id}"
end

      

So if your record is indexed once as Feedback and then again as Feedback::Problem, Solr will hold two entries for it and return both of them when searched. Sunspot then maps each hit back to the database and loads the same record twice. When reindexing, the entire index is discarded and each record is indexed once under its current class, which is why there is only one result after the reindex.
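The effect can be sketched in plain Ruby, with a Hash standing in for the Solr index. This is only an illustration under assumptions: the IndexIdSketch module and its add helper are hypothetical, the class names "Feedback" and "Problem" are taken from the question's hierarchy, and only the id string format is copied from the index_id_for helper quoted above.

```ruby
module IndexIdSketch
  # Same string format as the index_id_for helper quoted above.
  def self.index_id_for(class_name, id)
    "#{class_name} #{id}"
  end

  # Hypothetical stand-in for the Solr index: a Hash keyed by document id.
  # Adding a document under an existing key overwrites it; a new key adds an entry.
  def self.add(index, class_name, id, title)
    index[index_id_for(class_name, id)] = { :title => title }
    index
  end
end

# Auto-indexing on create: record 1 is added under two different class names.
index = {}
IndexIdSketch.add(index, "Feedback", 1, "problemtitle")
IndexIdSketch.add(index, "Problem", 1, "problemtitle")
puts index.keys.inspect

# Reindex: start from an empty index and add the record once, under its current class.
reindexed = IndexIdSketch.add({}, "Problem", 1, "problemtitle")
puts reindexed.keys.inspect
```

The first index ends up with two keys for the same database row, while the rebuilt one has a single key, which matches the two-results-then-one behavior described in the question.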



We had a similar problem and the solution was to create our own InstanceAdapter for the STI classes and register it in an initializer:

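class StiInstanceAdapter < Sunspot::Adapters::InstanceAdapter

  # Database id of the wrapped record.
  def id
    @instance.id
  end

  # Build the Solr document id from the STI base class name, so that a
  # record gets the same id whether it is indexed as Feedback or as a subclass.
  def index_id
    Sunspot::Adapters::InstanceAdapter.index_id_for(@instance.class.base_class.name, id)
  end

end

# Use this adapter for Feedback and, through inheritance, its subclasses.
Sunspot::Adapters::InstanceAdapter.register(StiInstanceAdapter, Feedback)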
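To illustrate why keying the id on the base class removes the duplicate, here is a small plain Ruby sketch. It does not use Rails or Sunspot: the StiFeedback and StiProblem classes are hypothetical stand-ins, their base_class method only imitates the ActiveRecord method of the same name, and StiIdSketch.index_id mirrors the index_id method of the adapter above.

```ruby
# Stand-in for an STI base model; base_class imitates ActiveRecord's behavior
# of returning the root of the hierarchy for every subclass.
class StiFeedback
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def self.base_class
    StiFeedback
  end
end

class StiProblem < StiFeedback; end

module StiIdSketch
  # Mirrors the adapter's index_id: base class name plus record id.
  def self.index_id(instance)
    "#{instance.class.base_class.name} #{instance.id}"
  end
end

puts StiIdSketch.index_id(StiFeedback.new(1))
puts StiIdSketch.index_id(StiProblem.new(1))
```

Both calls produce the same string, so indexing the record once as the base class and once as the subclass would write to the same document id instead of creating a second entry.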

      

I know this is a little late, but hopefully it helps.
