Sorting Solr results based on user clicks?

I am trying to sort Solr results based on the users' click log. I would like the results that users click most often to appear first. Does anyone know how to set up or implement this in Solr?

Many thanks.



1 answer

Good question. Your problem can be seen as a classic problem of collective intelligence, or wisdom of the crowd. The first step is to count clicks per URL for each specific query: every (query, URL) pair has a counter maintained for that tuple, and every time a user clicks on that URL after issuing that query, the counter increases by 1.

As a second step, Solr returns results for each query based on its own ranking and relevance algorithms (say LCS, the vector space model, etc.). For each (query, URL) pair it returns, you apply a formula that adds a specific value (based on the number of clicks) to the rank Solr gave the document, and then you display the results ordered by that overall rank.

Overall rank for the document = rank given by Solr + a numeric value you derive from the click count.
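As a rough illustration of the two steps above, here is a minimal Python sketch. The names `record_click`, `overall_rank`, and `click_weight` are hypothetical, the in-memory counter stands in for whatever store you actually use, and the plain linear boost is only one possible formula. Note that Solr scores are not normalized across queries, so the weight would need tuning.

```python
from collections import Counter

# One counter per (query, url) tuple. In-memory only; a real system
# would persist this somewhere (an assumption of this sketch).
click_counts: Counter = Counter()

def record_click(query: str, url: str) -> None:
    """Step 1: every click on a URL for a query increases its counter by 1."""
    click_counts[(query, url)] += 1

def overall_rank(solr_score: float, query: str, url: str,
                 click_weight: float = 0.1) -> float:
    """Step 2: overall rank = Solr score + value derived from the clicks.

    click_weight is a hypothetical tuning knob, not a Solr parameter.
    """
    return solr_score + click_weight * click_counts[(query, url)]
```

A pair that was never clicked simply contributes a boost of zero, so documents keep their Solr score until click data accumulates.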

For example, when you search for "iphone plan", Solr returns the following links, ordered from highest rank to lowest:

  • Apple
  • AT&T
  • Amazon

Now you look up the click count for every (query, URL) pair, i.e. {"iphone plan", Apple}, {"iphone plan", AT&T}, {"iphone plan", Amazon}, and you find that for this query AT&T has more clicks than Apple. Using your custom formula and giving the clicks some weight, you re-score the results above and change the display order.
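The re-ordering in this example could be sketched as follows. The Solr scores and click counts are made-up numbers chosen purely for illustration, and `rerank` and `click_weight` are hypothetical names.

```python
def rerank(results, clicks, click_weight=0.05):
    """Re-order (name, solr_score) pairs by solr_score + click_weight * clicks."""
    scored = [
        (name, solr_score + click_weight * clicks.get(name, 0))
        for name, solr_score in results
    ]
    # Highest overall rank first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Illustrative values only: Solr's original order for "iphone plan".
solr_results = [("Apple", 3.0), ("AT&T", 2.5), ("Amazon", 2.0)]
# Illustrative click counts for the query "iphone plan".
clicks_for_query = {"Apple": 10, "AT&T": 40, "Amazon": 5}

reranked = rerank(solr_results, clicks_for_query)
print([name for name, _ in reranked])
```

With these numbers AT&T moves ahead of Apple, while Amazon stays last because its click count is too small to close the gap.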

Note, however, that the formula you develop should not be exploitable by spammers, who could change your entire site ranking by generating a huge number of clicks for a certain document (say, using a robot :))
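One possible way to blunt this kind of abuse, shown here only as a sketch, is to count each user at most once per (query, URL) pair and to dampen and cap the boost. This is an assumption on my part rather than anything built into Solr; `DedupedClickLog`, `user_id`, `weight`, and `cap` are hypothetical names, and a determined spammer with many identities would still need other defences.

```python
import math
from collections import defaultdict

class DedupedClickLog:
    """Tracks distinct users per (query, url) instead of raw click totals."""

    def __init__(self):
        self._users = defaultdict(set)

    def record(self, query: str, url: str, user_id: str) -> None:
        # Repeated clicks from the same user do not grow the set.
        self._users[(query, url)].add(user_id)

    def boost(self, query: str, url: str,
              weight: float = 1.0, cap: float = 5.0) -> float:
        distinct = len(self._users.get((query, url), ()))
        # log1p dampens large counts; cap bounds the boost outright.
        return min(cap, weight * math.log1p(distinct))
```

With this scheme a robot hammering one document from a single account adds no more boost than a single genuine click.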

That is the logic. There are two ways to implement it:

  • Modify the Lucene Similarity class (i.e. first understand how Lucene does the scoring, and then inject your module into it).

  • Implement it as a separate routine on top of Solr.

Note: remember that getting the counts for (query, URL) pairs is not easy if you have a huge amount of data, in which case you will need to write some MapReduce jobs to get this done.
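The shape of such a counting job can be sketched in plain Python. This runs locally and only mimics the map, shuffle/sort, and reduce phases; it is not tied to any particular MapReduce framework. It assumes a hypothetical log format of one click per line, with the query and the URL separated by a tab.

```python
from itertools import groupby
from operator import itemgetter

def map_click(line):
    """Map: emit ((query, url), 1) for one well-formed 'query<TAB>url' line."""
    query, url = line.rstrip("\n").split("\t")
    yield (query.strip().lower(), url.strip()), 1

def reduce_clicks(key, values):
    """Reduce: sum all the 1s emitted for the same (query, url) key."""
    return key, sum(values)

def count_clicks(lines):
    mapped = [pair for line in lines for pair in map_click(line)]
    mapped.sort(key=itemgetter(0))  # stands in for the shuffle/sort phase
    return dict(
        reduce_clicks(key, (count for _, count in group))
        for key, group in groupby(mapped, key=itemgetter(0))
    )
```

Lowercasing the query in the mapper is a design choice of this sketch, so that "iPhone plan" and "iphone plan" share one counter.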


