How to filter the "Hits" result returned by IndexSearcher.search()?

How can I reduce the size of the Hits object returned by IndexSearcher.search()?

I am currently doing something like:

Hits hits = indexSearch.search(query,filter,...);

Iterator hitsIt = hits.iterator();
int newSize=0;
while (hitsIt.hasNext()){
   Hit currHit = (Hit)hitsIt.next();

   if (hasPermission(currHit)){
      newSize++;
   }
}

      

However, this causes a serious performance problem when the number of hits is high (e.g. 500 or more), because every hit is checked one by one after the search has finished.

I've heard of something called "HitsCollector" or possibly "Collector" that should help improve performance, but I don't know how to use it.

I would be grateful if someone could point me in the right direction.

We are using Apache Lucene for indexing in the Atlassian Confluence web app.


2 answers


A Collector is a simple callback mechanism that is called once for every document that matches the query. You would use a collector like this:



import org.apache.lucene.search.HitCollector;

public class MyCollector extends HitCollector {

    // Called back once for every matching document,
    // with its document id and score.
    public void collect(int doc, float score) {
        // do whatever you have to in here
    }
}

..

HitCollector collector = new MyCollector();

indexSearch.search(query, filter, collector);
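To make the callback idea concrete for the permission check in the question, here is a minimal sketch that compiles without the Lucene jar. The nested HitCollector class is only a stand-in with the same collect(int, float) shape as the Lucene class used above, and PermissionChecker and PermissionCollector are hypothetical names invented for illustration. The loop in main simulates the searcher invoking the callback; in a real application the searcher would do that.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

public class CollectorSketch {

    // Stand-in for Lucene's HitCollector so the sketch compiles on its own.
    abstract static class HitCollector {
        public abstract void collect(int doc, float score);
    }

    // Hypothetical permission check keyed by document id (an assumption;
    // the question's hasPermission takes a Hit instead).
    interface PermissionChecker {
        boolean hasPermission(int doc);
    }

    // Records only the matching documents the user is allowed to see.
    static class PermissionCollector extends HitCollector {
        private final PermissionChecker checker;
        private final BitSet permitted = new BitSet();

        PermissionCollector(PermissionChecker checker) {
            this.checker = checker;
        }

        @Override
        public void collect(int doc, float score) {
            if (checker.hasPermission(doc)) {
                permitted.set(doc);
            }
        }

        // Number of permitted hits, the equivalent of newSize in the question.
        int size() {
            return permitted.cardinality();
        }

        BitSet permittedDocs() {
            return permitted;
        }
    }

    public static void main(String[] args) {
        // Documents the current user may read (made-up data).
        final Set<Integer> readable = new HashSet<Integer>(Arrays.asList(1, 3, 8));
        PermissionCollector collector =
                new PermissionCollector(doc -> readable.contains(doc));

        // Simulate the searcher calling back once per matching document.
        int[] matches = {1, 2, 3, 5};
        for (int doc : matches) {
            collector.collect(doc, 1.0f);
        }

        System.out.println(collector.size()); // prints 2
    }
}
```

Note that the permission check still runs once per matching document, so a collector on its own avoids building the Hits object but does not make an expensive hasPermission call any cheaper.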

      



For good performance, you will need to index the security information along with each document. How you do this depends on your security model. For example, if each document can be assigned the security roles that have permissions on it, index those roles with the document. Also check this question, of which this one is almost a duplicate.
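The role-based idea can be modelled in plain Java as follows. This is a sketch under stated assumptions, not Lucene code: RoleIndexSketch, index, readableBy and filterHits are hypothetical names, and the role names and document ids are made up. In Lucene terms it would roughly correspond to storing a role field on each document and restricting the search by that field; the exact API depends on the Lucene version bundled with Confluence.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public class RoleIndexSketch {

    // role -> ids of the documents that role may read, built at indexing time
    private final Map<String, BitSet> docsByRole = new HashMap<String, BitSet>();

    // Index a document together with the roles that have permission on it.
    void index(int doc, String... roles) {
        for (String role : roles) {
            docsByRole.computeIfAbsent(role, r -> new BitSet()).set(doc);
        }
    }

    // Union of all documents readable through any of the user's roles.
    BitSet readableBy(Collection<String> userRoles) {
        BitSet result = new BitSet();
        for (String role : userRoles) {
            BitSet docs = docsByRole.get(role);
            if (docs != null) {
                result.or(docs);
            }
        }
        return result;
    }

    // Restrict a set of search hits to the documents the user may read.
    BitSet filterHits(BitSet hits, Collection<String> userRoles) {
        BitSet allowed = readableBy(userRoles);
        allowed.and(hits);
        return allowed;
    }

    public static void main(String[] args) {
        RoleIndexSketch index = new RoleIndexSketch();
        index.index(0, "admin");
        index.index(1, "admin", "staff");
        index.index(2, "staff");
        index.index(3, "guest");

        // Pretend the query matched every document.
        BitSet hits = new BitSet();
        hits.set(0, 4);

        BitSet visible = index.filterHits(hits, Arrays.asList("staff"));
        System.out.println(visible); // prints {1, 2}
    }
}
```

The design point is that the permission decision becomes one set intersection per search instead of one hasPermission call per hit, which is why it scales better as the number of hits grows.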


