Zend_Search_Lucene query parsing issue

Here's the setup: I have a Lucene index that works well with the 2000 documents I've indexed. I am using Luke (Lucene Index Toolbox, v.0.9.2) to debug queries, and I am on ZF 1.9.

The layout for my Lucene index looks like this:

I = Indexed
T = Tokenized
S = Stored

Fields:
author - ITS
category - ITS
publication - ITS
publicationdate - IS
summary - ITS
title - ITS

      

Basically, I have a search form built on the above fields, which lets you mix and match any of them and turns the input into a Zend Lucene query. That part is not a problem. The problem is that when I start combining terms, the "optimize" method that fires inside find() makes the query just disappear.

Here's an example search I'm running right now:

Form version:

Title: test title
Publication: publication name

      

Parsed Lucene query:

+(title:test title) +(publication:publication name)
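
For reference, here is a minimal sketch of how a form like this might be turned into a query string. It is not the code from my application: buildLuceneQuery() is a hypothetical helper, and it assumes each form value should be matched as a phrase. In Lucene query syntax a field prefix only binds to the term right after it, so in title:test title the second word is searched outside the title field unless the value is quoted. Whether that has anything to do with the empty optimized query described below is not confirmed.

```php
<?php
// Hypothetical helper, not part of Zend Framework: builds a Lucene query
// string from form fields, quoting each value so it stays bound to its field.
function buildLuceneQuery(array $fields)
{
    $clauses = array();
    foreach ($fields as $name => $value) {
        $value = trim($value);
        if ($value === '') {
            continue; // skip form fields the user left empty
        }
        // Drop embedded double quotes so the phrase cannot be closed early
        // (a simplification; real code may want proper escaping instead).
        $value = str_replace('"', '', $value);
        $clauses[] = '+(' . $name . ':"' . $value . '")';
    }
    return implode(' ', $clauses);
}

$searchQuery = buildLuceneQuery(array(
    'title'       => 'test title',
    'publication' => 'publication name',
    'author'      => '', // left empty on the form, so no clause is emitted
));
echo $searchQuery, "\n";
```

The resulting string can be passed to find() exactly like the hand-written one.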

      

Now if I take this query string, paste it into Luke and hit Search, it returns the results just fine. When I run it through the index's find() method, it fails. So I did a little research on how find() works and (I believe) found the problem.

First of all, here are the actual lines of code that perform the search:

$searchQuery = "+(title:test title) +(publication:publication name)";
$hits = new ArrayObject($this->index->find($searchQuery));  

      

This is a simplified version of the actual code, but it is what the real code generates.

Now, what I noticed after some debugging is that the "optimize" method just ruins the query itself. I wrote the following code to check:

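// Assumption: $searchQuery is the raw string from above, so it has to be
// parsed into a query object before rewrite() and optimize() can be called.
$query = Zend_Search_Lucene_Search_QueryParser::parse($searchQuery);
$rewrite = $query->rewrite($this->index);
$optimize = $rewrite->optimize($this->index);
echo "======<br/>";
echo "Original: ".$searchQuery."<br/>";
echo "Rewrite: ".$rewrite."<br/>";
echo "Optimized + Rewrite: ".$optimize."<br/>";
echo "======<br/>";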

      

Which outputs the following text:

======
Original: +(title:test title) +(publication:publication name)
Rewrite: +(title:test title) +(publication:publication name)
Optimized + Rewrite: 
======

      

Note that the third output line is completely empty. It looks like rewriting and then optimizing the query causes the query string to just empty itself.

Does anyone know why the optimize method is just deleting my query? Am I missing a filter or some interface that needs to be included for parsing? All queries work fine when I paste them into Luke and run them against the index manually, but something odd is happening with the way Zend parses the query to perform a search.

Any help is appreciated.

1 answer


I'll be honest: Zend_Search_Lucene (ZSL) doesn't work properly and hasn't been maintained for a long time.

It is also conceptually flawed. Let me explain why: a search engine exists to answer search queries quickly, and the problem with ZSL is that it is implemented in pure PHP. This means that on every request all the index files are read and loaded again, over and over. It cannot be fast.



There is nothing wrong with Lucene itself, and there is even a very good alternative called Solr, which is based on Lucene: it is a search server implemented in Java that can index your documents and answer all your Lucene queries. Because Solr runs as a server, you don't suffer the poor performance that comes from reloading all the Lucene files over and over.
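
As a rough illustration of what this could look like from PHP, here is a minimal sketch that builds a request URL for Solr's select handler. The host, port, and the core name "articles" are assumptions for the example, not something taken from the question, and the fetch itself is left commented out because it needs a running Solr server.

```php
<?php
// Hypothetical values: the host, port, and core name "articles" are assumptions.
$solrBase = 'http://localhost:8983/solr/articles/select';

$params = array(
    // Same search as in the question, with each value quoted as a phrase.
    'q'    => '+title:"test title" +publication:"publication name"',
    'rows' => 10,     // maximum number of documents to return
    'wt'   => 'json', // ask for a JSON response
);

// http_build_query() URL-encodes the query string for us.
$url = $solrBase . '?' . http_build_query($params);
echo $url, "\n";

// Fetching is left commented out because it needs a running Solr server:
// $response = json_decode(file_get_contents($url), true);
```

The PHP side only sends an HTTP request; the index stays loaded inside the Solr process between requests.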

This is a little off to the side of what you asked, but I waited two years for my ZSL bugs to be fixed, and in the end they were resolved by switching to Solr :)
