Can I use Lucene for search in a business application?

I am developing a typical enterprise / business application, including orders, salespeople, contacts, reference data, etc. The system will have 100 or more users who enter new data, change data, etc. I need almost all of the tables in my application to be searchable.

One option is to run table queries like "select * from salespersons where name like '%searchtext%'" or something similar. But I was wondering if I could use Lucene (.net) for this.

The main requirement is that search should reflect changes within a few seconds. So if a user enters an order, for example, and then immediately searches for it, it should appear in the search results. (I.e., I cannot rely on an indexing job that runs every hour or half hour, or nightly.)

Is this something that will work well, or is there a better option?



2 answers


I have implemented something almost identical to what you describe. The indexed table was huge (more than 5 hours to index with Lucene), and search had to reflect changes in the DB within 5 minutes. I considered two approaches (I implemented the first one):



  • Index the table incrementally. Each row had a timestamp (last change). Every 5 minutes, a cron job starts a Java process that reads the rows that have changed since the last run, creates a text version of them, and then updates the Lucene index. Incremental indexing locks the table for 200-300 ms per roughly 1000 table rows. Obviously this depends on your system, database schema, etc. However, my experience is that it is definitely practical to implement. And lookups are orders of magnitude faster with Lucene than with database queries.

  • Use a dedicated thread for indexing. Whenever something changes in the DB, the code that actually runs the SQL statement sends a message (via a LinkedBlockingQueue) to the thread that updates the Lucene index. This way your updateDB() method on the main thread returns immediately after the database is updated and does not have to wait for Lucene indexing, while indexing still happens as soon as possible (usually a few ms later). One drawback is that Lucene uses locks stored on disk, so I guess there is some overhead in updating the index for every single row (although I haven't run any tests yet). A workaround would be to buffer updates in your indexing thread and flush them to disk every few seconds (again, performance depends on the ratio of updates to index queries).
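The second approach, including the buffering workaround, could be sketched roughly as follows. This is only an illustration under assumptions: the names IndexingWorker, rowChanged, and indexBatch are hypothetical, and the Consumer stands in for the actual Lucene work (in the Java API that would typically be IndexWriter.updateDocument for each changed row, followed by a single commit for the batch).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical sketch: a background thread that buffers index updates and flushes them in batches. */
class IndexingWorker implements AutoCloseable {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final Consumer<List<String>> indexBatch; // stand-in for "update Lucene index, then commit"
    private final long flushIntervalMillis;
    private final Thread thread;
    private volatile boolean running = true;

    IndexingWorker(Consumer<List<String>> indexBatch, long flushIntervalMillis) {
        this.indexBatch = indexBatch;
        this.flushIntervalMillis = flushIntervalMillis;
        this.thread = new Thread(this::run, "index-worker");
        this.thread.start();
    }

    /** Called by the code that just updated the database; returns immediately. */
    void rowChanged(String rowId) {
        pending.add(rowId);
    }

    private void run() {
        while (running) {
            try {
                // Let changes accumulate for one flush interval.
                Thread.sleep(flushIntervalMillis);
            } catch (InterruptedException e) {
                break; // close() was called; fall through to the final flush
            }
            flush();
        }
        flush(); // index whatever is still queued at shutdown
    }

    private void flush() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch); // take everything queued so far, in order
        if (!batch.isEmpty()) {
            indexBatch.accept(batch);
        }
    }

    /** Stops the worker after a final flush of any queued changes. */
    @Override
    public void close() {
        running = false;
        thread.interrupt();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The updateDB() method would call rowChanged(id) right after the database write succeeds. The flush interval is the trade-off mentioned above: a shorter interval makes changes searchable sooner, a longer one means fewer writes to the on-disk index.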



Yes, you can certainly use Lucene for this use case. I see some disadvantages:

  • You will be duplicating most of the information in the index, and you will have to implement something to keep the index and database in sync, which may not be trivial.
  • You will be hitting the database frequently to build this index (or delaying inserts, or just adding heavy load, depending on how you build it).
  • Real-time search is only available in the latest version of the official Lucene. I am not aware of the status of Lucene.net in this regard.

And the (big) potential advantage:



  • Lucene is likely to outperform the database's full-text indexing in both performance and quality of results.

The answers to this question may help: Lucene.Net Best Practice
