Can Lucene.net be used for a tag-based search engine?

I am developing an ASP.NET MVC 3 application which will have several hundred videos. I want to create a search engine based on tags and other parameters, such as the type of user who uploaded the video, the date of the video, the category of the video, etc.

I looked around and Lucene.NET seems to be a really good full-text search tool, but I don't know if it is the best fit for my project. The tutorials I read recommend keeping the search index as small as possible, but also say that you shouldn't have to hit the database for extra data that is not stored in the search index.

How is this possible?

Let's take an example: I have a video row (as a concept; the data is actually spread across several SQL tables) with columns for the video id, video name, video file name, full path, user id, user type, tags, creation date, video category, video subcategory, video location, etc. If I want to create a Lucene search index, I think I will need to put all of that information in it so that I can later query on each parameter, right?

That looks to me like a duplicate of the SQL database, with the added overhead of adding, editing and removing documents in the Lucene search index. Is this the normal scenario when using Lucene? All the examples I've seen with Lucene are based on a mailbox, message header, and message body.

What do you think? Can you shed some light on this?

1 answer


Yes, if you want to query multiple fields (including things like tags) from Lucene, you need to make that data available to Lucene. It might sound like duplication, but it's not redundant duplication: it's a restructuring of the data into a completely different layout, indexed for search.
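To make "a different layout, indexed for search" a bit more concrete, here is a minimal sketch in plain Python. It is not Lucene.NET and does not use its API; it only illustrates the idea of an inverted index, where each (field, value) pair points at the ids of the videos that contain it. The records, field names and the `build_index` / `search` helpers are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical flattened video records, one per video, as they might be
# assembled from several SQL tables before indexing.
videos = [
    {"id": 1, "name": "Intro to MVC", "user_type": "admin",
     "category": "tutorial", "tags": ["asp.net", "mvc"]},
    {"id": 2, "name": "Lucene basics", "user_type": "member",
     "category": "tutorial", "tags": ["lucene", "search"]},
    {"id": 3, "name": "Search in MVC", "user_type": "member",
     "category": "demo", "tags": ["mvc", "search"]},
]


def build_index(records, fields):
    """Map (field, value) -> set of record ids: a tiny inverted index."""
    index = defaultdict(set)
    for rec in records:
        for field in fields:
            value = rec[field]
            # Multi-valued fields (like tags) contribute one entry per value.
            values = value if isinstance(value, list) else [value]
            for v in values:
                index[(field, v)].add(rec["id"])
    return index


def search(index, **criteria):
    """Return the ids matching every field=value criterion (AND semantics)."""
    postings = [index.get((f, v), set()) for f, v in criteria.items()]
    return set.intersection(*postings) if postings else set()


index = build_index(videos, ["tags", "user_type", "category"])
matches = search(index, tags="mvc", user_type="member")
```

The same data is stored twice, but the index answers "which videos have this tag and this user type?" by intersecting small sets instead of scanning rows. The ids it returns can then be used to load the full records from the database if more detail is needed.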

It should work fine; this is pretty much how search works here on Stack Overflow (which uses Lucene.NET to do searches).



It should be noted, however, that a few hundred is not a large sample: to be honest, you could do it any way you like and it would take about the same amount of time. Writing a complex SQL query should work, as would full-text search in the database (which is how Stack Overflow search used to work), as would just filtering objects in memory (at the few-hundred level, you could simply cache all the data, excluding the video frames, in memory).
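As a rough illustration of the in-memory option, here is a minimal sketch in plain Python (used only to show the idea; the same shape works with a cached list and LINQ in .NET). The `Video` type, its fields, the sample data and the `find` helper are assumptions for the example, not anything from the question's actual schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Video:
    # Hypothetical metadata only; the video files themselves stay on disk.
    id: int
    name: str
    user_type: str
    category: str
    created: date
    tags: frozenset


# A few hundred of these fit comfortably in an application-level cache.
CACHE = [
    Video(1, "Intro to MVC", "admin", "tutorial", date(2011, 3, 1),
          frozenset({"asp.net", "mvc"})),
    Video(2, "Lucene basics", "member", "tutorial", date(2011, 5, 20),
          frozenset({"lucene", "search"})),
    Video(3, "Search in MVC", "member", "demo", date(2011, 7, 4),
          frozenset({"mvc", "search"})),
]


def find(videos, tags=(), user_type=None, category=None, since=None):
    """Linear scan; every criterion is optional and all given ones must match."""
    wanted = set(tags)
    return [
        v for v in videos
        if wanted <= v.tags  # the video must carry all requested tags
        and (user_type is None or v.user_type == user_type)
        and (category is None or v.category == category)
        and (since is None or v.created >= since)
    ]


recent_mvc = find(CACHE, tags=["mvc"], since=date(2011, 6, 1))
```

A full scan like this touches every record on every query, which is fine for hundreds of items and is the point at which a real index starts to pay off as the collection grows.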
