Lucene or SQL full-text search?
I want to create a search site for searching documents (all kinds of formats, including PDF), images, video and audio. I also want to be able to filter my search results by criteria such as author name, date, etc.
I am doing this in .NET, so what is the easiest way to get up and running? SQL full-text search seems tempting because I am familiar with SQL, and since I want to filter the search results, it will be easy to store the filter fields for each item.
If your main concern is getting up and running quickly and easily, then SQL full-text search is definitely the way to go.
Lucene.NET has its advantages, but getting it right is far from a walk in the park. The documentation is a bit lacking, and there are very few examples on the internet.
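To illustrate the SQL route, here is a minimal sketch of a full-text query combined with ordinary filter columns. The table dbo.Documents, its columns, the catalog name and the sample filter values are all hypothetical and only there for illustration. Note that indexing binary formats such as PDF depends on a suitable filter being installed on the server, and that images, video and audio would typically be searchable only through whatever text metadata you store for them.

```sql
-- Hypothetical table: one row per searchable item.
CREATE TABLE dbo.Documents
(
    Document_ID  int IDENTITY(1,1) NOT NULL,
    Author       nvarchar(200)  NOT NULL,
    Published_On date           NOT NULL,
    File_Type    nvarchar(8)    NOT NULL,  -- e.g. N'.pdf'; tells the indexer which filter to use
    Content      varbinary(max) NOT NULL,
    CONSTRAINT PK_Documents PRIMARY KEY (Document_ID)
);
GO

-- Hypothetical catalog name.
CREATE FULLTEXT CATALOG DocumentCatalog;
GO

-- Index the binary content, using File_Type as the type column.
CREATE FULLTEXT INDEX ON dbo.Documents
(
    Content TYPE COLUMN File_Type LANGUAGE 1033
)
KEY INDEX PK_Documents
ON DocumentCatalog;
GO

-- Sample values; in practice these would be parameters passed from .NET.
DECLARE @SearchTerm nvarchar(100) = N'full text search',
        @Author     nvarchar(200) = N'Some Author',
        @Since      date          = '2010-01-01';

-- Full-text match joined back to the table, then filtered and ranked.
SELECT
    d.Document_ID,
    d.Author,
    d.Published_On,
    ft.[RANK]
FROM FREETEXTTABLE(dbo.Documents, Content, @SearchTerm) AS ft
INNER JOIN dbo.Documents d
    ON d.Document_ID = ft.[KEY]
WHERE d.Author = @Author
    AND d.Published_On >= @Since
ORDER BY ft.[RANK] DESC;
```

Because the filter fields live in the same table as the indexed content, the filtering is just a regular WHERE clause on top of the full-text match.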
A stored procedure that returns highlighted fragments (snippets) of the matching text:
CREATE PROCEDURE SimpleCommentar
    @SearchTerm nvarchar(100),
    @Style nvarchar(200)
AS
BEGIN
    -- Collect the IDs of all rows that match the search term.
    CREATE TABLE #match_docs
    (
        doc_id bigint NOT NULL PRIMARY KEY
    );
    INSERT INTO #match_docs
    (
        doc_id
    )
    SELECT DISTINCT
        Commentary_ID
    FROM Commentary
    WHERE FREETEXT
    (
        Commentary,
        @SearchTerm,
        LANGUAGE N'English'
    );
    -- Look up the IDs needed to read the full-text index keywords.
    DECLARE @db_id int = DB_ID(),
        @table_id int = OBJECT_ID(N'
        @column_id int =
        (
            SELECT
                column_id
            FROM sys.columns
            WHERE object_id = OBJECT_I
                AND name = N'Commentary'
        );
    -- Build one snippet per document: wrap the matched term in a styled
    -- <span> and cut out the surrounding text (MIN keeps a single snippet).
    SELECT
        s.Commentary_ID,
        t.Title,
        MIN
        (
            N'...' + SUBSTRING
            (
                REPLACE
                (
                    c.Commentary,
                    s.Display_Term,
                    N'<span style="' + @Style + '">' + s.Display_Term + '</span>'
                ),
                s.Pos - 512,
                s.Length + 1024
            ) + N'...'
        ) AS Snippet
    FROM
    (
        -- Locate each indexed keyword of a matching document in its text.
        SELECT DISTINCT
            c.Commentary_ID,
            w.Display_Term,
            PATINDEX
            (
                N'%[^a-z]' + w.Display_Term + N'[^a-z]%',
                c.Commentary
            ) AS Pos,
            LEN(w.Display_Term) AS Length
        FROM sys.dm_fts_index_keywords_by_document
        (
            @db_id,
            @table_id
        ) w
        INNER JOIN dbo.Commentary c
            ON w.document_id = c.Commentary_ID
        WHERE w.column_id = @column_id
            AND EXISTS
            (
                SELECT 1
                FROM #match_docs m
                WHERE m.doc_id = w.document_id
            )
            -- Keep only keywords that the full-text parser derives
            -- from the search term.
            AND EXISTS
            (
                SELECT 1
                FROM sys.dm_fts_parser
                (
                    N'FORMSOF(FREETEXT, "' + @SearchTerm + N'")',
                    1033,
                    0,
                    1
                ) p
                WHERE p.Display_Term = w.Display_Term
            )
    ) s
    INNER JOIN dbo.Commentary c
        ON s.Commentary_ID = c.Commentary_ID
    INNER JOIN dbo.Book_Commentary bc
        ON c.Commentary_ID = bc.Commentary_ID
    INNER JOIN dbo.Book_Title bt
        ON bc.Book_ID = bt.Book_ID
    INNER JOIN dbo.Title t
        ON bt.Title_ID = t.Title_ID
    WHERE t.Is_Primary_Title = 1
    GROUP BY
        s.Commentary_ID,
        t.Title;
    DROP TABLE #match_docs;
END;