Postgres full text search with multiple columns, why concat in index and not at runtime?
I have found full text search in postgres in the past few days and I am a little confused about indexing when searching across multiple columns.
Postgres docs talk about creating an index ts_vector
on concatenated columns, for example:
CREATE INDEX pgweb_idx ON pgweb
USING gin(to_tsvector('english', title || ' ' || body));
which I can find like this:
... WHERE
(to_tsvector('english', title||' '||body) @@ to_tsquery('english', 'foo'))
However, if I wanted to sometimes search for just the name, sometimes just the body, and sometimes both, I would need 3 separate indices. And if I add in the third column, which could potentially be 6 indices, etc.
An alternative that I haven't seen in the docs is to just index the two columns separately and then just use a normal WHERE...AND
query:
... WHERE
(to_tsvector('english', title) @@ to_tsquery('english','foo'))
AND
(to_tsvector('english', body) @@ to_tsquery('english','foo'))
Benchmarking two by ~ 1 million lines seems to be pretty much the same as performance.
So my question is:
Why do I need to concatenate indexes like this and not just index the columns separately? What are the advantages / disadvantages of both?
My best guess is that if I knew ahead of time I would like to always look for both columns (never one at a time), I would only need one index, concatenating, using less memory.
Edit
source to share
- Using one index is easier / faster for the database;
- It is difficult to rank the results correctly when using two indices:
- You can assign relative weights to columns when creating a single index, so a match in
title
will be worth more than a match inbody
; - You are looking for a single word here, what happens if you search for multiple and they appear separately in different columns?
source to share