Is PostgreSQL fast at searching columns that hold arrays of strings?

According to "Can PostgreSQL index array columns?", PostgreSQL can index array columns.

Can it search on an array column as efficiently as it does for non-array types?

For example, suppose you have a row from a question table (like SO):

title: ...
content:...
tags: [ 'postgresql', 'indexing', 'arrays' ]

And you want to search for questions with the tag 'postgresql'. Would storing the relation in a join table be faster for lookups?

And yes, there will be an index on every column.



1 answer


GIN and GiST indexes are generally larger than a simple b-tree and take longer to scan. GIN is faster to search than GiST, at the cost of very expensive updates.

If you store the tags in an array column, then any update to a row will generally require an update to the index on the array. Under some circumstances HOT (heap-only tuple) updates let PostgreSQL skip this, but it's not something you can rely on. So you will have more index updates and more index bloat.

On the other hand, you avoid having to scan a b-tree for the ids of the objects you want and then fetch them from the main table via a join. You also save a fair bit of space by using an array instead of paying the 28 bytes of per-row overhead for each tag in the join table.
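To make the array option concrete, here is a minimal sketch. The table, column, and index names are hypothetical and only mirror the example in the question; they are not taken from any real schema.

```sql
-- Hypothetical schema modelled on the question's example row.
CREATE TABLE question (
    id      bigserial PRIMARY KEY,
    title   text   NOT NULL,
    content text   NOT NULL,
    tags    text[] NOT NULL DEFAULT '{}'
);

-- GIN index on the array column, using the default array operator class.
CREATE INDEX question_tags_gin ON question USING gin (tags);

-- The containment operator @> can use the GIN index.
-- The form 'postgresql' = ANY (tags) is not served by it.
SELECT id, title
FROM question
WHERE tags @> ARRAY['postgresql'];
```

Note that the query has to be written with an array operator such as `@>` for the planner to consider the GIN index.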



If your insert and update rate on the main table is fairly low, including changes to tags, then GIN may be a suitable choice. Otherwise, I would probably go for a typical b-tree on a join table, with a secondary index on (tag, object_id), so that index-only scans can be used to find the objects that have a given tag.
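A sketch of the join-table layout might look like the following. The names `item` and `item_tag` are made up for illustration; only the (tag, object_id) column order comes from the answer.

```sql
-- Hypothetical main table.
CREATE TABLE item (
    object_id bigserial PRIMARY KEY,
    title     text NOT NULL
);

-- Hypothetical join table: one row per (object, tag) pair.
CREATE TABLE item_tag (
    object_id bigint NOT NULL REFERENCES item (object_id),
    tag       text   NOT NULL,
    PRIMARY KEY (object_id, tag)
);

-- Secondary b-tree index led by tag. Both columns a tag lookup needs
-- are in the index, so an index-only scan is possible when the
-- visibility map is reasonably up to date (for example after VACUUM).
CREATE INDEX item_tag_tag_object_id ON item_tag (tag, object_id);

-- Ids of all objects carrying a given tag.
SELECT object_id
FROM item_tag
WHERE tag = 'postgresql';

-- The same lookup joined back to the main table.
SELECT i.object_id, i.title
FROM item_tag AS it
JOIN item AS i USING (object_id)
WHERE it.tag = 'postgresql';
```

Whether the planner actually picks an index-only scan depends on table statistics and how recently the table was vacuumed, so check the plan with `EXPLAIN`.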

In the end, the best thing to do is benchmark it against a simulation of your workload.
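One rough way to start such a benchmark is sketched below. The table names, row count, and tag distribution are all assumptions chosen for illustration; a real test should use data and queries shaped like your own workload, including the update traffic. Run it outside a transaction block, since `VACUUM` cannot run inside one.

```sql
-- Synthetic data: 100000 rows, two tags each, drawn from 1000 tag values.
CREATE TEMP TABLE bench_array AS
SELECT g AS id,
       ARRAY['tag' || (g % 1000)::text,
             'tag' || ((g * 7) % 1000)::text] AS tags
FROM generate_series(1, 100000) AS g;

CREATE INDEX bench_array_tags_gin ON bench_array USING gin (tags);

-- The same data flattened into a join-table shape.
CREATE TEMP TABLE bench_join AS
SELECT id AS object_id, unnest(tags) AS tag
FROM bench_array;

CREATE INDEX bench_join_tag_object_id ON bench_join (tag, object_id);

-- Refresh planner statistics and the visibility map.
VACUUM ANALYZE bench_array;
VACUUM ANALYZE bench_join;

-- Compare plans, timings, and buffer usage for the same lookup.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id FROM bench_array WHERE tags @> ARRAY['tag42'];

EXPLAIN (ANALYZE, BUFFERS)
SELECT object_id FROM bench_join WHERE tag = 'tag42';

-- Compare on-disk sizes of tables plus indexes.
SELECT pg_size_pretty(pg_total_relation_size('bench_array')) AS array_layout,
       pg_size_pretty(pg_total_relation_size('bench_join'))  AS join_layout;
```

This only measures read performance and size on freshly loaded data. The update cost and index bloat discussed above would need a separate run that mixes in inserts and tag changes.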
