PostgreSQL: in-row vs. out-of-row storage for text / varchar

A question with two parts:

  • How does PostgreSQL decide whether to store text / varchar values in-row or out of line? Am I correct in assuming that, with default settings, all columns are always stored in the row until the 2 kB size is reached?

  • Do we have control over the behavior described above? Is there a way to change the threshold for a specific column / table, or to force a single column to always be stored outside the row?

I've read the PostgreSQL TOAST documentation ( http://www.postgresql.org/docs/8.3/static/storage-toast.html ) but I don't see any option to change the threshold (by default it seems to be 2 kB per row) or to force a column to always be stored outside the row (EXTERNAL only allows it, but does not force it).

I found documentation explaining how to do this on SQL Server ( https://msdn.microsoft.com/en-us/library/ms173530.aspx ) but don't see anything like it for PostgreSQL.


If anyone is interested in my motivation: I have a table with a mix of short columns (ids, timestamps, etc.), a varchar(200) column, and a text / varchar(max) column that can be extremely long. I currently keep both varchars in a separate table so that storage, searches, and scans on the short columns stay efficient.

This is a pain, however, because I constantly have to do joins to read all the data. I would really like to keep all of the above fields in one table and tell PostgreSQL to always store the two varchar columns outside the row.


1 answer


Edited answer

On the first part of the question: you are right (see, for example, this).

On the second part of the question: by default, variable-length text fields are compressed once the row grows beyond about 2 kB and, if that is not enough, are moved to a separate area called the "TOAST table".

You can give the system a "hint" on how to store a column by using the following command:

-- Pick one of PLAIN, MAIN, EXTERNAL or EXTENDED; EXTERNAL is shown here.
ALTER TABLE YourTable
  ALTER COLUMN YourColumn SET STORAGE EXTERNAL;

      



From the manual:

SET STORAGE

This form sets the storage mode for a column. This controls whether this column is held inline or in a secondary TOAST table, and whether the data should be compressed or not. PLAIN must be used for fixed-length values such as integer and is inline, uncompressed. MAIN is for inline, compressible data. EXTERNAL is for external, uncompressed data, and EXTENDED is for external, compressed data. EXTENDED is the default for most data types that support non-PLAIN storage. Use of EXTERNAL will make substring operations on very large text and bytea values run faster, at the penalty of increased storage space. Note that SET STORAGE doesn't itself change anything in the table, it just sets the strategy to be pursued during future table updates. See Section 59.2 for more information.
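
As a rough illustration of the command and of the note that it only affects future writes, here is a minimal sketch. The table `documents` and its columns are hypothetical stand-ins for the table described in the question; `pg_attribute.attstorage` is the catalog column that records the per-column strategy.

```sql
-- Hypothetical table mirroring the one described in the question.
CREATE TABLE documents (
    id         bigint PRIMARY KEY,
    created_at timestamptz NOT NULL DEFAULT now(),
    title      varchar(200),
    body       text
);

-- Request out-of-line, uncompressed storage for both variable-length columns.
ALTER TABLE documents ALTER COLUMN title SET STORAGE EXTERNAL;
ALTER TABLE documents ALTER COLUMN body  SET STORAGE EXTERNAL;

-- Check the recorded strategy per column:
-- 'p' = PLAIN, 'm' = MAIN, 'e' = EXTERNAL, 'x' = EXTENDED.
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'documents'::regclass
  AND attnum > 0
  AND NOT attisdropped;
```

This only records the preference. Rows that were already in the table keep their current layout until they are written again, and the setting does not by itself force short values out of the row.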

Since the manual is not entirely explicit on this point, here is my interpretation: the final decision on how to store a field is left to the system in any case, subject to the following constraints:

  • No field can be stored in a way that makes the total row size exceed 8 kB
  • No field is stored out of line while the row it belongs to is smaller than TOAST_TUPLE_THRESHOLD

  • Once the previous constraints are met, the system tries to honor the SET STORAGE strategy specified by the user. If no storage strategy is specified, each TOAST-capable field is automatically treated as EXTENDED.
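
One way to test this interpretation against real data is to measure what the server actually did. The sketch below assumes a hypothetical table `documents` with a `text` column `body`; `octet_length`, `pg_column_size`, `pg_relation_size` and `pg_class.reltoastrelid` are standard PostgreSQL facilities.

```sql
-- Hypothetical table; skipped if a table of that name already exists (9.1+).
CREATE TABLE IF NOT EXISTS documents (
    id   bigint PRIMARY KEY,
    body text
);

-- Raw byte length versus the number of bytes used to store the value
-- (smaller than raw_bytes when the value was compressed).
SELECT id,
       octet_length(body)   AS raw_bytes,
       pg_column_size(body) AS stored_bytes
FROM documents
ORDER BY raw_bytes DESC
LIMIT 10;

-- Size of the main heap versus its TOAST table.
-- A TOAST table that stays empty means nothing has been moved out of line.
SELECT c.relname                AS table_name,
       pg_relation_size(c.oid)  AS heap_bytes,
       t.relname                AS toast_table,
       pg_relation_size(t.oid)  AS toast_bytes
FROM pg_class c
LEFT JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.oid = 'documents'::regclass;
```

Inserting a few rows of known sizes and re-running the second query after each insert shows at which size values start landing in the TOAST table.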

Under this assumption, the only way to ensure that all values of a column are stored out of line is to recompile the system with a TOAST_TUPLE_THRESHOLD smaller than the minimum size of any value in that column.
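
On newer servers there is also a per-table setting that can be tried before recompiling. The sketch below assumes PostgreSQL 11 or later (the question links the 8.3 documentation, where this parameter does not exist) and a hypothetical table `documents`. The `toast_tuple_target` storage parameter lowers the row size that TOAST tries to get down to; as far as I can tell it does not unconditionally force a column out of the row, so the effect should be verified by measurement.

```sql
-- Hypothetical table; skipped if a table of that name already exists.
CREATE TABLE IF NOT EXISTS documents (
    id   bigint PRIMARY KEY,
    body text
);

-- Assumes PostgreSQL 11+. 128 bytes is the smallest value the parameter accepts.
ALTER TABLE documents SET (toast_tuple_target = 128);

-- Confirm that the option was recorded for the table.
SELECT relname, reloptions
FROM pg_class
WHERE oid = 'documents'::regclass;
```

As with SET STORAGE, this applies to rows written after the change; existing rows are not rewritten.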
