PostgreSQL large table update slows down

I am running an update on a large table (~8 GB). It is a simple update of 3 fields in the table. I had no problem running it on PostgreSQL 9.1: it would take 40-60 minutes, but it worked. I run the same query on a 9.4 database (just created, never updated before) and it starts updating but then slows down. It uses only ~2% CPU, the I/O level is 4-5 MB/s, and it just sits there. There are no locks and no other queries or connections; this one SQL update is the only thing running on the server.

Below is the SQL. The lookup table has 12 entries. The subquery can only return one row, because the lookup table splits the discrete SMALLINT scale (-32768 .. +32767) into non-overlapping ranges. The "src" and "dest" tables each have ~60 million records.

UPDATE dest SET
    field1 = src.field1,
    field2 = src.field2,
    field3_id = (SELECT lookup.id FROM lookup WHERE src.value BETWEEN lookup.min AND lookup.max)
FROM src
WHERE dest.id = src.id;
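
For reference, a minimal sketch of what such a lookup table could look like (the column names come from the query above; the boundary values here are made up):

CREATE TABLE lookup (
    id  integer PRIMARY KEY,
    min smallint NOT NULL,
    max smallint NOT NULL
);

-- The ranges must not overlap, so the correlated subquery
-- matches at most one row per src.value:
INSERT INTO lookup (id, min, max) VALUES
    (1, -32768, -30000),
    (2, -29999, -20000);
-- ... 12 rows in total, covering the whole SMALLINT scale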


I thought my disk was slow, but I can copy 1 GB files in parallel with the query and that is fast, > 40 MB/s, and I only have one disk (this is a virtual machine with iSCSI storage). All other disk operations are unaffected; there is plenty of I/O throughput to spare. Meanwhile, PostgreSQL just sits there, doing very little and working very slowly.

I have 2 virtualized Linux servers, one running PostgreSQL 9.1 and the other running 9.4. Both servers have nearly identical PostgreSQL configurations.

Does anyone else have similar experiences? I'm running out of ideas.

Edit: The query "ran" for 20 hours; I had to kill the connections and restart the server. Surprisingly, the following query did not kill the connection:

SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE pid <> pg_backend_pid() AND datname = current_database();


and the server produced the following log:

2015-05-21 12:41:53.412 EDT FATAL:  terminating connection due to administrator command
2015-05-21 12:41:53.438 EDT FATAL:  terminating connection due to administrator command
2015-05-21 12:41:53.438 EDT STATEMENT:  UPDATE <... this is 60,000,000 record table update statement>
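
(In hindsight this makes sense: pg_terminate_backend() only sends SIGTERM, and a backend stuck in an uninterruptible disk wait, the "D" state mentioned in the edit below, cannot act on the signal until its kernel I/O call returns.)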


The server restart also took a long time, producing the following log:

2015-05-21 12:43:36.730 EDT LOG:  received fast shutdown request
2015-05-21 12:43:36.730 EDT LOG:  aborting any active transactions
2015-05-21 12:43:36.730 EDT FATAL:  terminating connection due to administrator command
2015-05-21 12:43:36.734 EDT FATAL:  terminating connection due to administrator command
2015-05-21 12:43:36.747 EDT LOG:  autovacuum launcher shutting down
2015-05-21 12:44:36.801 EDT LOG:  received immediate shutdown request
2015-05-21 12:44:36.815 EDT WARNING:  terminating connection because of crash of another server process
2015-05-21 12:44:36.815 EDT DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.


"Postmaster ordered the server process to rollback the current transaction and exit because another server process has crashed and possibly corrupted shared memory ) is this a symptom of a bug in PostgreSQL?

Edit: I've tested 9.1, 9.3 and 9.4. Both 9.1 and 9.3 show no slowdown; 9.4 consistently slows down on large transactions. I noticed that when the transaction starts, htop shows high CPU usage and the process status is "R" (running). It then gradually changes to low CPU usage and status "D", uninterruptible disk wait (see the "Disk waiting" screenshot). My biggest question is: why is 9.4 different from 9.1 and 9.3? I have a dozen servers and this effect is seen across the board.
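
For anyone trying to reproduce this, a quick way to see what a "D"-state backend is blocked on in the kernel (a sketch; the column width and the process name are just what I used):

# show each backend's state (R/S/D) and the kernel wait channel it is blocked in
ps -o pid,stat,wchan:32,cmd -C postgres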

+3




3 answers


Thanks everyone for the help. No matter how hard I tried to emphasize that the performance difference was between 9.4 and previous versions under identical configurations, nobody really paid attention to it.

The problem was solved by disabling transparent huge pages:



echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

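To verify the change, reading the same sysfs file shows the active value in brackets. These settings also reset on reboot; one common way to make them permanent (an assumption about your setup, not something verified here) is the transparent_hugepage=never kernel boot parameter:

# the active value is the one in brackets
cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]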

Here are some resources I found helpful in solving the problem:
* https://dba.stackexchange.com/questions/32890/postgresql-pg-stat-activity-shows-commit/34169#34169
* https://lwn.net/Articles/591723/
* https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge

+3




I would suspect a lot of disk seeking: 5 MB/s is about what you get for very random I/O on a regular (spinning) hard drive.

Since you are basically replacing all of your rows, I would try setting the dest table's fillfactor to around 45% (alter table dest set (fillfactor=45);) and then running cluster dest using dest_pkey;. This lets updated row versions be placed on the same disk pages as the original rows.

Also, running cluster src using src_pkey; so that both tables have their data in the same physical order on disk can help as well.



Also remember to run vacuum dest; after every major update, so the space held by old row versions can be reused in subsequent updates.

Your old server has probably arrived at a workable fillfactor naturally over multiple updates. On the new server the table is 100% packed, so updated rows have to be placed at the end of the table.
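
Putting the above together, a sketch of the whole sequence (it assumes the primary-key indexes are named dest_pkey and src_pkey; adjust to your schema):

ALTER TABLE dest SET (fillfactor = 45);  -- leave room on each page for updated row versions
CLUSTER dest USING dest_pkey;            -- rewrite dest in index order, applying the new fillfactor
CLUSTER src USING src_pkey;              -- give src the same physical order as dest

-- after each big update:
VACUUM dest;                             -- make dead row versions reusable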

0




If only some of the target rows are actually being changed, you can avoid creating new row versions by comparing with IS DISTINCT FROM. This can prevent a lot of useless disk traffic.

UPDATE dest SET
    field1 = src.field1,
    field2 = src.field2,
    field3_id = lu.id
FROM src
JOIN lookup lu ON src.value BETWEEN lu.min AND lu.max
WHERE dest.id = src.id
        -- avoid unnecessary row versions to be generated
AND     (dest.field1 IS DISTINCT FROM src.field1
        OR dest.field2 IS DISTINCT FROM src.field2
        OR dest.field3_id IS DISTINCT FROM lu.id
        )
        ;
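
Note that IS DISTINCT FROM is used instead of <> because it treats NULL like an ordinary value, so the comparison still filters correctly when a field is NULL on either side.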


0








