Populating a Database in PostgreSQL

The PostgreSQL documentation page on populating a database (http://www.postgresql.org/docs/8.3/interactive/populate.html) says that to disable autocommit, you can simply place all insert statements between BEGIN; and COMMIT;.

However, I'm having a hard time handling any exceptions that might occur between BEGIN; and COMMIT;. If an error occurs (like trying to insert a duplicate PK), I cannot see how to explicitly invoke ROLLBACK or COMMIT from within the script. Although all the inserts are effectively rolled back, PostgreSQL still waits for an explicit COMMIT or ROLLBACK before it treats the aborted transaction as finished; until then, any subsequent statements in the script throw an error.

In a stored procedure, you can use an EXCEPTION clause to handle this, but the same does not apply to my situation of doing bulk inserts in a plain SQL script. I tried it, and the exception block did not work for me: every statement executed after the error fails with:

ERROR:  current transaction is aborted, commands ignored until end of transaction block

The transaction remains open because it was not explicitly completed with a COMMIT or ROLLBACK.
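
For comparison, here is the kind of EXCEPTION handling that does work inside PL/pgSQL (a minimal sketch; insert_friend is just a placeholder name I made up for illustration):

CREATE OR REPLACE FUNCTION insert_friend(_id integer, _name text)
RETURNS void AS $$
BEGIN
    INSERT INTO friends (id, name) VALUES (_id, _name);
EXCEPTION
    WHEN unique_violation THEN
        -- only the failed INSERT is rolled back; the surrounding
        -- transaction stays usable
        RAISE NOTICE 'duplicate id %, skipped', _id;
END;
$$ LANGUAGE plpgsql;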

Here's some sample code I used for testing:

BEGIN;
  SET search_path TO testing;
  INSERT INTO friends (id, name) VALUES (1, 'asd');
  INSERT INTO friends (id, name) VALUES (2, 'abcd');
  INSERT INTO friends (id, nsame) VALUES (2, 'abcd'); /* note the deliberate typo in the column name and the deliberately repeated PK value 2 */
EXCEPTION /* this part does not work for me */
    WHEN OTHERS THEN
        ROLLBACK;
COMMIT;

When using a method like this, do I really have to ensure that every single insert succeeds? Why is that? Is there no way to catch errors and explicitly trigger a rollback?

Thanks!



2 answers


If you do the inserts between BEGIN and COMMIT, then everything is automatically rolled back in case of an exception. Excerpt from the URL you posted: "An additional benefit of doing all insertions in one transaction is that if the insertion of one row were to fail then the insertion of all rows inserted up to that point would be rolled back, so you won't be stuck with partially loaded data."
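
To illustrate, against a hypothetical friends table whose id is the primary key (a sketch, not your exact data):

BEGIN;
INSERT INTO friends (id, name) VALUES (1, 'asd');
INSERT INTO friends (id, name) VALUES (1, 'dup');  -- fails: duplicate PK
-- every statement from here on reports:
--   ERROR:  current transaction is aborted, commands ignored until end of transaction block
COMMIT;  -- on an aborted transaction the server performs a ROLLBACK instead
-- neither row was inserted

Issuing COMMIT on an aborted transaction is harmless, so your loading script can simply end the block either way.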





When I initialize databases, i.e. create a series of tables/views/functions/triggers/etc. and/or load in raw data, I always use psql and variables to control the flow. I always add:

\set ON_ERROR_STOP

at the top of my scripts, so that psql aborts as soon as it hits any error. It sounds like that could help in your case too.
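
For example, a script like this (populate.sql is a name I made up; the friends table is from your example) stops at the failed insert and never reaches the COMMIT, so nothing is committed:

-- populate.sql
\set ON_ERROR_STOP on

BEGIN;
INSERT INTO friends (id, name) VALUES (1, 'asd');
INSERT INTO friends (id, name) VALUES (1, 'dup');  -- duplicate PK: psql stops here
INSERT INTO friends (id, name) VALUES (2, 'abcd'); -- never executed
COMMIT;                                            -- never reached

Run it with psql -f populate.sql yourdb; psql exits with a nonzero status and the server rolls the open transaction back when the session ends. Alternatively, psql --single-transaction -f populate.sql yourdb wraps the whole file in BEGIN/COMMIT for you (in that case, leave the explicit BEGIN and COMMIT out of the file).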



And in cases where I need to do some exception handling, I use anonymous code blocks like this:

DO $$
DECLARE _rec record;
BEGIN
    -- drop every schema except 'master', skipping the system schemas
    FOR _rec IN SELECT schema_name FROM information_schema.schemata
                WHERE schema_name NOT IN ('master', 'information_schema')
                  AND schema_name NOT LIKE 'pg\_%' LOOP
        EXECUTE 'DROP SCHEMA ' || quote_ident(_rec.schema_name) || ' CASCADE';
    END LOOP;
EXCEPTION WHEN others THEN
    NULL;  -- swallow any error so the rest of the script keeps running
END;$$;
DROP SCHEMA master CASCADE;







