Implementing slowly resizing type 2 in PostgreSQL

I am planning to implement Type-2 SCD in PostgreSQL. The disadvantage of SCD is that you cannot have foreign keys referencing thos tables if used, as is often seen. In other words, I often see that referential integrity is considered in application code. This puts me as bad practice as it can be done directly in the database. Adding multiple triggers can even hide this implementation detail from coder applications.

I have come up with the following schemes. Everything is good?

One-to-many

--
-- One-to-Many
--
BEGIN;

   CREATE TABLE document(
      id serial not null,
      revision integer not null default 1,
      title varchar(30),
      primary key (id, revision)
   );

   CREATE TABLE page(
      id serial not null,
      title varchar(30),
      document_id integer not null,
      document_revision integer not null,
      foreign key (document_id, document_revision) references document(id, revision)
   );


   -- Insert the first revision
   INSERT INTO document (title) VALUES ('my first document');
   INSERT INTO page (title, document_id, document_revision) VALUES ('my first page', 1, 1);

   -- DEBUG: display
   SELECT * FROM document d inner join page p ON ( d.id = p.document_id and d.revision = p.document_revision );

   -- "update" the document, by inserting a new revision
   INSERT INTO document (id, revision, title) VALUES (1, 2, 'my first document, edited');

   -- update the references
   UPDATE page SET document_revision = 2 WHERE document_id = 1;

   -- DEBUG: display
   SELECT * FROM document d inner join page p ON ( d.id = p.document_id and d.revision = p.document_revision );

ROLLBACK;

      

Many-to-one

--
-- Many-to-One
--
BEGIN;

   CREATE TABLE page(
      id serial not null primary key,
      title varchar(30)
   );

   CREATE TABLE document(
      id serial not null,
      revision integer not null default 1,
      title varchar(30),
      page_id integer references page(id),
      primary key (id, revision)
   );

   -- Insert initial revision
   INSERT INTO page (title) VALUES ('my first page');
   INSERT INTO document (title, page_id) VALUES ('my first document', 1);
   INSERT INTO document (title, page_id) VALUES ('my second document', 1);

   -- DEBUG: display
   SELECT * FROM page p inner join document d on (p.id = d.page_id);

   -- destroy the link "from" the old revision
   UPDATE document SET page_id = NULL WHERE id=1;

   -- Add a new revision, referencing the page
   INSERT INTO document ( id, revision, title, page_id ) VALUES ( 1, 2, 'My First Document, edited', 1 );

   -- DEBUG: display
   SELECT * FROM page p inner join document d on (p.id = d.page_id);
   SELECT * FROM document;

ROLLBACK;

      

Many-to-many

--
-- Many-to-Many
--
BEGIN;
   CREATE TABLE page(
      id serial not null primary key,
      title varchar(30)
   );

   CREATE TABLE document(
      id serial not null,
      revision integer not null default 1,
      title varchar(30),
      primary key (id, revision)
   );

   CREATE TABLE page_contains_document(
      page_id integer not null references page(id),
      document_id integer not null,
      document_revision integer not null,
      foreign key (document_id, document_revision) references document( id, revision )
   );

   -- Insert initial revision
   INSERT INTO page (title) VALUES ('My First page');
   INSERT INTO document (title) VALUES ('My Fist Document');
   INSERT INTO page_contains_document (page_id, document_id, document_revision) VALUES (1, 1, 1);

   -- DEBUG: display
   SELECT p.title, d.title, d.revision FROM page p INNER JOIN page_contains_document pcd ON (p.id = pcd.page_id) INNER JOIN document d ON (d.id = pcd.document_id and d.revision = pcd.document_revision);

   -- Add a new document revision
   INSERT INTO document (id, revision, title) VALUES (1, 2, 'My Fist Document, edited');

   -- update the reference
   UPDATE page_contains_document SET document_revision=2 WHERE document_id=1;

   -- DEBUG: display
   SELECT p.title, d.title, d.revision FROM page p INNER JOIN page_contains_document pcd ON (p.id = pcd.page_id) INNER JOIN document d ON (d.id = pcd.document_id and d.revision = pcd.document_revision);

ROLLBACK;

      

+2


source to share


2 answers


OK. You seem to have completely missed the type 2 SCD point.

All data should be stored in one table, enclosed in parentheses by date (not revision numbers!).

so you can:

   id   ,  name    ,  valid_from, valid_to
  1111  , MyBook   , '2009-03-01', '9999-12-31'

After an update:
  1111  , Mybook   , '2009-03-01', '2009-06-20'
  1111  , Mybook   , '2009-06-21', '9999-12-31'

      

A similar structure should exist in the "pages" database operating with date and date.



The point is that now you can either get the latest version with:

select * from books where valid_to = '9999-12-31'

      

Or get a version that has been valid since April 1st

select * from books where valid_to >= '2009-04-01' and valid_from <= '2009-04-01'

      

Also in the page structure, you only need to keep the updated pages. You don't need a new copy of all pages for every revision.

+3


source


I know it's a little late to answer this question, but I think it might be helpful to people looking for the same thing as you.

We have written a module that implements SCD-Type 2 that is used with Django. It has been tested using PostgreSQL, so it should fit your requirements. It also covers OneToMany- and ManyToMany relationships.



See CleanerVersion on GitHub for more details or directly at https://github.com/swisscom/cleanerversion .

+2


source







All Articles