How to Implement Versioned Rows in PostgreSQL?

How to Implement Versioned Rows in PostgreSQL?

Post by Alan Gutierre » Tue, 07 Jan 2003 19:20:25



My application is gathering personel and patient data for a hospice. It must
keep track of all changes to patient stats, chart, and med sheet over time.
Therefore, I would like to keep versions of the rows in many of my tables.

This is my stab at a PostgreSQL implementation. I would greatly appreciate any
input, criticisms, dire warnings, etc.

I plan on adding a transaction table with a transaction_id column fed by a
sequence. The transaction_id will indicate the order of creation as well as
time of creation.

-- A row for each of our customers who subscribe to the application.

create table firm (                 -- firm => hospice, clinic, hospital
     firm_id integer not null,
     name varchar(32),               -- Just one example data column
     primary key (firm_id)
)
\g

-- A table to keep row of a transaction.

create table transaction (
     firm_id integer not null,
     transaction_id integer not null, -- Fed by sequence, one for each firm so
                                      -- we can part and merge databases by
                                      -- firm without collision!
     modified timestamp not null,
     modified_by person_id not null,
     primary key (firm_id, transaction_id),
     foreign key (firm_id) references firm
)
\g

-- Example versioned table.

create table person_history (           -- Base for patient and employee
     firm_id integer not null,
     person_id integer not null,
     transaction_id integer not null,
     first_name varchar(32),             -- Just two example data columns
     last_name varchar(32) not null,
     deleted boolean not null,
     primary key (firm_id, person_id, transaction_id)
)
\g

-- Show latest row view.

create view person as
select *
   from person_history
  where transaction_id  = ( -- In explain this subselect appears to use index!
            select max(transaction_id)
              from person_history as ph1
             where firm_id = ph1.firm_id
               and person_id = ph1.firm_id
        )
    and deleted = 0
\g
-- Time travel view.
create view person_as_of as
select *
   from person_history
  where transction_id = (
            select max(transaction_id)
              from person_history as ph1
             where firm_id = ph1.firm_id
               and person_id = ph1.firm_id
               and transaction_id <= (
                       select transaction_id
                         from past_transaction
                        limit 1
                   )
        )
    and deleted = 0
\g

In my application I can travel in time thus:

create temporary table past_transaction as
select transaction_id
   from transaction
  where modified <= '2002/12/2 17:59:00' -- the minute I turned 31
  order by firm_id desc, transaction_id desc,
  limit 1
\g
-- If only I could pass the view a parameter!
select * from person_as_of
\g

Thoughts:

* I can write a query and have it travel back in time by setting one variable.
   Neeto.

* No archive tables or such means no copying, new version is a simple insert.
   Good.

* With expliain the sub selects appear to use the indexes with aggregates, and
   if not I can alsways sort descending limit 1. Good.

* Even with complex joins on the latest view tables the query plans use
   the primary index for the sub select. Good.

* There is little need for vacuuming, since no updates are made to the
   busy tables of the application. Does this matter?

* Referenital integrity goes away from what I can see, since it won't
   understand the deleted column. Pity.

Questions:

* Is this viable or overly clever?

* Should I have a boolean latest column on a versioned table? This would mean
   update and vacuum, but potentially a faster query.

* Is there a penalty for long (how do you say?) concatenated keys in
   PostgreSQL?

* Any reason why this won't work with the coming distrubuted PostgreSQL?

* Where can I read about alternative implemenations for history/versioning?
   Nuances?

Thank you all for any input whatsoever.

Alan Gutierrez

 
 
 

How to Implement Versioned Rows in PostgreSQL?

Post by Tom La » Wed, 08 Jan 2003 01:24:58



> * Where can I read about alternative implemenations for history/versioning?

Have you looked at contrib/spi/timetravel.* ?

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

 
 
 

How to Implement Versioned Rows in PostgreSQL?

Post by Alan Gutierre » Wed, 08 Jan 2003 02:57:19




>>* Where can I read about alternative implemenations for history/versioning?

> Have you looked at contrib/spi/timetravel.* ?

>                    regards, tom lane

Thank you.

The timetravel.* uses dates on each table. The nice thing about the transaction
table scheme I've concocted, I think, is that I get a nice round sequential
number. Dates seem so fiddly. I also get to see what changed during a
transaction. I can also store who changed it with the transaction record.

 From looking at timetravel.*, at least I know I am not too ambitious. It seems
acceptable to have a view on the tables filtering out records. I've not had
experience with *large* databases yet. My big concern is that this versioning is
not going to scale. That the extra query for each row will be a real dog.

But, hey, the application needs versioning, gotta pay to play, right?

Maybe I can hack timetravel.* to do my bidding?

If anyone wants to caution (or reassure) this newbie, I would be very grateful.

Thank you.

Alan Gutierrez