Saving database data changes in commits (if not sqlite)

Question

Saving database data changes in commits (if not sqlite)

We have a Rails project that uses PostgreSQL specific queries, so it is awkward to use sqlite even in development mode. The problem is I would like to track schema changes in the database and skip on-demand migrations in progress, and I would also like to track db data changes with git so that I would not need to dump the db and upload it to my machine ... So basically I want to do a "git pull" and see the application working with the new schema and data.

What are the possible strategies here? The only thing that comes to my mind is to use a simple wrapper that takes the sql query, checks if it has any db-specific details, and rewrites them for the development environment so that we can still use sqlite. What else?

+1

version-control database

snitko Dec 15. '08 at 1:33

source to share

2 answers

Jonathan Leffler · Answer 1 · 2008-12-15T06:32:49+0000

I'm not sure I understand all the nuances of your question - especially the comments about using SQLite vs PostgreSQL. If it is a multi-DBMS system, then testing with multiple systems is good; if it should be a system with one DBMS, then working with several DBMS makes life senselessly hard.

Also, you are talking about tracking schema changes in the database ... is it keeping schema change information separate from the DBMS's own system directory, or do you really mean you want to track database schema changes (using something outside of the database - e.g. VCS)?

You're also talking about tracking "database data changes" which I use to mean "data in tables in a database". Again, I don't understand if you are thinking of some kind of data dump from the database that covers the differences between what was there, for example, a day ago and what is now, or whatever.

These problems may be the reason why you have not received a response for more than 4 hours.

When you talk about a "simple shell," you are not talking about something that I would call simple. It has to parse arbitrary SQL, figure out if any of them are specific to the DBMS, and then apply the rewrite rules. This is a non-trivial commitment. Getting the wrapper called in the correct places can also be nontrivial - it depends on the set of APIs you use to access the DBMS, among other things.

What else?

Use the same DBMS in both production and development?
Tracking only schema changes is not trivial. You need to keep track of the gist of the schema (like table name, column names, etc.), not what happened (yes, I reread Brooks' No Silver Bullet before) such as TabID (which can change without schemes are significantly different). However, the analysis will tell you if the schema is different.
Tracking data changes regardless of schema changes is also non-trivial. In general, the amount of such data is large. You can deal with a full archive or a full dump or export of the database, but provided that the data is presented in the same sequence, some caution may be required on your part each time. If you are unsure of the correct sequence, VCS will record huge changes due to order differences.

All of the above is consistent with the dreaded "it depends" answer. It depends on the:

Your DBMS
The size of your database
The volatility of your scheme
The volatility of your data

It only slightly depends on your VCS or platform, thankfully.

Taichman · Answer 2 · 2011-11-15T11:54:38+0000

If I understand correctly, you want to track both schema changes and data.

These are, in fact, two very different things -

Schema changes - this is discussed in several other questions ( here , here and here ). The main one of the answers to these questions is that you can either dump your schema into SQL files, or track them with a regular source of control (git, svn, etc.), or you can use a DB-specific SW (red- shutter, dbmaestro). However, this will not allow you to completely recreate an identical copy of the database on another server. What brings me down
Data changes . It's trickier because (as @jonathan wrote) it's hard to keep track of the changes the DB makes to the files. I suggest you check OffScale DataGrove . DataGrove tracks changes in the entire database (structure + data). You can tag versions at any point in time and revert to older database states with a simple command. It also allows you to create virtual, separate copies of the same database so that each team member can have their own separate database. All virtual copies are tracked to the same repository, so it is very easy to revert your database to a different version (what you called "git-pull").

Disclaimer - I work for OffScale :-)

Saving database data changes in commits (if not sqlite)

More articles: