Methods for comparing data between different schemas

Are there methods for comparing the same data stored in different schemas? The situation is as follows. I have a database with schema A that stores the data for one function in, say, 5 tables. During the upgrade process, schema A is migrated to schema B: some transformation logic is applied and the data ends up in 7 tables in schema B.

What I need is a way to check data integrity. Basically, I have to compare the data across the two schemas while factoring in the transformation logic. Other than writing custom T-SQL sprocs to compare the data, is there an alternative method? I'm leaning towards Python to automate this; are there any Python modules that can help me?

To better illustrate my question, the following diagram is a rough picture of one of the many datasets I need to compare. Properties 1, 2, 3 and 4 are carried over from the source schema to the destination schema, but are spread across different tables.

Table1Src                             Table1Dest
  |                                       |
  --ID(Primary Key)                       --ID(Primary Key)
  --Property1                             --Property1
  --Property2                             --Property5
  --Property3                             --Property6

Table2Src                             Table2Dest
  |                                       |
  --ID(Foreign Key->Table1Src)            --ID(Foreign Key->Table1Dest)
  --Property4                             --Property2
                                          --Property3

                                      Table3Dest
                                          |
                                          --ID(Foreign Key->Table1Dest)
                                          --Property4
                                          --Property7

      

0




3 answers


Basically, you should create object representations for both versions of the schema and then compare the objects. This is easiest if they all fit into memory at the same time; if not, you need to iterate over all the objects in one representation, look up the corresponding object in the other, compare them, and then do the same in reverse (the reverse pass catches objects that exist only on the other side).



The hard part may be getting the object representations; check whether SQLAlchemy can work with your tables conveniently. SQLAlchemy is, in principle, capable of mapping existing schema definitions to objects.
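As a rough illustration of the compare-both-ways idea, here is a minimal sketch using only the standard library. It assumes the table layout from the question's diagram and uses an in-memory SQLite database as a stand-in for the real server; the Record class, the two queries, and the helper names (build_demo_db, load, compare) are hypothetical and not part of any library. An ORM such as SQLAlchemy could supply the objects instead, and any transformation logic would have to be applied to the source records before comparing.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Schema-neutral object holding only the properties both schemas carry."""
    id: int
    property1: Optional[str]
    property2: Optional[str]
    property3: Optional[str]
    property4: Optional[str]


# Each query flattens one schema into the same column order (assumed layout).
SRC_QUERY = """
SELECT t1.ID, t1.Property1, t1.Property2, t1.Property3, t2.Property4
FROM Table1Src t1 LEFT JOIN Table2Src t2 ON t2.ID = t1.ID
"""
DEST_QUERY = """
SELECT t1.ID, t1.Property1, t2.Property2, t2.Property3, t3.Property4
FROM Table1Dest t1
LEFT JOIN Table2Dest t2 ON t2.ID = t1.ID
LEFT JOIN Table3Dest t3 ON t3.ID = t1.ID
"""


def load(conn, query):
    """Build {id: Record} from one schema."""
    return {row[0]: Record(*row) for row in conn.execute(query)}


def compare(src, dest):
    """Return (ids missing in dest, ids only in dest, ids whose values differ)."""
    missing = sorted(src.keys() - dest.keys())   # forward pass
    extra = sorted(dest.keys() - src.keys())     # reverse pass
    changed = sorted(k for k in src.keys() & dest.keys() if src[k] != dest[k])
    return missing, extra, changed


def build_demo_db():
    """Demo data only: both schemas live in one in-memory database here."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Table1Src (ID INTEGER PRIMARY KEY, Property1 TEXT, Property2 TEXT, Property3 TEXT);
    CREATE TABLE Table2Src (ID INTEGER, Property4 TEXT);
    CREATE TABLE Table1Dest (ID INTEGER PRIMARY KEY, Property1 TEXT, Property5 TEXT, Property6 TEXT);
    CREATE TABLE Table2Dest (ID INTEGER, Property2 TEXT, Property3 TEXT);
    CREATE TABLE Table3Dest (ID INTEGER, Property4 TEXT, Property7 TEXT);
    INSERT INTO Table1Src VALUES (1,'a','b','c'), (2,'d','e','f'), (3,'g','h','i');
    INSERT INTO Table2Src VALUES (1,'p'), (2,'q'), (3,'r');
    INSERT INTO Table1Dest VALUES (1,'a',NULL,NULL), (2,'d',NULL,NULL), (4,'x',NULL,NULL);
    INSERT INTO Table2Dest VALUES (1,'b','c'), (2,'e','CHANGED'), (4,'y','z');
    INSERT INTO Table3Dest VALUES (1,'p',NULL), (2,'q',NULL), (4,'w',NULL);
    """)
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    missing, extra, changed = compare(load(conn, SRC_QUERY), load(conn, DEST_QUERY))
    print("missing in dest:", missing)
    print("only in dest:", extra)
    print("values differ:", changed)
```

With real data the two load calls would use separate connections, and for datasets that do not fit in memory the dictionaries could be replaced by per-ID lookups in each direction.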

+1




Create "views" on both schemas that translate into the same view of your business data. Export these views to flat files, and then you can use any plain file comparison utility to compare them and show the differences.
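The view-and-diff approach could look roughly like the sketch below. It again assumes the tables from the question's diagram and uses in-memory SQLite so it runs on its own; the view names (SrcBusinessView, DestBusinessView), the helper names, and the file labels schema_a.csv and schema_b.csv are made up for illustration. Python's difflib stands in for an external file comparison tool, and the CSV text is kept in memory rather than written to disk.

```python
import csv
import difflib
import io
import sqlite3

# Both views expose the same columns in the same order (assumed layout).
VIEWS = """
CREATE VIEW SrcBusinessView AS
SELECT t1.ID AS ID, t1.Property1 AS Property1, t1.Property2 AS Property2,
       t1.Property3 AS Property3, t2.Property4 AS Property4
FROM Table1Src t1 JOIN Table2Src t2 ON t2.ID = t1.ID;

CREATE VIEW DestBusinessView AS
SELECT t1.ID AS ID, t1.Property1 AS Property1, t2.Property2 AS Property2,
       t2.Property3 AS Property3, t3.Property4 AS Property4
FROM Table1Dest t1
JOIN Table2Dest t2 ON t2.ID = t1.ID
JOIN Table3Dest t3 ON t3.ID = t1.ID;
"""


def export_view(conn, view_name):
    """Dump a view as CSV lines, sorted by ID so the diff is stable."""
    cur = conn.execute(f"SELECT * FROM {view_name} ORDER BY ID")
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([col[0] for col in cur.description])  # header row
    writer.writerows(cur)
    return buf.getvalue().splitlines()


def diff_views(conn):
    """Return unified-diff lines between the two exported views."""
    return list(difflib.unified_diff(
        export_view(conn, "SrcBusinessView"),
        export_view(conn, "DestBusinessView"),
        fromfile="schema_a.csv", tofile="schema_b.csv", lineterm=""))


def build_view_demo_db():
    """Demo data only: one row migrated correctly, one with a changed value."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Table1Src (ID INTEGER PRIMARY KEY, Property1 TEXT, Property2 TEXT, Property3 TEXT);
    CREATE TABLE Table2Src (ID INTEGER, Property4 TEXT);
    CREATE TABLE Table1Dest (ID INTEGER PRIMARY KEY, Property1 TEXT, Property5 TEXT, Property6 TEXT);
    CREATE TABLE Table2Dest (ID INTEGER, Property2 TEXT, Property3 TEXT);
    CREATE TABLE Table3Dest (ID INTEGER, Property4 TEXT, Property7 TEXT);
    INSERT INTO Table1Src VALUES (1,'a','b','c'), (2,'d','e','f');
    INSERT INTO Table2Src VALUES (1,'p'), (2,'q');
    INSERT INTO Table1Dest VALUES (1,'a',NULL,NULL), (2,'d',NULL,NULL);
    INSERT INTO Table2Dest VALUES (1,'b','c'), (2,'e','CHANGED');
    INSERT INTO Table3Dest VALUES (1,'p',NULL), (2,'q',NULL);
    """)
    conn.executescript(VIEWS)
    return conn


if __name__ == "__main__":
    for line in diff_views(build_view_demo_db()):
        print(line)
```

Sorting by the key before exporting matters: without a stable order, a line-based diff would report differences that are only reorderings.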



+2




I have successfully used SQLAlchemy to migrate from one schema to another, which is a similar process to a comparison (as Martin v. Löwis points out), especially if you are using the .equals(other) method.

0








