Updating denormalized database tables

I am using Ruby on Rails 3.0.7 and MySQL 5. My application has two database tables, TABLE1 and TABLE2. For performance reasons I have denormalized some of the data, so TABLE2 holds duplicate copies of some TABLE1 values. Now I need to update some of these values in TABLE1, and of course I also have to update the denormalized copies in TABLE2 correctly.

How can I update these values efficiently? That is, if TABLE2 contains many rows (1,000,000 or more), what is the best way to keep both tables up to date (methods, practices, ...)?

What can happen while the database tables are being updated? For example, might a user run into problems when visiting pages of the website that display these denormalized values? If so, what problems, and how can I handle the situation?



2 answers

There are several ways to deal with this situation:

  • You can use a database trigger. This is not database-agnostic and, as far as I know, Rails has no built-in support for it. If your situation cannot tolerate any data inconsistency, this is probably the most efficient way to achieve your goal, but I am not a DB expert.
  • You can use a periodic job to synchronize the two tables. This approach lets the two tables drift apart and then resynchronizes the data at some interval. If your situation can tolerate this drift, it might be a good option, because the database can be updated while the application keeps running. If you need to sync as often as every 5 minutes, you probably want to explore other options. This can be handled in your Ruby code, but it requires some background mechanism (cron, delayed_job, redis, etc.).
  • You can use a callback inside your Rails model, for example "after_update :sync_denormalized_data". This callback is wrapped in a database-level transaction (assuming your database supports transactions). You get Rails-level code, consistent data, and no need for a background process, because both records are written on every update.
  • Some other mechanism I haven't thought of ...

These types of problems are very application specific. Even in the same application, you can use multiple methods depending on your flexibility and performance requirements.
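
For the periodic synchronization option, one way to keep a resync of a very large table manageable is to walk it in primary-key ranges, so that each statement touches a limited number of rows and finishes quickly. The following is only a minimal sketch under assumptions that are not in the question: the table and column names (table1.name, table2.table1_id, table2.table1_name) and the helper names id_batches and resync_sql are all hypothetical.

```ruby
# Hypothetical schema (an assumption, not taken from the question):
#   table1(id, name)                   - source of truth
#   table2(id, table1_id, table1_name) - holds the denormalized copy

# Split the inclusive id range min_id..max_id into consecutive ranges
# containing at most batch_size ids each.
def id_batches(min_id, max_id, batch_size)
  raise ArgumentError, "batch_size must be positive" unless batch_size > 0

  batches = []
  start = min_id
  while start <= max_id
    finish = [start + batch_size - 1, max_id].min
    batches << (start..finish)
    start = finish + 1
  end
  batches
end

# Build one MySQL multi-table UPDATE that refreshes the copied column for
# the table2 rows in a single id range. Rows whose copy already matches are
# skipped; <=> is MySQL's NULL-safe equality operator.
def resync_sql(range)
  first = Integer(range.first)
  last = Integer(range.last)
  "UPDATE table2 " \
  "INNER JOIN table1 ON table1.id = table2.table1_id " \
  "SET table2.table1_name = table1.name " \
  "WHERE table2.id BETWEEN #{first} AND #{last} " \
  "AND NOT (table2.table1_name <=> table1.name)"
end
```

Inside a Rails application, a cron task or delayed_job worker could then loop over id_batches(1, max_id, 10_000) and pass each resync_sql(range) to ActiveRecord::Base.connection.execute. The batch size of 10,000 is an arbitrary starting point. While the loop runs, pages can still show stale copies for batches that have not been processed yet, which is the drift this option accepts.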



Alternatively, you can maintain a normalized dataset alongside the two denormalized tables and sync them periodically. Another way is to keep a normalized table structure for maintaining the data (insert / update / delete) and create a materialized view for reporting, which gives you exactly what you are trying to achieve with the denormalized table. You can set the data refresh options of the materialized view according to your requirements.


