Are redundant columns in SQL useful if I expect more reads than writes?

Suppose I have a database that contains numeric ratings for stores, and my application needs to read each store's average rating:

Store table: id (pk), name (varchar), average_rating (float, redundant column)

Rating table: id (pk), rating_num (int), shop_id (int)

Obviously, recalculating the average rating from the rating table is best for consistency. However, based on the previous version of this app, I expect about 80% of requests to be reads of the average store rating. In other words, writing ratings will be much less common than reading store averages.

If I structured my database this way, I wouldn't need an additional join or query against the rating table. Are there any caveats?
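For illustration, the two read paths being compared might look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in database; the store name and the sample ratings are made up, and the column types are only an approximation of the schema described above.

```python
import sqlite3

# In-memory database; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store (
        id INTEGER PRIMARY KEY,
        name VARCHAR(100),
        average_rating FLOAT          -- the redundant column
    );
    CREATE TABLE rating (
        id INTEGER PRIMARY KEY,
        rating_num INTEGER,
        shop_id INTEGER REFERENCES store(id)
    );
    -- Hypothetical sample data: three ratings whose average is 4.0.
    INSERT INTO store (id, name, average_rating) VALUES (1, 'Corner Shop', 4.0);
    INSERT INTO rating (rating_num, shop_id) VALUES (3, 1), (4, 1), (5, 1);
""")

# Read path A: use the stored value, no join needed.
stored = conn.execute(
    "SELECT average_rating FROM store WHERE id = ?", (1,)
).fetchone()[0]

# Read path B: recompute from the rating table on every read.
computed = conn.execute(
    """
    SELECT AVG(r.rating_num)
    FROM store AS s
    JOIN rating AS r ON r.shop_id = s.id
    WHERE s.id = ?
    """,
    (1,),
).fetchone()[0]

print(stored, computed)
```

Path A touches a single row by primary key; path B has to read every rating for the store. The two agree only as long as something keeps average_rating in sync with the rating table.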

+3




2 answers


I would say this is perfectly fine and will probably save some read traffic, especially if you need to look up the store name anyway.

You will likely update the average rating within the transaction where you enter the new rating. An alternative is to create a trigger for this. Which one you prefer is more a matter of taste, since the work done will be about the same.



You will then need an index on (shop_id, rating_num) to make the calculation of the new average store rating efficient (assuming that rating_num is the actual rating score).
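A minimal sketch of the in-transaction variant, again using Python's sqlite3 as a stand-in database. The helper add_rating, the index name idx_rating_shop_num, and the sample data are hypothetical; a trigger on the rating table could run the same UPDATE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store (
        id INTEGER PRIMARY KEY,
        name VARCHAR(100),
        average_rating FLOAT
    );
    CREATE TABLE rating (
        id INTEGER PRIMARY KEY,
        rating_num INTEGER,
        shop_id INTEGER REFERENCES store(id)
    );
    -- Index suggested in the answer: lets the AVG for one store
    -- be computed from the index alone.
    CREATE INDEX idx_rating_shop_num ON rating (shop_id, rating_num);
    INSERT INTO store (id, name) VALUES (1, 'Corner Shop');
""")


def add_rating(conn, shop_id, rating_num):
    """Insert a rating and refresh the store's average in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO rating (rating_num, shop_id) VALUES (?, ?)",
            (rating_num, shop_id),
        )
        conn.execute(
            """
            UPDATE store
            SET average_rating = (
                SELECT AVG(rating_num) FROM rating WHERE shop_id = ?
            )
            WHERE id = ?
            """,
            (shop_id, shop_id),
        )


for score in (3, 4, 5):
    add_rating(conn, 1, score)

average = conn.execute(
    "SELECT average_rating FROM store WHERE id = 1"
).fetchone()[0]
print(average)
```

Because the insert and the update share one transaction, readers never see a new rating without the matching average. The cost is one extra aggregate per write, which is acceptable when writes are rare compared with reads.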

+1




The best way to handle this scenario in SQL Server is an indexed view. Oracle and PostgreSQL offer a comparable feature called materialized views; MySQL has no built-in equivalent.



An indexed view can maintain automatically updated average ratings without you having to duplicate them in a table. Of course, the data is still duplicated in the view. The difference is that you only need to tell SQL Server what the data looks like; you don't have to keep it up to date yourself.

+1

