Multiple db connections versus centralized / redundant db

I have a project to create a dashboard that will connect to existing systems as well as create new functionality based on combining data from existing systems. For example, the dashboard will be able to generate "orders" containing data combined from "members" (MS Access DB), "employees" (MySQL DB) and "products" (flat file), and there will also be new attributes on "orders ".

At first, I thought it would be most efficient to have my application connect to each of the systems separately and cross-vendor connections between different databases. But then I thought that creating a centralized / redundant db (built with scripts pushing and pulling data between systems) might also be useful, because it would allow some semi-techies to use products like OOBase that can only create one connection.

Are there any other benefits to creating a centralized / redundant database like the one I'm talking about? Or are multiple direct connections the best approach?

Thanks in advance for any advice.

+2


source to share


2 answers


To give you a short answer: yes, you want to have a central data store.

You don't want to run complex reports on your database. As your live database grows, you'll want to tidy up and clean it up, but save the data for analysis.

You also want the data to be aggregated so that you can perform historical analysis.

Data coming from different sources will require some cleaning. And you probably need to know how to tie your data together, and there are many questions you need to know to get the job done right.



You might consider reading on data warehouse ( wikipedia ) and business intelligence ( wikipedia ).

If you want to add "new features" to this system, you can also find orchestration ( wikipedia . This will allow you to tie your heterogeneous business processes together.

These are all fairly specialized and complex disciplines in themselves, so you can ask a specialist to consult with you.

+2


source


Be very careful to copy a lot of data. If you do, here are some important guidelines:



  • Make sure one system is designated as the master and no other system can tamper with the data.

  • Always copy data from master to slaves.

  • When copying data, use a checksum to ensure that all data has been copied. Make sure you can handle "yesterday, copy failed".

  • If the slave needs to make a change, push the change to the master and then use the standard "update" path to merge it back into the slave. Avoid "saving changes to the slave and updating the master in the future."

+1


source







All Articles