RDS Read Replica Considerations
We hired an intern and we want to let him play with our data to create useful reports. Currently, we just took a snapshot of the database and created a new RDS instance that we gave it access to. But this was deprecated almost immediately due to changes in the production database.
We would like to live (or live close) a mirror of our actual database that we can give it access to without worrying about it modifying any real data or accidentally knocking down our production base (like a silly query like
SELECT (*) FROM ourbigtable
or really slow connection).
Will there be a suitable read replica for this purpose? It looks like it will at least stay up to date, but I don't understand what happens if a read replica goes down or data is accidentally changed to it, or some other potential liability.
The only thing I could find related to this was this SO question and it bothers me a little (emphasis mine):
If you are trying to pre-compute a large amount of data and otherwise modify that on a read replica you need to be very careful you are not modifying the data - if the read is no longer compatible, then you are in trouble :)
TL; DR Don't do this unless you really know what you are doing and you understand the full implications.
And straight forward, MySQL replication can be dodgy in my experience, so even knowing what needs to happen and what happens if there is a master trying to write updated data to a slave, you're also updated .... who knows.
Is there any risk to the production database if we allow the trainee to have an unwritten read replica on it?
source to share
We have continued to read replicas of our production bases for a couple of years now without any significant issues. All our sales, marketing, etc. People who need the ability to run queries are given access to a replica. It worked well enough and was stable for the most part. The production databases are locked so only our applications can connect to them and the read replicas are only accessible via SSL from our office. Configuring security is very important as you would create all user accounts in the master database and then replicate them to read-replica.
I think we've seen a read-replica go into a bad state sometime due to a hardware problem. The great thing about read-replicas is that you can just finish one and create a new one anytime you want / need. As long as the new replica has the same instance name as the old one, its DNS, etc. Will remain unchanged, so if it is not brief and unavailable, everything should be transparent enough for end users. Once or twice we also just rebooted the stuck read-replica and it was able to catch up too.
It is not possible to update data on read-replica in any way other than processing commands sent from the main database. RDS just won't let you run something like insert, update, etc. On read-replica no matter what permissions the user has. Therefore, you do not need to worry about changing data to read-replica, which causes synchronization with the master to fail.
Sometimes the replica can lag slightly behind the production database if someone is submitting a long query, but it usually recovers quickly after the query completes. In all of our production environments, we have multiple monitors to monitor replication as well as validate long running requests. We use the pmp-check-mysql-replication-delay command in the Percona Toolkit for MySQLmonitor replication. It runs every few minutes through Nagios. We also have a custom script that runs through cron that checks for long running requests. It basically parses the output of the "SHOW FULL PROCESSLIST" command and sends an email if the request runs for a long period of time along with the username of the user executing it and a command to kill the request if we decide what we need.
With these checks, we had very little read-replica issue.
source to share