AWS - EC2 - MongoDB Replica Time Synchronization Issue - NTP - Replication Lag

We are seeing what looks like clock drift in our MongoDB replica set running on AWS. It seems to have started recently, after we added more data to the dataset; before that we only noticed the problem when the system was under heavy load. The following error is now logged sporadically in mongod.log even when the system is not under load.

To test this, we isolated a set of machines with the same dataset and kept our web application off them, but the error still occurs:

2014-12-12T13:33:51.333+0000 [rsBackgroundSync] changing sync target because current sync target's most recent OpTime is Dec 12 13:32:42:c which is more than 30 seconds behind member mongo1:27017 whose most recent OpTime is 1418391230

From the timestamps above, one member of the MongoDB replica set is about a minute behind. The worst we have seen is 12 minutes out of sync.
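As a rough check on the numbers in that log line, the gap between the two OpTimes can be computed directly. This is only a sketch: it assumes that `1418391230` is a Unix timestamp in seconds, and that the sync target's OpTime of Dec 12 13:32:42 is UTC in 2014, matching the log line's own timestamp. The helper name `optime_gap_seconds` is made up for illustration.

```python
from datetime import datetime, timezone

def optime_gap_seconds(target_optime: datetime, member_epoch: int) -> int:
    """Seconds by which the sync target trails the other member."""
    member_optime = datetime.fromtimestamp(member_epoch, tz=timezone.utc)
    return int((member_optime - target_optime).total_seconds())

# Values taken from the log line above (the year and UTC are assumptions).
target = datetime(2014, 12, 12, 13, 32, 42, tzinfo=timezone.utc)
gap = optime_gap_seconds(target, 1418391230)
print(gap)  # 68
```

Under those assumptions the sync target trails mongo1 by a little over a minute, which is consistent with the "more than 30 seconds behind" wording in the message.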

This in turn causes replication lag, and we are alerted to it by the Mongo Monitoring Service, although it resolves itself.
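Since the alert is about replication lag, it may help to measure the lag directly rather than infer it from the log. The sketch below computes each secondary's lag from a document shaped like the output of the `replSetGetStatus` command, using its `members[].name`, `members[].stateStr` and `members[].optimeDate` fields. The sample document, the member names other than mongo1:27017, and the helper name `secondary_lag_seconds` are illustrative assumptions; on a live system the document would come from something like pymongo's `client.admin.command("replSetGetStatus")`.

```python
from datetime import datetime

def secondary_lag_seconds(status: dict) -> dict:
    """Map each SECONDARY's name to its lag behind the PRIMARY, in seconds."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }

# Hypothetical status document; roles and values are made up for illustration.
sample_status = {
    "members": [
        {"name": "mongo1:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2014, 12, 12, 13, 33, 50)},
        {"name": "mongo2:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2014, 12, 12, 13, 32, 42)},
        {"name": "mongo3:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2014, 12, 12, 13, 33, 50)},
    ]
}
lags = secondary_lag_seconds(sample_status)
print(lags)
```

Because every `optimeDate` in the document is reported by the same command at the same moment, this comparison does not depend on the wall clocks of the individual machines agreeing with each other at the time of the check.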

The setup is 3 x r3.xlarge AWS Linux instances, 1 in each Availability Zone, EU-West-1A. The machines were configured with Mongo's recommended settings, using the RAID array and the CloudFormation scripts provided by Mongo. The data is about 4 GB.

We believe the issue is due to NTP synchronization. By default, on the Amazon Linux AMI, the ntpd service is configured to use a pool of Amazon NTP servers hosted on www.pool.ntp.org.

To try to rule this out, we set up our own NTP server on AWS for the MongoDB servers to sync with. The problem still happened, so we changed the maxpoll and minpoll settings of the ntpd service on the mongo machines so that they sync with the NTP server every 16 seconds, but the error still occurs.
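For reference, ntpd takes `minpoll` and `maxpoll` as power-of-two exponents in seconds rather than as raw seconds, so a 16-second interval is written as 4. The sketch below does that conversion and builds a matching `server` line; the helper names `poll_exponent` and `server_line` are made up for illustration, and the resulting line is an assumption about what the changed config might look like, since the post does not show it.

```python
def poll_exponent(interval_seconds: int) -> int:
    """Return n such that 2**n == interval_seconds, as used by minpoll/maxpoll."""
    if interval_seconds < 1 or interval_seconds & (interval_seconds - 1):
        raise ValueError("ntpd poll intervals must be a power of two seconds")
    return interval_seconds.bit_length() - 1

def server_line(host: str, interval_seconds: int) -> str:
    """Build an ntp.conf server line that pins polling to one interval."""
    n = poll_exponent(interval_seconds)
    return f"server {host} iburst minpoll {n} maxpoll {n}"

print(server_line("time-server.domain.com", 16))
# server time-server.domain.com iburst minpoll 4 maxpoll 4
print(poll_exponent(1024))  # 10
```

The second call is there because the `ntpq` output later in the post reports `poll` as 1024 and `tc=10`, which corresponds to exponent 10. That may mean the snapshot was taken without the 16-second setting in effect, although the post does not say either way.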

We also increased the size of the MongoDB oplog to see whether that made a difference, but it did not.

Has anyone else run into this type of problem? Is there something we are missing?

Regards,

Colin.

ps -ef | grep ntp;

mongodb1
ntp       5163     1  0 Dec11 ?        00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
ec2-user 15865 15839  0 09:31 pts/2    00:00:00 grep ntp

mongodb2
ntp       4834     1  0 Dec11 ?        00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
ec2-user 19056 19029  0 09:31 pts/0    00:00:00 grep ntp

mongodb3
ntp       5795     1  0 Dec11 ?        00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
ec2-user 26199 26173  0 09:31 pts/0    00:00:00 grep ntp

      

cat /etc/ntp.conf;

# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).

driftfile /var/lib/ntp/drift

# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery

# Permit all access over the loopback interface.  This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1
restrict -6 ::1

# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.amazon.pool.ntp.org iburst dynamic
#server 1.amazon.pool.ntp.org iburst dynamic
#server 2.amazon.pool.ntp.org iburst dynamic
#server 3.amazon.pool.ntp.org iburst dynamic
server time-server.domain.com iburst

#broadcast 192.168.1.255 autokey        # broadcast server
#broadcastclient                        # broadcast client
#broadcast 224.0.1.1 autokey            # multicast server
#multicastclient 224.0.1.1              # multicast client
#manycastserver 239.255.254.254         # manycast server
#manycastclient 239.255.254.254 autokey # manycast client

# Enable public key cryptography.
#crypto

includefile /etc/ntp/crypto/pw

# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
keys /etc/ntp/keys

# Specify the key identifiers which are trusted.
#trustedkey 4 8 42

# Specify the key identifier to use with the ntpdc utility.
#requestkey 8

# Specify the key identifier to use with the ntpq utility.
#controlkey 8

# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats

# Enable additional logging.
logconfig =clockall =peerall =sysall =syncall

# Listen only on the primary network interface.
interface listen eth0
interface ignore ipv6

      

ntpq -npcrv;

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*172.31.14.137   91.*.*.*      3 u  557 1024  377    1.121   -0.264   0.161
associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
version="ntpd 4.2.6p5@1.2349-o Sat Mar 23 00:37:31 UTC 2013 (1)",
processor="x86_64", system="Linux/3.14.23-22.44.amzn1.x86_64", leap=00,
stratum=4, precision=-23, rootdelay=23.597, rootdisp=109.962,
refid=172.31.14.137,
reftime=d83a757a.175b5fa1  Tue, Dec 16 2014  9:10:18.091,
clock=d83a77a7.82431efa  Tue, Dec 16 2014  9:19:35.508, peer=27361,
tc=10, mintc=3, offset=-0.264, frequency=-13.994, sys_jitter=0.000,
clk_jitter=0.358, clk_wander=0.053
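To read the peer row above programmatically, a small parser along these lines could be used. It is a sketch that assumes the row starts with a tally character, as it does here, and the function name `parse_peer_line` is made up. In `ntpq -p` output the `delay`, `offset` and `jitter` columns are in milliseconds, and `reach` is an octal bitmask of the last eight polls.

```python
def parse_peer_line(line: str) -> dict:
    """Split one `ntpq -p` peer row into named fields."""
    columns = ["remote", "refid", "st", "t", "when", "poll",
               "reach", "delay", "offset", "jitter"]
    fields = dict(zip(columns, line.split()))
    # Assumption: the first character of the remote column is the tally code
    # ('*' marks the peer currently selected for synchronization).
    tally, remote = fields["remote"][0], fields["remote"][1:]
    return {
        "selected": tally == "*",
        "remote": remote,
        "poll_s": int(fields["poll"]),
        "reach": int(fields["reach"], 8),  # reach is printed in octal
        "offset_ms": float(fields["offset"]),
    }

peer = parse_peer_line(
    "*172.31.14.137   91.*.*.*      3 u  557 1024  377    1.121   -0.264   0.161"
)
print(peer["offset_ms"], peer["poll_s"], peer["reach"])
```

For this snapshot the offset is -0.264 ms and `reach` is 377 octal (all of the last eight polls answered). That is far smaller than the roughly one-minute gap reported in mongod.log, which may suggest the system clock was not far off when the snapshot was taken, though a single reading cannot rule out drift at other times.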

      



1 answer


After migrating to MongoDB 3 using the WiredTiger storage engine, we no longer see this issue.


