Verify MongoDB Replica Set High Availability with the Step Down Button

MongoDB replica sets are engineered for high availability. They consist of replicated data nodes and are resilient to network interruptions. We have a new tool that helps you verify that your application is equally ready for high availability.

Configuring your Application for High Availability

MongoDB drivers use auto-discovery when connecting to hosts. To begin auto-discovery, a driver needs seed hosts with which to attempt a connection. These seed hosts are a list of MongoDB instances in the replica set, each specified as ‘host:port’. A proper driver configuration lists every replica set member explicitly. Different drivers require different configurations, so please see your driver’s documentation on replica sets. For a list of drivers, please see http://docs.mongodb.org/ecosystem/drivers/ and http://docs.mongodb.org/ecosystem/drivers/community-supported-drivers/.

On the MongoHQ database admin screen, the “Replica Set URI” contains the full list of seed hosts.
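To make the ‘host:port’ seed list concrete, here is a minimal sketch that pulls the seed hosts out of a replica set URI using only the Python standard library. The URI, hostnames, credentials, ports, and the `parse_seed_hosts` helper are all hypothetical placeholders, not values from the admin screen, and the parser deliberately ignores edge cases such as IPv6 literals. In practice you would normally hand the full URI to your driver and let it do this work.

```python
def parse_seed_hosts(uri, default_port=27017):
    """Return the (host, port) seed list from a mongodb:// URI.

    Illustrative helper only; real drivers parse the URI themselves.
    """
    prefix = "mongodb://"
    if not uri.startswith(prefix):
        raise ValueError("expected a mongodb:// URI")
    rest = uri[len(prefix):]
    # Drop the database name and options that follow the host list.
    hosts_part = rest.split("/", 1)[0].split("?", 1)[0]
    # Drop credentials, if any (everything up to the last '@').
    hosts_part = hosts_part.rsplit("@", 1)[-1]
    seeds = []
    for entry in hosts_part.split(","):
        host, _, port = entry.partition(":")
        seeds.append((host, int(port) if port else default_port))
    return seeds


# Hypothetical URI: every replica set member is listed explicitly.
uri = (
    "mongodb://user:secret@hosta.example.com:10001,"
    "hostb.example.com:10002/app_db?replicaSet=set-0"
)
seeds = parse_seed_hosts(uri)
for host, port in seeds:
    print(f"seed host: {host}:{port}")
```

Listing every member matters because a driver can only begin auto-discovery if at least one of its seed hosts is reachable.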

MongoDB Drivers' Role in High Availability

A ‘step down’ is an event that transitions a replica set’s Primary role from one MongoDB instance to another. For instance, if HostA is your Primary before a stepdown, then HostB will be the Primary afterwards.

When a driver adjusts to this stepdown, the driver will go through the process of:

Time to Test

Once you have connected your application to your database (and hopefully before production), we have a new button that will assist with verifying your application for high availability. This is the “Step down” button.

You will find the button on your database’s “Admin” page. Clicking this button sends a command to the primary member of your replica set, instructing it to step down. The other members of your replica set then elect a new primary. Typically, a step down takes a very short time, usually between 2 and 5 seconds; in extreme conditions, we have seen a replica set take up to 60 seconds to recover.

Testing your app before production is an important step in confirming that your application can handle server failures and replica set step downs. The steps for a test are:

  1. Send an application server request (web request)
  2. Press the "Step down" button
  3. Send another application server request; your application should still work after the stepdown
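The three steps above can be sketched as a small harness. This is only an illustration under stated assumptions: `run_stepdown_test`, `send_request`, and `trigger_stepdown` are hypothetical names, the button press itself stays manual, and the 60-second default simply mirrors the worst-case recovery time mentioned earlier.

```python
import time


def run_stepdown_test(send_request, trigger_stepdown, timeout=60.0, interval=1.0):
    """Run the three test steps.

    Returns the seconds the application took to recover after the
    stepdown, or None if it did not recover within `timeout` seconds.
    """
    send_request()      # Step 1: the application must work before the stepdown.
    trigger_stepdown()  # Step 2: e.g. wait for a person to press the button.
    start = time.monotonic()
    while True:         # Step 3: keep asking until the application answers.
        try:
            send_request()
            return time.monotonic() - start
        except Exception:
            if time.monotonic() - start >= timeout:
                return None
            time.sleep(interval)


# Example wiring (commented out because it needs a running application).
# The health-check URL is a placeholder for one of your own endpoints.
#
#   import urllib.request
#   def ping():
#       urllib.request.urlopen("http://localhost:8000/health", timeout=5).read()
#   def wait_for_button():
#       input("Press the Step down button, then hit Enter: ")
#   print(run_stepdown_test(ping, wait_for_button))
```

A result of `None` suggests the application never reconnected on its own, which is the case worth investigating in your logs.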

If your application responds to step 3, then you are ready to go. Because different drivers behave slightly differently during failover, give your application a few more attempts to recover. Should your application not recover, check your logs and determine whether any errors can be rescued and connections reattempted. Check your driver’s bug tracker to determine whether any known bugs exist for your driver version.
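One way to rescue errors and reattempt connections is a small retry wrapper around database calls. The sketch below is an assumption-laden illustration, not a prescribed fix: `retry_on` is a hypothetical helper, and `ConnectionError` stands in for whatever error your driver raises during failover (with PyMongo, that would typically be `pymongo.errors.AutoReconnect`).

```python
import functools
import time


def retry_on(exceptions, attempts=5, delay=1.0):
    """Decorator: retry the wrapped call when one of `exceptions` is raised."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # Out of attempts: let the caller see the error.
                    time.sleep(delay)
        return wrapper
    return decorator


# Simulated database call that fails twice, as it might during an election.
attempts_seen = []


@retry_on(ConnectionError, attempts=3, delay=0.0)
def flaky_query():
    attempts_seen.append(1)
    if len(attempts_seen) < 3:
        raise ConnectionError("not primary")  # placeholder failover error
    return "ok"


result = flaky_query()
```

Retrying is safest for reads and idempotent writes; a write that may already have been applied before the error needs more care than a blind retry.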

*Already running a production database?* If you are running a production database, just create another replica set for testing (we only charge by the day, so you will only be billed for the time used). Load some staging data, and run through the steps in a staging environment.

Need high availability MongoDB? If you haven't yet graduated to high-availability MongoDB, replica sets are one of the nicest features in MongoDB. Check out our Replica Set SSD MongoDB Databases for high availability and maximum performance.