Verify MongoDB Replica Set High Availability with the Step Down Button

MongoDB replica sets are engineered for high availability. A replica set consists of replicated data nodes and is resilient to network interruptions. We have a new tool that helps you verify that your application is equally ready for high availability.

Configuring your Application for High Availability

MongoDB drivers use auto-discovery when connecting to hosts. To begin auto-discovery, a driver requires seed hosts to attempt a connection. These seed hosts are a list of MongoDB instances in the replica set, each specified as ‘host:port’. A proper driver configuration includes every replica set member explicitly. Different drivers require different configurations; please see your driver’s documentation on replica set configuration. For a list of drivers, please see http://docs.mongodb.org/ecosystem/drivers/ and http://docs.mongodb.org/ecosystem/drivers/community-supported-drivers/.
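One common way to hand a driver its seed hosts is a connection URI that lists every member. The sketch below shows the general shape of such a URI using only the Python standard library. The helper name `build_replica_set_uri`, the host names, the database name, and the set name are all hypothetical and used for illustration only; your driver may expect the seed list in a different form.

```python
from urllib.parse import quote_plus


def build_replica_set_uri(hosts, database, replica_set, username=None, password=None):
    """Build a mongodb:// URI that lists every replica set member as a seed host."""
    if not hosts:
        raise ValueError("at least one seed host is required")
    # Seed hosts are comma-separated 'host:port' pairs.
    seed_list = ",".join(f"{host}:{port}" for host, port in hosts)
    credentials = ""
    if username is not None and password is not None:
        # Percent-encode credentials so characters like '@' or ':' do not break the URI.
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    return f"mongodb://{credentials}{seed_list}/{database}?replicaSet={replica_set}"


# Hypothetical hosts, database, and set name, for illustration only.
uri = build_replica_set_uri(
    hosts=[("hosta.example.com", 27017), ("hostb.example.com", 27017)],
    database="app_staging",
    replica_set="set-example",
)
print(uri)
```

Listing both members means the driver can still start auto-discovery if one of them is unreachable at connection time.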

On the MongoHQ database admin screen, the “Replica Set URI” contains the full list of seed hosts.

MongoDB Drivers' Role in High Availability

A ‘step down’ is an event that transitions a replica set’s Primary role from one MongoDB server to another. For instance, if HostA is your Primary before a stepdown, then HostB will be the Primary afterwards.

When a driver adjusts to this stepdown, the driver will go through the process of:

Time to Test

Once you have connected your application to your database (and hopefully before production), we have a new button that will assist with verifying your application for high availability. This is the “Step down” button.

You will find the button on your database’s “Admin” page. Clicking this button sends a command to the primary member of your replica set, instructing it to step down. The other members of your replica set then elect a new primary member. Typically, a step down takes a very short time, usually between 2 and 5 seconds; in extreme conditions, we have seen a replica set take up to 60 seconds to recover.
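To see where your own application falls within that 2 to 60 second range, you could time how long it takes before a request succeeds again after you press the button. The following is a minimal sketch using only the Python standard library. `wait_for_recovery` and `fake_probe` are hypothetical names; in a real test the probe would be a small read or write through your driver, and you would likely narrow the caught exception to your driver's connection errors.

```python
import time


def wait_for_recovery(probe, timeout=60.0, interval=0.5):
    """Call probe() until it stops raising; return seconds elapsed, or None on timeout."""
    start = time.monotonic()
    while True:
        try:
            probe()
            return time.monotonic() - start
        except Exception:  # assumption: a real probe would catch driver-specific errors
            if time.monotonic() - start >= timeout:
                return None
            time.sleep(interval)


# Stand-in probe: fails three times, then succeeds, mimicking an election in progress.
attempts = {"count": 0}


def fake_probe():
    attempts["count"] += 1
    if attempts["count"] <= 3:
        raise ConnectionError("no primary available")


elapsed = wait_for_recovery(fake_probe, timeout=60.0, interval=0.01)
print(f"recovered after {attempts['count']} probes")
```

The default timeout of 60 seconds mirrors the longest recovery mentioned above; a `None` result means the probe never succeeded within that window.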

Testing your app before production is an important step in confirming that your application can handle server failures and replica set step downs. The steps for a test are:

  1. Send an application server request (web request)
  2. Press the "Stepdown" button
  3. Send another application server request; your application should still work after the stepdown

If your application responds to step 3, then you are ready to go. If it does not respond right away, give it a few more attempts to recover, because different drivers behave slightly differently in failover. Should your application still not recover, check your logs and determine whether any errors can be rescued and connections reattempted. Check your driver’s bug tracker to determine whether any known bugs exist for your driver version.
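Rescuing an error and reattempting the operation can be sketched as a small retry helper. This is an illustration under assumptions, not any driver's built-in behavior: `with_retries` and `flaky_query` are hypothetical names, and the built-in `ConnectionError` stands in for whatever your driver raises during failover (with PyMongo, for example, that would likely be `pymongo.errors.AutoReconnect`).

```python
import time


def with_retries(operation, retry_on=(ConnectionError,), attempts=5, base_delay=0.5):
    """Run operation(); on a retryable error, wait and try again up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff: base_delay, then 2x, 4x, ... between attempts.
            time.sleep(base_delay * 2 ** (attempt - 1))


# Stand-in for a database call that fails twice while a new primary is elected.
calls = []


def flaky_query():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("primary stepped down")
    return "ok"


result = with_retries(flaky_query, base_delay=0.01)
print(result, len(calls))
```

A wrapper like this suits reads and idempotent writes best; retrying a write that may already have been applied can apply it twice, so decide per operation whether a retry is safe.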

Already running a production database? If you are running a production database, just create another replica set for testing (we only charge by the day, so you will only be billed for the time you use). Load some staging data, and run through the steps in a staging environment.

Need high availability MongoDB? If you haven't yet graduated to high-availability MongoDB, replica sets are one of the nicest features in MongoDB. Check out our Replica Set SSD MongoDB Databases for high availability and maximum performance.
