Achieving High Availability & the MongoDB PHP Driver

This is a guest blog by Jake Olefsky, founder of Toodledo.com.

At Toodledo.com, we make tools to help people stay organized and productive. Recently, we released "Outlines", a tool for building hierarchical outlines that help people organize their projects. Our main website runs LAMP on dedicated servers. For this new project, we decided to build Outlines entirely in the cloud. We needed a highly available service without having to maintain our own servers.

MongoDB is, at its core, a high-availability database. To keep the business available, however, the application layer must also respond properly when the database changes state. The MongoDB drivers are the bridge between database availability and application availability. Our language of choice is PHP, so our driver is the 10gen PHP MongoDB driver. We will lay out some of the challenges we faced with the early PHP drivers (the 1.2 and 1.3 branches) and how we overcame them. Luckily, the 1.4 release has solved many of these issues.

When we started the project in late 2012, the PHP MongoDB drivers were fairly immature. During failure testing, the database layer recovered quickly, but the PHP drivers did not. The driver could take as long as 30 minutes to recognize a state change in the MongoDB replica set. This was not acceptable. We designed a wrapper class for the drivers, which reduced the transition time to a few seconds. The key concepts of MongoDB availability and driver integration are:

Stepdown, the essence of high availability

When the nodes of a MongoDB replica set lose connectivity with each other, a re-vote is triggered. As a result, the primary host may release its role and become a secondary, while another host becomes primary. This event blocks operations for 5 to 10 seconds during the re-vote and role change. These events happen at the database level. The application layer must handle them gracefully.
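One way an application can ride out those few seconds is to poll for a usable primary a bounded number of times before giving up. The sketch below is an illustration under stated assumptions, not the wrapper class described in this post: waitForPrimary() and its parameters are hypothetical names, and the probe is any callable you supply (it could, for example, run the isMaster command and inspect the result). The defaults of 20 attempts at half-second intervals are chosen only to cover roughly the 5 to 10 second window mentioned above.

```php
<?php
/**
 * Poll until a primary is reported as available, or give up.
 *
 * $primaryIsAvailable is a hypothetical probe supplied by the caller;
 * it must return true once writes can be accepted again.
 */
function waitForPrimary(callable $primaryIsAvailable, $maxAttempts = 20, $delayMicroseconds = 500000)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        if ($primaryIsAvailable()) {
            return true;
        }
        // Sleep between attempts, but not after the final one.
        if ($attempt < $maxAttempts) {
            usleep($delayMicroseconds);
        }
    }
    return false; // still no primary: let the caller report the outage
}
```

Bounding the number of attempts keeps a web request from hanging indefinitely if the replica set does not recover.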

Reconnecting to MongoDB

A stepdown requires the MongoDB driver to reconnect. When the connection parameters contain a named replica set and a database failover occurs, the stock drivers take too long to detect and resolve the new state of the replica set. To work around this caching issue, immediately attempt a new connection to the database with the replica set parameter removed.

try {
    // First attempt: connect with the replica set named
    $client = new MongoClient("mongodb://user:pass@host1,host2/db", array("replicaSet" => "myReplicaSet"));
} catch (MongoException $e) {
    // Fallback: connect again without the replicaSet option
    $client = new MongoClient("mongodb://user:pass@host1,host2/db");
}

Should this reconnection fail, you can cycle through the individual hosts, attempting a connection to each until one succeeds. In testing, we always managed to reconnect using this simple procedure.
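The host-by-host fallback can be sketched as follows. This is a minimal illustration rather than the code from our wrapper: connectToFirstAvailableHost() is a hypothetical helper, and the $connect factory is an assumption that keeps the sketch independent of any particular driver version. It catches the base Exception class, which the driver's MongoException extends.

```php
<?php
/**
 * Try each host in turn and return the first connection that succeeds.
 *
 * $connect is a hypothetical factory supplied by the caller, for example:
 *   function ($host) { return new MongoClient("mongodb://user:pass@$host/db"); }
 */
function connectToFirstAvailableHost(array $hosts, callable $connect)
{
    $lastError = null;
    foreach ($hosts as $host) {
        try {
            return $connect($host);
        } catch (Exception $e) {
            // Remember the failure and move on to the next host.
            $lastError = $e;
        }
    }
    // Every host failed: surface the last underlying error to the caller.
    throw new RuntimeException("No MongoDB host accepted a connection", 0, $lastError);
}
```

Keep in mind that the first host to answer may be a secondary, so the connection returned here is not guaranteed to accept writes.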

Achieving write certainty

The good news is that we can recover from a failed connection. However, after removing the replica set information, we must rediscover which host is the primary and which is the secondary. If we connect to a secondary, reads will work but writes will fail. To handle this gracefully, the application must catch write errors, reconnect to the proper host, and retry the write if possible.

try {
    $result = $collection->insert($object, $flags);
} catch (MongoException $e) {
    // Only these error codes are retried after a reconnect
    if ($e->getCode() == 10058 || $e->getCode() == 16) {
        // Reconnect to the proper host, then retry the write once
        if ($this->reconnect()) {
            $result = $collection->insert($object, $flags);
        }
    }
}
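The same pattern can be factored into a reusable helper so that every write goes through one retry path. The following is a sketch under assumptions, not the published wrapper: writeWithRetry() is a hypothetical name, the write and reconnect steps are passed in as callables, and the default retry codes are simply the two used in the snippet above. Unlike that snippet, this version rethrows errors it does not retry instead of swallowing them.

```php
<?php
/**
 * Run a write and retry it once after a successful reconnect,
 * but only for the given error codes.
 */
function writeWithRetry(callable $write, callable $reconnect, array $retryCodes = array(10058, 16))
{
    try {
        return $write();
    } catch (Exception $e) {
        if (!in_array($e->getCode(), $retryCodes, true)) {
            throw $e; // unrelated failure: let the caller decide
        }
        if (!$reconnect()) {
            throw $e; // no usable host was found
        }
        return $write(); // a second failure propagates to the caller
    }
}
```

A call might look like writeWithRetry(function () use ($collection, $object, $flags) { return $collection->insert($object, $flags); }, array($this, 'reconnect')). Retrying is safest when the write is idempotent, since the first attempt may have reached the server before the error was reported.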

Driver versions

Version 1.2 of the Mongo PHP driver is terrible. Don’t use it. The main problem we encountered was cached connections in the connection pool. The pool reused connections even when they were invalid. The database was healthy, but the driver’s reads and writes were being sent to invalid connections in the pool, and the application ground to a halt. This bug was fixed in version 1.3.0.

Version 1.3 of the driver is OK. It still has trouble recognizing when a replica set changes configuration, but if you use the techniques above, you will be fine. However, one annoying bug causes writes to fail occasionally for no apparent reason. This bug was fixed in version 1.3.7.

Version 1.4 of the driver is good. The workarounds needed for the 1.3 branch are no longer necessary. However, it is still a good idea to wrap the driver so that write errors are caught and resolved in a way that suits the application.

If you use a PaaS provider, make sure it supports driver version 1.3.7 or 1.4.x; otherwise, switch providers.

Our Code

We have made our MongoDB driver wrapper class available on GitHub for anyone to use and improve upon.

The Moral of the Story

Upgrade to PHP Mongo 1.4.x, and use a wrapper class to get easier, pain-free high availability in PHP.