Meteor, the Oplog & Elastic Deployments

In the previous two posts in this series we looked at what the oplog is and how you can access it elegantly. To get a more complete idea of what is possible with the oplog, we are now going to look at the Meteor framework and how its developers have used the oplog to address their scaling and performance issues.

Meteor, if you haven't come across it, is a framework and platform for developing responsive, collaborative applications in JavaScript. Built on Node.js and pulling together many well-regarded components to create its stack, Meteor has been showing developers a compelling route to interactive web applications – a screencast gives a quick introduction to the platform. At the heart of Meteor are algorithms to quickly and efficiently get changes from one client screen to the server and on to the other clients' screens. Applications in Meteor publish the results of queries to the clients and then continue to send updates to those clients when the results of these queries change. Behind the scenes, the Meteor server keeps track of these query result changes by rerunning the queries against a MongoDB database and using a "poll-and-diff" algorithm to calculate the changes in results. With a single Meteor server and MongoDB database, the response time on this is fast.
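To make the "poll-and-diff" idea concrete, here is a minimal sketch of the diff step in plain JavaScript. It is not Meteor's implementation: the function name diffResults and the sample documents are assumptions made for illustration, and the comparison is deliberately naive.

```javascript
// Minimal poll-and-diff sketch. `diffResults` is a hypothetical name,
// not part of Meteor's API. Each result set is an array of documents
// that carry a unique `_id`.
function diffResults(previous, current) {
  const before = new Map(previous.map((doc) => [String(doc._id), doc]));
  const after = new Map(current.map((doc) => [String(doc._id), doc]));
  const changes = { added: [], changed: [], removed: [] };

  for (const [id, doc] of after) {
    if (!before.has(id)) {
      changes.added.push(doc);
    } else if (JSON.stringify(before.get(id)) !== JSON.stringify(doc)) {
      // Naive comparison that assumes a stable key order.
      changes.changed.push(doc);
    }
  }
  for (const [id, doc] of before) {
    if (!after.has(id)) changes.removed.push(doc);
  }
  return changes;
}

// One poll cycle: rerun the query, then diff against the last result.
const lastResult = [{ _id: 1, score: 10 }, { _id: 2, score: 5 }];
const freshResult = [{ _id: 1, score: 12 }, { _id: 3, score: 7 }];
const changes = diffResults(lastResult, freshResult);
console.log(changes);
```

Every poll has to fetch and compare the whole result set, which is why the cost grows with the size of the results.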

More Meteors

But there comes a point when it's necessary to add another Meteor server to the equation to handle more clients. That server connects to the same MongoDB database, but now how quickly a client sees a change depends on the poll frequency and on which server wrote the change to the database. The server that wrote a change would see it immediately, but the other servers could wait up to ten seconds to see it, as they would only find it the next time they polled the database. It worked, but it could be a bit confusing for users. More crucial, though, was that queries with large result sets could be quite expensive to poll, and that expense went up as you added Meteor servers.

The Meteor developers looked for a new approach and found it in MongoDB's oplog. Meteor 0.7.0 saw the introduction of the OplogObserveDriver, which sits alongside the PollAndDiff driver. This new driver uses the stream of changes that comes from tailing the oplog to update an in-memory database inside the Meteor server. That database is based on 'minimongo', a Mongo-like system originally developed to give Meteor browser clients a database to consult.
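As a rough illustration of how a stream of oplog entries can keep an in-memory copy current, the sketch below applies insert, update and delete entries to a Map. It is not Meteor's driver: applyOplogEntry, the cache and the sample entries are hypothetical, it assumes the classic oplog entry shape (op, ns, o, o2), and it only handles top-level $set and $unset modifiers.

```javascript
// Sketch only: `applyOplogEntry` is a hypothetical helper, not Meteor code.
// `cache` is a Map from stringified _id to document.
function applyOplogEntry(cache, entry) {
  switch (entry.op) {
    case "i": // insert: `o` holds the full document
      cache.set(String(entry.o._id), { ...entry.o });
      break;
    case "u": { // update: `o2` identifies the document, `o` describes the change
      const id = String(entry.o2._id);
      const isModifier = Object.keys(entry.o).some((key) => key.startsWith("$"));
      if (!isModifier) {
        // Full replacement document.
        cache.set(id, { ...entry.o, _id: entry.o2._id });
        break;
      }
      if (!cache.has(id)) break; // not cached: a real driver would have to refetch
      const doc = { ...cache.get(id), ...(entry.o.$set || {}) };
      for (const field of Object.keys(entry.o.$unset || {})) delete doc[field];
      cache.set(id, doc);
      break;
    }
    case "d": // delete: `o` holds the _id of the removed document
      cache.delete(String(entry.o._id));
      break;
    default: // no-ops and commands are ignored in this sketch
      break;
  }
}

// Hypothetical entries, as they might arrive from tailing the oplog.
const cache = new Map();
const entries = [
  { op: "i", ns: "wiktory.players", o: { _id: "a", name: "Ada", score: 1 } },
  { op: "i", ns: "wiktory.players", o: { _id: "b", name: "Bob", score: 2 } },
  { op: "u", ns: "wiktory.players", o2: { _id: "a" }, o: { $set: { score: 5 } } },
  { op: "d", ns: "wiktory.players", o: { _id: "b" } },
];
for (const entry of entries) applyOplogEntry(cache, entry);
console.log([...cache.values()]);
```

Because each entry describes a single change, the server never has to rerun the whole query to find out what happened.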

At first, the developers used the oplog to update only queries that tested for equality against scalar types of data (strings, numbers, ObjectIDs, booleans and null). Where a query was based on one of these tests, it would be directed to the OplogObserveDriver. In version 0.7.1, the developers added support for many of the common queries, including all the $ operators in MongoDB apart from the geolocating ones like $near and the JavaScript-evaluating $where, but still excluding the trickier limit and skip parameters.

By version 0.7.2, the OplogObserveDriver had gained support for "limit" queries (where they came with a non-natural "sort"). The incremental introduction of an oplog-assisted local data cache has allowed the Meteor developers to focus on the big performance gains to be had while leaving other cases to the older, mature polling driver. The details of which queries are and aren't supported are in the first part of the OplogObserveDriver wiki page.

Activating the oplog

Turning Meteor's oplog support on is simple. In development, Meteor will automatically configure and enable the OplogObserveDriver. In production with MongoHQ, you need to set the environment variable MONGO_OPLOG_URL to the URI we previously discussed: the Replica Set URI, with the username and password of a user with oplog access, the database set to local, and the authSource parameter set to the name of the database to which the user belongs. So if your Replica Set URI looks like this:

mongodb://<user>:<password>@candidate.11.mongolayer.com:10240,candidate.0.mongolayer.com:10240/wiktory  

Then your MONGO_OPLOG_URL should look something like this:

mongodb://<user>:<password>@candidate.11.mongolayer.com:10240,candidate.0.mongolayer.com:10240/local?authSource=wiktory  
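The transformation between the two URIs is mechanical, so it can be sketched as a small function. The name toOplogUrl is hypothetical. The sketch assumes the same user has oplog access, that the URI names a database, and that any existing query options can be dropped.

```javascript
// Sketch: derive an oplog URL from a replica set URI.
// `toOplogUrl` is a hypothetical helper, not part of Meteor or any driver.
function toOplogUrl(replicaSetUri) {
  const prefix = "mongodb://";
  if (!replicaSetUri.startsWith(prefix)) {
    throw new Error("expected a mongodb:// URI");
  }
  const rest = replicaSetUri.slice(prefix.length);
  const slash = rest.indexOf("/");
  if (slash === -1) {
    throw new Error("URI does not name a database");
  }
  // Credentials and the comma-separated host list are kept untouched.
  const credentialsAndHosts = rest.slice(0, slash);
  // Any existing query options are dropped in this sketch.
  const database = rest.slice(slash + 1).split("?")[0];
  if (!database) {
    throw new Error("URI does not name a database");
  }
  return `${prefix}${credentialsAndHosts}/local?authSource=${encodeURIComponent(database)}`;
}

const oplogUrl = toOplogUrl(
  "mongodb://user:secret@candidate.11.mongolayer.com:10240,candidate.0.mongolayer.com:10240/wiktory"
);
console.log(oplogUrl);
```

The result is the value to export as MONGO_OPLOG_URL in the environment the Meteor server runs in.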

It is also surprisingly easy to turn off the oplog support on a per-query basis. Adding an option of { _disableOplog: true } to a find ensures that particular query doesn't use the oplog driver. This is useful during development when you need to confirm that Meteor's oplog processing isn't causing problems with the results. It is also possible to find out which queries are using the oplog with the Meteor facts module.
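As a small sketch of how that option might be applied, the hypothetical helper below adds _disableOplog to a set of find options so the flag is easy to toggle while debugging. The helper name, the Players collection and the publication name are all assumptions, and the Meteor call is shown in a comment because it only runs inside a Meteor application.

```javascript
// Hypothetical helper: returns find() options with the oplog driver disabled.
function withoutOplog(options = {}) {
  return { ...options, _disableOplog: true };
}

// Inside a Meteor app this might be used as follows
// (Players and "topPlayers" are made-up names):
//
// Meteor.publish("topPlayers", function () {
//   return Players.find(
//     { score: { $gt: 100 } },
//     withoutOplog({ sort: { score: -1 }, limit: 10 })
//   );
// });

const findOptions = withoutOplog({ sort: { score: -1 }, limit: 10 });
console.log(findOptions);
```

If a query returns different results with and without the flag, that points at the oplog path rather than at the query itself.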

Meteor and MongoHQ

With the optimization of Meteor for the oplog, MongoHQ's new Elastic Deployments come into their own. Previously, access to the oplog would have required a configuration that cost hundreds of dollars a month. With Elastic Deployments, oplog access is available to small-scale users at $18/GB/month. As a Meteor application's use grows, Elastic Deployments scale resources up to match. This is done while maintaining oplog access, which is key to Meteor's ability to scale while maintaining near-realtime responsiveness.