The MongoDB Oplog & Node.js

Imagine if there were a record of every write made to MongoDB, stored as a collection. Imagine the opportunities that would offer: triggering events, specialized backups, cache population, and more. Well, stop imagining: that's exactly what MongoDB's oplog provides. Now that we've enabled access to the oplog, we figured it was a good time to publish a short series of blog posts giving you more detail about the oplog, how you can make it work for your stack, and how you can use Node.js to tap into it.

What is the oplog, really?

Despite being called a log, the oplog is not a file; it's a capped collection held within the MongoDB storage engine. Oplogs came into existence to support MongoDB's replication features. The idea is that once a replica is fully synced with the master, it only has to follow the latest entries in the oplog to find out what changes – inserts, updates, and deletions – are being written into the databases of that MongoDB server. As a capped collection, the oplog has a finite capacity, and when it's full it works like a circular queue, replacing the oldest entries with new ones. MongoDB's default oplog size varies with the operating system and the amount of memory available. With MongoHQ's Elastic Deployments it's a lot simpler: the oplog starts at 2GB and grows to ensure that at least the last twenty-four hours' worth of write operations are retained.
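
If you're curious how much history your own oplog currently holds, then once connected to a deployment (as we're about to be), the Mongo shell's db.printReplicationInfo() helper reports the configured oplog size and the time range between its first and last entries:

> db.printReplicationInfo()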

To help put this in context, let's take a look at an Elastic Deployment's oplog from the Mongo shell. The first thing you need to do is create a user with permission to access the oplog. You can't grant oplog access to an existing user – best practice suggests isolating these kinds of roles to specific new users. So, create a new user through the MongoHQ Elastic Deployments administration dashboard; selecting the "oplogAccess" property will give that user permission to read the oplog. Next, log into your database with the new user:

mongo candidate.0.mongolayer.com:10240/wiktory -u <oploguser> -p <oplogpassword>  
>

The oplog doesn't belong to any particular database within the Mongo server; it lives in the server's "local" database. So if we switch to that…

> use local

we can now query the oplog. Its name, oplog.rs, looks slightly odd until you know that it's the oplog for the Replica Set. If we do a find on this collection, we will get quite a few records.

> db.oplog.rs.find()
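
Dumping the whole collection isn't very practical, so it helps to narrow the query down. For example, this find (the namespace filter assumes our example "wiktory.items" collection, which we'll meet below) shows the five most recent operations on a single collection, newest first:

> db.oplog.rs.find({ns: "wiktory.items"}).sort({$natural: -1}).limit(5)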

Down in the oplog

Let's look at an oplog document in detail to get a feel for the contents. Here's an example entry:

"ts" : Timestamp(1395663575, 1),
"h" : NumberLong("-5872498803080442915"),
"v" : 2,
"op" : "i",
"ns" : "wiktory.items",
"o" : {
  "_id" : ObjectId("533022d70d7e2c31d4490d22"),
  "author" : "JRR Hartley",
  "title" : "Flyfishing"
  }
}

We start with the rather important timestamp...

{
  "ts" : Timestamp(1395663575, 1),

The "ts", or "timestamp" field shows when this operation was performed. The timestamp gives the oplog its temporal structure and by querying based on the timestamp, gives a clear sequence which helps when oplogs are consumed by other applications and MongoDB replicas.

"h" : NumberLong("-5872498803080442915"),

The "h" field is an unique id for the operation. This ensures that each operation is uniquely identified which, given many similar operations can occur at the same time, is not a bad thing.

"v" : 2,

The "v" field is a special "version" number, specifically for the oplog format. Only MongoDB, Inc knows when that will change and, to date, it has only changed once. So, for most purposes, it can be ignored.

"op" : "i",

Now we come to the real meat of the oplog: the "op" field, which indicates what operation took place. Its values can be "i" for insert, "u" for update, and "d" for delete. For most oplog tailing applications, you will only be interested in those three values, but there is also "c" for commands that affect databases at a high level, "db" which apparently announces the presence of a database, and "n" for no-ops, used for changes in the database or collections which don't result in a change to the stored data.
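
In a Node.js consumer, handling these values often comes down to a simple dispatch on the "op" field. Here's a minimal sketch (the onInsert, onUpdate, and onDelete handlers are hypothetical stand-ins for your own code):

function handleOplogDoc(doc) {
  switch (doc.op) {
    case 'i': // insert: doc.o is the full inserted document
      onInsert(doc.ns, doc.o);
      break;
    case 'u': // update: doc.o2 identifies the document, doc.o describes the change
      onUpdate(doc.ns, doc.o2, doc.o);
      break;
    case 'd': // delete: doc.o holds the _id of the removed document
      onDelete(doc.ns, doc.o);
      break;
    default:  // 'c', 'db', and 'n' entries can usually be ignored
      break;
  }
}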

Any of these operations will have been applied to a database or a collection, so the next field, "ns", tells us the "namespace" for this operation. In this case, it's the "wiktory" database's "items" collection:

"ns" : "wiktory.items",

The remaining fields in an oplog record vary depending on the operation. With an insert operation, there is an "o" field which contains the entire document that was inserted:

"o" : {
  "_id" : ObjectId("533022d70d7e2c31d4490d22"),
  "author" : "JRR Hartley",
  "title" : "Flyfishing"
}

For an update, there are two fields, "o2" and "o". In the snippet below, an update to a particular document has set a field to true:

"op" : "u",
"ns" : "wiktory.items",
"o2" : {
  "_id" : ObjectId("533022d70d7e2c31d4490d22")
},
"o" : {
  "$set" : {
    "outofprint" : true
  }
}

The "o2" field contains the query part of the update. Don’t expect complex queries here - the query is sufficient to identify the document being updated. Updates on multiple documents are broken down into multiple entries in the oplog to help ensure that the oplog is idempotent - that reapplying the same operation will have the same result every time.

The "o" field contains the update itself using a subset of MongoDB’s $ syntax. Most operations will appear as $set and $unset operations, to help keep the oplog idempotent. For example, a $inc update will appear in the oplog as a $set on the field in question, setting the field to the result of the increment, and a $rename update will appear as the old field name being $unset and the new field name being $set. Keep this in mind and if in doubt, check how any particular update maps to the oplog. That said, if you are only interested in fields being written so you can update a cache, then you only need check the contents of $set. In the example above, we can see the "outofprint" field being set to true.

Finally, when a delete operation is logged, an entry appears in the oplog with the "o" field containing the _id of the document that was deleted (and a "b" field which appears to always be set to true).
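
Deleting our example document would produce an entry along these lines (reconstructed here to match the format of the entries above):

"op" : "d",
"ns" : "wiktory.items",
"b" : true,
"o" : {
  "_id" : ObjectId("533022d70d7e2c31d4490d22")
}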

Querying the oplog

Having covered the contents of the oplog, we now want to make use of it. Turning it into a stream of changes involves, at a high level, logging into the "local" database, finding the highest timestamp in the oplog, and using that timestamp to create a tailable cursor which tracks subsequent changes.

Here’s some code that does just that using Node.js:

var MongoDB = require('mongodb');

// To track the oplog we need the oplog URL
var oplogurl = 'mongodb://<user>:<password>@candidate.11.mongolayer.com:10240,candidate.0.mongolayer.com:10240/local?authSource=wiktory';

// Open the connection to the database
MongoDB.MongoClient.connect(oplogurl, function(err, db) {
  if (err) throw err;
  // Get the oplog collection
  db.collection("oplog.rs", function(err, oplog) {
    if (err) throw err;
    // Find the highest timestamp
    oplog.find({}, {
      ts: 1
    }).sort({
      $natural: -1
    }).limit(1).toArray(function(err, data) {
      if (err) throw err;
      var lastOplogTime = data[0] && data[0].ts;
      var queryForTime;
      // If there's a timestamp, tail from there; if not (an empty
      // oplog), start from the local clock instead
      if (lastOplogTime) {
        queryForTime = {
          $gt: lastOplogTime
        };
      } else {
        var tstamp = new MongoDB.Timestamp(0, Math.floor(new Date().getTime() / 1000));
        queryForTime = {
          $gt: tstamp
        };
      }
      // Create a cursor for tailing and set it to await data
      var cursor = oplog.find({
        ts: queryForTime
      }, {
        tailable: true,
        awaitdata: true,
        oplogReplay: true,
        numberOfRetries: -1
      });
      // Wrap that cursor in a Node Stream
      var stream = cursor.stream();

      // And when data arrives at that stream, print it out
      stream.on('data', function(oplogdoc) {
        console.log(oplogdoc);
      });
    });
  });
});

Now, this code is bare-bones and has been put together to show what happens behind the scenes in an oplog tailing application. The oplogurl variable is the critical part of this program. It uses the Replica Set URI, which you can get from the database's Admin dashboard:

mongodb://<user>:<password>@candidate.11.mongolayer.com:10240,candidate.0.mongolayer.com:10240/wiktory  

But rather than ending with the database name, the path after the last / becomes local?authSource= followed by the database name, giving:

mongodb://<user>:<password>@candidate.11.mongolayer.com:10240,candidate.0.mongolayer.com:10240/local?authSource=wiktory  

This change tells the connecting application to connect to the "local" database but authenticate with the "wiktory" database. This is a common pattern with oplog tailing applications and is a more direct way of doing what we did earlier in this article – logging into the database and then switching to use the "local" database.

The drawback with this code is that everything happening in the oplog – every operation on every collection – will be dumped to the console. To make sense of the oplog and to keep your code maintainable, you'll want to use a library or framework; in the next part of this series, we will look at some options for Node.js and other platforms. In the meantime, one quick improvement, shown below, is to restrict the tailing query to just the namespaces you care about.
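
Here's a minimal sketch of that restriction, reusing queryForTime from the code above and assuming our example "wiktory.items" collection:

var cursor = oplog.find({
  ns: 'wiktory.items',
  ts: queryForTime
}, {
  tailable: true,
  awaitdata: true,
  oplogReplay: true,
  numberOfRetries: -1
});

Filtering on "ns" in the query keeps the noise down and pushes the work to the server rather than your application.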