Tinkertank: Visualizing MongoDB with JavaScript

TL;DR: We'll give you the basic tools to start visualizing MongoDB activity.

We've been working at MongoHQ's Tinkertank on visualizing MongoDB activity in various ways that fit in with the idea of ubiquitous information. While putting one particular project together, we realized that we had also come up with a way of using the oplog to create statistics, and thought it would be worth sharing. For this project, you'll want a MongoDB database configured as a member of a replica set so you can have oplog access. MongoHQ's Elastic Deployments make this simple because MongoDB in an Elastic Deployment is already set up as a replica set, and you can select oplog access when creating a new user.

Console collection

What we want to do with this project is count, once a second, the number of inserts, updates and deletions happening in a particular collection. We'll show you how to visualize this on the web later, but first let's gather our data. Our application will be written in Node.js, but the technique is applicable to any language that can tail the oplog. The Mongo-oplog library manages the connection for us, so remember to npm install mongo-oplog if you are following along. We can then start creating our application:

var MongoOplog = require('mongo-oplog');

var MongoHQURL="mongodb://<user>:<password>@<host>:<port>,<host>:<port>/local?authSource=<databasename>"

var oplog = MongoOplog(MongoHQURL, '<databasename>.<collection>').tail();  
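One thing to watch with the MongoHQURL above: if the password contains characters such as @ or /, they need to be percent-encoded or the URL will not parse. As a rough sketch, and not something Mongo-oplog provides, a small helper can assemble the string from its parts. The function name buildOplogUrl, its option names and the host names are all made up for illustration:

```javascript
// Hypothetical helper: builds an oplog connection string from its parts.
// The function and option names are illustrative, not a library API.
function buildOplogUrl(opts) {
  // Credentials must be percent-encoded so characters like "@" or "/"
  // do not confuse the URL parser.
  var user = encodeURIComponent(opts.user);
  var password = encodeURIComponent(opts.password);
  // Replica set members are listed as comma-separated host:port pairs.
  var hosts = opts.hosts.join(",");
  // The oplog lives in the "local" database; authSource names the
  // database the user was created in.
  return "mongodb://" + user + ":" + password + "@" + hosts +
    "/local?authSource=" + encodeURIComponent(opts.authSource);
}

// Placeholder values only; substitute your own.
var exampleUrl = buildOplogUrl({
  user: "oplogreader",
  password: "p@ss/word",
  hosts: ["host1.example.com:10001", "host2.example.com:10002"],
  authSource: "mydb"
});
console.log(exampleUrl);
```

The resulting string can be passed to MongoOplog in place of the hand-written URL.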

The user and password in the MongoHQURL should be for a user with oplog access, while host names and ports can be obtained from the replica set URI (given on the MongoHQ admin page for the database). We then open the oplog, giving the name of the collection we want to monitor. That monitoring will consist, for now, of three counters, so let's initialize them:

var insertCount = 0;  
var updateCount = 0;  
var deleteCount = 0;  

Mongo-oplog lets us register handlers for inserts, updates and deletes. Each time one of those events arrives, we bump the matching counter up by one.

oplog.on('insert', function (doc) {  
  insertCount = insertCount + 1;
});

oplog.on('update', function (doc) {  
  updateCount = updateCount + 1;
});

oplog.on('delete', function (doc) {  
  deleteCount = deleteCount + 1;
});
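If you want to try the counting logic before pointing it at a real database, a plain Node.js EventEmitter can stand in for the oplog. This is only a dry-run sketch: fakeOplog and the event payloads are placeholders we made up, not what Mongo-oplog actually delivers.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-in for the tailing oplog object: anything that emits
// 'insert', 'update' and 'delete' events will do for a dry run.
var fakeOplog = new EventEmitter();

var counts = { insert: 0, update: 0, delete: 0 };

// Register one counting handler per operation type.
Object.keys(counts).forEach(function (op) {
  fakeOplog.on(op, function (doc) {
    counts[op] = counts[op] + 1;
  });
});

// Simulate some activity (the payloads are placeholders).
fakeOplog.emit('insert', { op: 'i' });
fakeOplog.emit('insert', { op: 'i' });
fakeOplog.emit('update', { op: 'u' });
fakeOplog.emit('delete', { op: 'd' });

console.log(counts); // { insert: 2, update: 1, delete: 1 }
```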

We'll want to gather a little history as we monitor the database. Not too much, just enough to help initialize any late coming connections later on. For that we'll create four arrays, one each for inserts, updates and deletes and one for timestamps, and put 16 elements in each of them so they can serve as queues for new values:

var stamps  = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ];  
var inserts = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ];  
var updates = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ];  
var deletes = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ];  
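Typing out sixteen zeros works, but the same fixed-length queue idea can be sketched with a couple of small helpers. The names makeQueue, record and HISTORY_LENGTH are ours, introduced only for this illustration:

```javascript
var HISTORY_LENGTH = 16; // same length as the hand-written arrays

// Build an array of `length` zeros to act as a fixed-size queue.
function makeQueue(length) {
  var queue = [];
  for (var i = 0; i < length; i++) {
    queue.push(0);
  }
  return queue;
}

// Append the newest value and drop the oldest so the length never changes.
function record(queue, value) {
  queue.push(value);
  queue.shift();
}

var recent = makeQueue(HISTORY_LENGTH);
record(recent, 5);
record(recent, 7);
console.log(recent.length, recent.slice(-2)); // 16 [ 5, 7 ]
```

Because every push is paired with a shift, the newest sample is always at the end of the array and the oldest at the start.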

Now we can create our sampler function. When this runs, it'll grab the various count values and print them to the console:

function sampler() {  
  console.log("i:" + insertCount + " u:" + updateCount + " d:" + deleteCount);

Then it will grab the current time and push that, and the various counts onto the appropriate arrays and shift the queues so they stay holding only 16 elements:

  var now=new Date().getTime();
  stamps.push(now);
  inserts.push(insertCount);
  updates.push(updateCount);
  deletes.push(deleteCount);
  stamps.shift();
  inserts.shift();
  updates.shift();
  deletes.shift();

The final thing it needs to do is clear the counts and set a timer so it's called again in a second.

  insertCount=0;
  updateCount=0;
  deleteCount=0;
  setTimeout(sampler,1000);
}

Now, you may be concerned that the sampler could miss increments of the counts that happen while it is running. That isn't a problem, because Node.js runs our JavaScript on a single thread: once execution enters the sampler function, no other callback can run until it has finished. Any oplog events that arrive in the meantime are queued and counted in the next sample.
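A small experiment makes the point. In this sketch, which is separate from the monitoring code, an increment is queued with a timer and then a deliberately slow function busy-waits; slowSampler is a made-up stand-in for the real sampler.

```javascript
var count = 0;

// Queue an increment to run as soon as the event loop is free.
setTimeout(function () { count = count + 1; }, 0);

// A deliberately slow stand-in for the sampler: it busy-waits for 20ms.
function slowSampler() {
  var before = count;
  var end = Date.now() + 20;
  while (Date.now() < end) { /* keep the thread busy */ }
  return { before: before, after: count };
}

var observed = slowSampler();
console.log(observed); // { before: 0, after: 0 }

// The queued increment only runs once the synchronous code has returned.
setTimeout(function () { console.log("later:", count); }, 50);
```

Even though the timer was due long before slowSampler finished, the count did not change while the function was running.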

Of course, this code would do little unless we actually started the sampler, so we end the code doing that:

sampler();

Run this and if all goes well, you should see updated counts for the last second every second for your selected collection.

To the browser

For the next stage, we wanted to make the application accessible from a browser, where it could display a visualization of the changing data. For the web server we'll use express, connect the browser and the server with socket.io, and render the chart with smoothie. Let's get all those libraries installed by running npm install express socket.io smoothie in our working directory.

We can now return to the code we created earlier. After we've required mongo-oplog, we want to set up our web server.

var express=require("express");  
var app=express();  
var server = require("http").createServer(app);  
var io=require("socket.io").listen(server);  

The last line there lets socket.io piggy-back on our server configuration, allowing it to send its JavaScript files to the browser. But there are other files we will want to send to the browser ourselves. We will be creating a "lights.html" file later, so we need to ask the server to send that when "/lights" is requested:

app.get('/lights', function(req, res){  
  res.sendfile('lights.html');
});

And the Smoothie charts library has a JavaScript file it needs to get to the browser to do its rendering. We can send that file from where it lives in the node_modules directory:

app.get('/smoothie.js', function(req, res){  
  res.sendfile('node_modules/smoothie/smoothie.js');
});

Now we are ready to set our web server listening on a port:

server.listen(3000, function() {  
  console.log('Listening on port %d', server.address().port);
});

The next changes take place before the sampler function. When a client connects, we need to send them the past results in the queue. To capture the WebSocket opening, we ask for a function to be called when a connection is made:

io.sockets.on('connection', function(socket) {  

and the first thing we do is send them the existing values in the queue as a message labelled "iud". This information will be sent as JSON data to the client:

for(var i=0;i<stamps.length;i++) {  
  socket.emit("iud", {
    t:stamps[i],
    i:inserts[i],
    u:updates[i],
    d:deletes[i]
  });
}
});
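The replay loop can be tried on its own by pulling it into a function and handing it a stub in place of the socket. This is only a sketch: replayHistory and stubSocket are names we made up, and the sample values are arbitrary.

```javascript
// Send every stored sample to one newly connected client.
// `socket` only needs an emit(name, payload) method, so a stub works here.
function replayHistory(socket, stamps, inserts, updates, deletes) {
  for (var i = 0; i < stamps.length; i++) {
    socket.emit("iud", {
      t: stamps[i],
      i: inserts[i],
      u: updates[i],
      d: deletes[i]
    });
  }
}

// Stub socket that records what would have been sent.
var sent = [];
var stubSocket = {
  emit: function (name, payload) {
    sent.push({ name: name, payload: payload });
  }
};

replayHistory(stubSocket, [1000, 2000], [3, 0], [1, 4], [0, 2]);
console.log(sent.length, sent[1].payload); // 2 { t: 2000, i: 0, u: 4, d: 2 }
```

In the server, the same function could be called from the connection handler with the real socket and the four history arrays.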

There's just one change left to make in this server code. In the sampler, when we have our new values and the current time, we need to send all listening clients that information. This means emitting "iud" messages at all the connected clients:

function sampler() {  
  console.log("i:"+insertCount+" u:"+updateCount+" d:"+deleteCount);
  var now=new Date().getTime();
  io.sockets.emit("iud", {
    t:now,
    i:insertCount,
    u:updateCount,
    d:deleteCount
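  });
  // The rest of the function is unchanged: push onto the queues,
  // reset the counts and set the timer again.
}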

In the browser

The final step, albeit a big one, is to create the HTML page that gets served as lights.html. The first things we need to do are import the smoothie and socket.io JavaScript and create a canvas where smoothie can draw.

<html>  
<body>  
<script type="text/javascript" src="/smoothie.js"></script>  
<script type="text/javascript" src="/socket.io/socket.io.js"></script>  
<canvas id="mycanvas" width="800" height="400"></canvas>  
<script>  

From here on, it’s all JavaScript. First we create a SmoothieChart which encapsulates all the styling of the chart – we want a step chart, white labels and a fixed minimum value of 0. Once it's created, we can attach it to the canvas, but first here is the chart itself:

var smoothie = new SmoothieChart( {  
  grid: {
    strokeStyle:'rgb(125, 0, 0)',
    fillStyle:'rgb(60, 0, 0)',
    lineWidth: 1,
    millisPerLine: 250,
    verticalSections: 6
  },
  minValue:0,
  interpolation:"step",
  labels: { fillStyle:'rgb(255, 255, 255)' }
} );

Smoothie allows you to create multiple TimeSeries which you can then add to the chart, specifying their style for that particular chart at that point. We are going to make three time series for our inserts, updates and deletes, colour them green, blue and red, add a translucent fill in the same colour and draw the lines three pixels wide:

var inserts = new TimeSeries();  
var updates = new TimeSeries();  
var deletes = new TimeSeries();

smoothie.addTimeSeries(inserts,{  
  strokeStyle:'rgb(0 , 255 , 0)',
  fillStyle:'rgba(0, 255, 0, 0.4)',
  lineWidth:3
});
smoothie.addTimeSeries(updates, {  
  strokeStyle:'rgb(0, 0 , 255)',
  fillStyle:'rgba(0, 0, 255, 0.4)',
  lineWidth:3
});
smoothie.addTimeSeries(deletes, {  
  strokeStyle:'rgb(255, 0, 0)',
  fillStyle:'rgba(255, 0, 0, 0.4)',
  lineWidth:3
});

We now need to attach the chart to the canvas we defined earlier on the page:

smoothie.streamTo(document.getElementById("mycanvas"), 1000);  

The 1000 is a one second delay to allow data to smoothly stream into view. Talking about the data, we need to receive it. For that we create a websocket back to the server and tell the code to react to the "iud" messages the server was emitting. When it gets one, it extracts the time stamp first then updates each TimeSeries with the timestamp and the new value:

var socket = io.connect();  
socket.on("iud", function(data) {  
  var now=data.t;
  inserts.append(now,data.i);
  updates.append(now,data.u);
  deletes.append(now,data.d);
});
</script>  
</body>  
</html>  

With the script, body and html elements closed, we save the file. Start the server and connect to http://localhost:3000/lights (or the appropriate hostname if it isn't running locally) and you should see something like this:

Around half way through the video, all the records are deleted from the monitored collection and the application managing the records starts to rebuild the data. You'll see how the Smoothie chart quickly rescales to capture this event and then gently adjusts itself as the pattern of usage settles back down.

Graphically explicit

What we've covered here is just the basics of putting together a visualization of oplog data. MongoDB users may note that similar statistics appear to be available from db.serverStatus(), but that is not quite the same thing. The serverStatus counters are server-wide, whereas we have counted inserts, updates and deletes on a specific collection within the database. We could zoom in even more precisely if we wanted to by adding rules to the oplog handlers to filter by, say, particular customer classes or specific devices.
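As a rough idea of what such a rule might look like, here is a sketch that counts only inserts for one class of customer. It uses a plain EventEmitter as a stand-in for the oplog so it runs without a database. It assumes the handler receives an oplog entry with the inserted document under its o field, and customerClass is a hypothetical field name, so adjust both to match your own data.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-in for the tailing oplog so the sketch runs without a database.
var fakeOplog = new EventEmitter();

var premiumInsertCount = 0;

// Assumption: the handler receives an oplog entry whose `o` field holds
// the inserted document. `customerClass` is a hypothetical field name.
function isPremium(entry) {
  return Boolean(entry && entry.o && entry.o.customerClass === "premium");
}

fakeOplog.on('insert', function (entry) {
  if (isPremium(entry)) {
    premiumInsertCount = premiumInsertCount + 1;
  }
});

// Simulated activity with made-up documents.
fakeOplog.emit('insert', { op: 'i', o: { customerClass: "premium" } });
fakeOplog.emit('insert', { op: 'i', o: { customerClass: "basic" } });
fakeOplog.emit('insert', { op: 'i', o: { customerClass: "premium" } });

console.log(premiumInsertCount); // 2
```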

In a future MongoHQ Tinkertank, we'll take this code and use it as part of an ambient status monitor which lights up a room. Until then, you can find the code on GitHub. Check it out, edit the files to set your database, collection and credentials, run npm install to get all the Node.js packages, then run node lights-web.js.