Graph Data With MongoDB

Today's Write Stuff article is by Igor Ribeiro Lima. It is a dive into storing graph data and visualizing it in real time using MongoDB, Node.js, Express, Mongoose, Socket.IO and more with a practical example.

In this article, you'll see one way to store graph data in a document database. We'll use a graph visualization tool to draw directed graphs, and real-time communication so that users can edit the graph and see each other's changes.

To provide a foundation for this, let's go through a quick overview of graphs, Mongoose and Socket.IO, and then get started with our simple example. My goal is to keep this as simple as possible.

The concepts learned here will help you to represent and store graph data in a MongoDB database or in other databases. If you're already comfortable with graph concepts, Mongoose and Socket.IO, feel free to skip the next three sections.

What is a Graph?

A graph is a data structure that consists of a set of nodes, also called vertices, and a set of edges, also called links, that connect pairs of nodes. To represent our graph in the database, we'll store links and nodes as objects. Both will be stored as documents in their respective collections.
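As a minimal sketch of that representation, the plain objects below mirror the shape of the documents we might keep in a nodes collection and a links collection. The sample ids, labels and the neighborsOf helper are illustrative assumptions, not part of the example application.

```javascript
// Hypothetical sample data: each object stands for one stored document.
var nodes = [
  { id: 'a', color: '#ff0000', label: 'A' },
  { id: 'b', color: '#00ff00', label: 'B' },
  { id: 'c', color: '#0000ff', label: 'C' }
];

// Each link points from a source node to a target node (a directed edge).
var links = [
  { id: 'a-b', source: { id: 'a', weight: 1 }, target: { id: 'b', weight: 1 } },
  { id: 'a-c', source: { id: 'a', weight: 1 }, target: { id: 'c', weight: 1 } },
  { id: 'b-c', source: { id: 'b', weight: 1 }, target: { id: 'c', weight: 1 } }
];

// Illustrative helper: ids of the nodes reachable from nodeId in one step.
function neighborsOf(nodeId) {
  return links
    .filter(function (link) { return link.source.id === nodeId; })
    .map(function (link) { return link.target.id; });
}

console.log(neighborsOf('a')); // [ 'b', 'c' ]
```

Because links refer to nodes only by id, the two collections can be stored and fetched independently and joined in application code.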

Mongoose

Mongoose provides a straightforward, schema-based solution for your application data model. Everything in Mongoose starts with a Schema. Each schema maps to a MongoDB collection and defines the shape of the documents within that collection.

var mongoose = require('mongoose');  
var Schema = mongoose.Schema;

var EdgeSchema = new Schema({  
  id: String,
  source: {
    id: String,
    weight: Number
  },
  target: {
    id: String,
    weight: Number
  }
});

var VertexSchema = new Schema({  
  id: String,
  color: String,
  label: String
});

The permitted SchemaTypes are: String, Number, Date, Buffer, Boolean, Mixed, ObjectId, and Array.

Creating a model: To use our schema definition, we need to convert our EdgeSchema/VertexSchema into a Model we can work with. To do so, we pass it into mongoose.model(modelName, schema):

var Edge = mongoose.model('Edge', EdgeSchema);  
var Vertex = mongoose.model('Vertex', VertexSchema);  

A model is a class which we can use to construct documents. In this case, each document will be an edge/vertex with properties and behaviors as declared in our schema. Let's create a vertex document representing a red vertex we just created on the graph:

var redVertex = new Vertex({ color: '#ff0000', label: 'I am red' });  
console.log(redVertex.color); // '#ff0000'  

Models are fancy constructors compiled from our Schema definitions. Instances of these models represent documents which can be saved and retrieved from our database. All document creation and retrieval from the database is handled by these models.

Saving: After creating a document in this way, saving it to the database is easy:

var Vertex, vertex;  
Vertex = mongoose.model('Vertex', VertexSchema);  
vertex = new Vertex({ color: '#ff0000', label: 'I am red' });  
vertex.save(function (err) {  
  if (err) return handleError(err);
  // saved!
});

Removing: Models have a static remove method available for removing all documents matching conditions.

Vertex.remove({ color: '#ff0000' }, function (err) {  
  if (err) return handleError(err);
  // removed!
});

Updating: Each model has its own update method for modifying documents in the database without retrieving and then saving them in your application. See the API docs for more detail. If you want to update a single document and return it to your application, use findById to retrieve the document, and then update it.

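// vertexId and node are assumed to come from the caller,
// for example from the payload of an update event
Vertex.findById( vertexId, function (err, vertex) {
  if (err) return handleError(err);
  if (!vertex) return; // no document with that id
  // updating document
  vertex.label = node.label;
  vertex.color = node.color;
  vertex.save( function(err) {
    if (err) return handleError(err);
    // updated;
  });
} );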

Socket.IO

Socket.IO is a JavaScript library for real-time web applications. It works on every platform, browser or device, focusing equally on reliability and speed. It has two parts: a client-side library that runs in the browser and a server-side library for Node.js.

It enables real-time, bidirectional communication between web clients and the server. Both components have a nearly identical API and are event-driven, like Node.js.

Using Socket.IO with the Express framework

You have to create an HTTP server to enable real-time communication; it will allow you to emit and receive messages. The socket is the object that handles this communication between web clients and the server. The code below creates that server.

var express = require('express'),  
    app     = express(),
    http    = require('http').Server(app),
    io      = require('socket.io')(http);
var port = process.env.PORT || 5000;
http.listen(port, function(){
  // log the port actually in use, which may come from the environment
  console.log('listening on *:' + port);
});

On the server side you get a socket object when a user's client connects. The io object fires a connection event when that happens, allowing you to set up a callback which takes the socket as a parameter.

io.on('connection', function(socket) {  
  console.log('a user connected');
  socket.on('disconnect', function(){
    console.log('user disconnected');
  });
});

In the browser, calling the io function will return a socket for a connection:

define(['io'], function(io) {  
  var socket = io();
});

It's simple to “push” messages from the client to the server and vice versa by just calling the emit function of the socket object. It takes a key as the first parameter, and the message as the second parameter. At the other end, a message will arrive with the key and the message to be interpreted as appropriate.

In our example code, we use socket.emit( 'node-added', node ) on the server to send the newly created graph node to the client side.

Here on the server side we catch incoming add-node events, save them as vertices then respond by sending a node-added event.

socket.on('add-node', function( node ) {
  var vertex = new Vertex( node );
  vertex.save(function (err) {
    // only announce the node once it has been saved
    if (err) return handleError(err);
    socket.emit( 'node-added', node );
  });
});

At the client side we emit the add-node event, but can also listen for node-added events.

socket.emit( 'add-node', node );  
socket.on( 'node-added', function(node) {  
  console.log(node);
} );
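To see the round trip without a network, here is a rough sketch that imitates the two ends of a connection with Node's built-in EventEmitter. The createSocketPair helper and the saved array are assumptions made for illustration; they are stand-ins, not Socket.IO or Mongoose APIs.

```javascript
var EventEmitter = require('events');

// Build two linked endpoints: emitting on one fires handlers on the other.
function createSocketPair() {
  var clientInbox = new EventEmitter();
  var serverInbox = new EventEmitter();
  function endpoint(inbox, outbox) {
    return {
      on: function (event, handler) { inbox.on(event, handler); },
      emit: function (event, message) { outbox.emit(event, message); }
    };
  }
  return {
    client: endpoint(clientInbox, serverInbox),
    server: endpoint(serverInbox, clientInbox)
  };
}

var pair = createSocketPair();
var saved = [];     // stands in for the vertices collection
var received = [];  // what the client heard back

pair.server.on('add-node', function (node) {
  saved.push(node);                      // pretend to save the vertex
  pair.server.emit('node-added', node);  // then notify the client
});

pair.client.on('node-added', function (node) {
  received.push(node);
});

pair.client.emit('add-node', { id: 'n1', label: 'first node' });
console.log(received.length); // 1
```

The event names are the same ones used above, so the handlers can be compared line by line with the real server and client code.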

You may prefer to get a callback when the other side confirms it has received the message. To do this, simply pass a function as the last parameter of .emit; the receiving handler gets it as its last argument and calls it to acknowledge.

Server side code

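socket.on('add-node', function( node, cb ) {
  var vertex = new Vertex( node );
  vertex.save(function (err) {
    if (err) return handleError(err);
    // acknowledge the save by calling the client's callback, if one was sent
    cb && cb(node);
  });
});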

Client side code

socket.emit( 'add-node', node, function(obj) {
  // obj is the value the server passed to the callback
  console.log(obj);
} );
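The acknowledgement pattern can be sketched in isolation as follows. This is only an in-memory imitation built on Node's EventEmitter; clientEmit and the saved flag are hypothetical names chosen for the example.

```javascript
var EventEmitter = require('events');

// Stand-in for the server end of a connection.
var serverInbox = new EventEmitter();

// Hypothetical client-side emit: forwards the event name, the message and
// an optional acknowledgement callback to the server's handlers.
function clientEmit(event, message, ack) {
  serverInbox.emit(event, message, ack);
}

serverInbox.on('add-node', function (node, cb) {
  var stored = { id: node.id, label: node.label, saved: true };
  // Only acknowledge when the sender actually supplied a callback.
  if (typeof cb === 'function') cb(stored);
});

var acknowledged = null;
clientEmit('add-node', { id: 'n1', label: 'first node' }, function (obj) {
  acknowledged = obj;
});
clientEmit('add-node', { id: 'n2', label: 'no callback' }); // still works

console.log(acknowledged.saved); // true
```

Checking that the callback is a function before calling it keeps the handler usable by clients that do not ask for a confirmation.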

If you want the other listening clients to receive events, you can use broadcasting. This means sending a message to everyone except the socket that invokes it. To broadcast an emitted event, add the broadcast flag to the emit method call.

socket.on('add-node', function( node ) {
  var vertex = new Vertex( node );
  vertex.save(function (err) {
    if (err) return handleError(err);
    // every connected client except the sender receives node-added
    socket.broadcast.emit( 'node-added', node );
  });
});
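The "everyone except the sender" rule is easy to picture with a small in-memory sketch. The clients array, connect and broadcastFrom are hypothetical helpers written for this illustration and do not reflect how Socket.IO tracks connections internally.

```javascript
// Hypothetical in-memory stand-in for the set of connected clients.
var clients = [];

function connect(name) {
  var client = { name: name, inbox: [] };
  clients.push(client);
  return client;
}

// Deliver the event to every connected client except the sender.
function broadcastFrom(sender, event, message) {
  clients.forEach(function (client) {
    if (client !== sender) {
      client.inbox.push({ event: event, message: message });
    }
  });
}

var alice = connect('alice');
var bob = connect('bob');
var carol = connect('carol');

broadcastFrom(alice, 'node-added', { id: 'n1' });

console.log(alice.inbox.length, bob.inbox.length, carol.inbox.length); // 0 1 1
```

In the graph editor this means the user who added a node does not get an echo of their own change, while everyone else sees it appear.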

Our Example

This example uses two modules on the client side: one for drawing directed graphs and another for storing and fetching data. The graph drawing module uses a graph visualization tool named Force Editor. This tool, which we call ForceView in the code, lets the user position the nodes of a graph in two-dimensional space in a simple, intuitive, and aesthetically pleasing way. The storage module, which we call DBaaS, uses Socket.IO to enable real-time, bidirectional communication between web clients and the server, which in turn talks to MongoDB.

The modules are set up in a publish/subscribe style. It works the same way as radio: a radio station broadcasts (publishes) and radio receivers listen (subscribe). The goal is to avoid dependencies between these modules. Rather than talking to other modules directly, each module publishes its messages on a shared "radio station" by firing events on its own channel, and each module can also listen for events on any other channel. Take a look at the images below to see which events each module emits:

[Image: DBaaS radio station]

[Image: graph visualization radio station]
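The radio station idea can be sketched with Node's EventEmitter acting as the shared channel. The module objects and the storage:node-added channel name are assumptions for illustration and are not taken from the example's source code.

```javascript
var EventEmitter = require('events');

// The shared "radio station": modules only know this object, not each other.
var station = new EventEmitter();

// Hypothetical storage module: announces nodes on its own channel.
var storage = {
  addNode: function (node) {
    station.emit('storage:node-added', node);
  }
};

// Hypothetical drawing module: tunes in to the storage channel.
var drawing = {
  drawn: [],
  start: function () {
    var self = this;
    station.on('storage:node-added', function (node) {
      self.drawn.push(node.label);
    });
  }
};

drawing.start();
storage.addNode({ id: 'n1', label: 'first node' });
console.log(drawing.drawn); // [ 'first node' ]
```

Neither module holds a reference to the other, so either one can be removed or replaced without touching the code of the other.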

There is also a Backbone view acting as a mediator between ForceView and DBaaS. Doing that allows us to decompose everything into usable small pieces and then make those small pieces work well together. This makes the example code easier to understand and if you want to customize it to suit your particular tastes, you can easily change it.

For instance, you could replace the graph visualization with another graph tool such as jqPlot, Dracula, ArborJS, sigmajs or RaphaelJS. Or you could use a different database such as Appbase, Neo4j or TitanDB. The good news is that you only need to change a single file to migrate to another library. The image below illustrates the interaction between the View and these two modules.

[Image: backbone view]
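A mediator of this kind might look roughly like the sketch below. It uses plain EventEmitter objects in place of the Backbone view and the two modules, and every event name in it is invented for the example.

```javascript
var EventEmitter = require('events');

// Hypothetical stand-ins for the two modules; neither references the other.
var forceView = new EventEmitter();
var dbaas = new EventEmitter();

// The mediator is the only place that knows both modules.
function mediate(view, store) {
  // the user drew a node, so ask the store to persist it
  view.on('node-created', function (node) { store.emit('save-node', node); });
  // the store reports a saved node, so ask the view to render it
  store.on('node-saved', function (node) { view.emit('render-node', node); });
}

mediate(forceView, dbaas);

var persisted = [];
var rendered = [];
dbaas.on('save-node', function (node) {
  persisted.push(node);
  dbaas.emit('node-saved', node);
});
forceView.on('render-node', function (node) { rendered.push(node.id); });

forceView.emit('node-created', { id: 'n1' });
console.log(persisted.length, rendered); // 1 [ 'n1' ]
```

Swapping the drawing tool or the storage layer then comes down to passing a different object to mediate, as long as it fires and handles the same events.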

Running the example locally

The entire code is available on GitHub; just clone the repository:

git clone https://github.com/compose-ex/graph-data-with-mongodb  

Or download the code. Either way, execute npm install from a console in the project directory to install all dependencies.

Now create an environment variable:

COMPOSE_URI=mongodb://example:example@dogen.mongohq.com:10089/graph  

Next, execute node server.js to start the application!

Head to http://localhost:5000 with your browser to see the running application. If you just want to see the application in action there's a demo on the web.

Note that we are sharing database credentials in the code. Feel free to use them while you're learning. After that, create a new database at Compose and use your own credentials.
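One way a server could pick up its own connection string is sketched below. The resolveMongoUri helper and the local fallback address are assumptions for illustration; the example's server.js may read COMPOSE_URI differently.

```javascript
// Hypothetical helper: prefer COMPOSE_URI, else fall back to a local database.
function resolveMongoUri(env) {
  return env.COMPOSE_URI || 'mongodb://localhost:27017/graph';
}

// With the variable set, its value is used as-is.
console.log(resolveMongoUri({ COMPOSE_URI: 'mongodb://user:secret@example.com:10089/graph' }));

// Without it, the (assumed) local default is returned.
console.log(resolveMongoUri({})); // mongodb://localhost:27017/graph
```

In a real server the result would typically be passed along as mongoose.connect(resolveMongoUri(process.env)), which keeps credentials out of the source code.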

[Animation: cloning the git repository]

[Animation: how to use the example]

Igor works as a UI Engineer at Avenue Code and is a writer in his spare time. He enjoys learning new things, working on personal projects and contributing to the open source community.

This article is licensed with CC-BY-NC-SA 4.0 by Compose.