Deploying the Metrics Collector Microservice on Compose

Building your infrastructure the "microservices way" means avoiding large monolithic codebases that eventually become difficult to maintain.

A microservice is simply a small app that does one task very well. With a small set of requirements, a microservice is easier to build and simpler to test. Deploying a message hub (typically a queue or pubsub channel) to broker a list of work that other microservices are to undertake is a common design pattern. The consumer microservices are decoupled from the producer microservices that generate the data, and as long as the broker is fault-tolerant and has a simple scaling path, the system can typically be scaled by adding more microservice workers to deal with the increasing volumes of traffic.
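To make the decoupling concrete, here is a minimal sketch that uses a plain in-memory array in place of a real broker. The function names (produce, consume) are hypothetical and are not taken from the Metrics Collector code; the point is only that the producer and the consumer share nothing except the queue.

```javascript
// Toy in-memory queue standing in for the real broker (hypothetical names).
const queue = [];

// Producer side: knows only how to put a message on the queue.
function produce(event) {
  queue.push(JSON.stringify(event));
}

// Consumer side: knows only how to take messages off the queue.
function consume(handler) {
  let handled = 0;
  while (queue.length > 0) {
    handler(JSON.parse(queue.shift()));
    handled++;
  }
  return handled;
}

// The "collector" produces two events...
produce({ page: '/home' });
produce({ page: '/about' });

// ...and the "storage" worker consumes them later, in order.
const stored = [];
const handled = consume((event) => stored.push(event));
console.log(`handled ${handled} events`);
```

A real broker adds persistence, acknowledgements and network transport, but the shape of the two sides stays the same.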

I recently wrote about my Metrics Collector Microservice project, a Node.js app that writes web analytics data to a queue or pubsub channel using Redis, RabbitMQ, or Apache Kafka. From there the data can be directed to other microservices for storage, analytics, or visualisation which simply act upon the stream of data.

In this post, I'll show you how to get started with a Metrics Collector Microservice deployment using services provided by Compose, which make it easy to deploy production-grade databases without breaking a sweat.

For the purposes of this exercise, let's assume we want to use RabbitMQ to provide the queue and MongoDB to provide the storage.

schematic diagram

To make it all work, we'll deploy and configure two microservices on Bluemix: the Metrics Collector Microservice and the Metrics Collector Storage Microservice.

Sign up for Compose

First, sign up for a Compose account. Once signed up, you can deploy multiple services into your account.

Deploy RabbitMQ

1) Sign into your Compose dashboard.

2) Click Create Deployment.

3) Choose RabbitMQ, give your deployment a name, and pick a data center.

Any data center will do, but you may want to keep all of your servers in the same data center for optimal performance. I've chosen the "US Dallas 9" data center hosted by SoftLayer as this is in the same geography as my Bluemix account.

rabbitmq signup screenshot

Once deployed, Compose shows a dashboard that lets you configure your RabbitMQ deployment. Your next task is to create a RabbitMQ user:

4) Click the Create User link.

5) Complete the Add User form and make a note of the username and password you created.

6) Click the Run button.

rabbitmq create user

7) Grant permissions to this user by completing the User Permissions form, putting a .* in both the "read" and "write" columns:

rabbitmq create user

8) Click Run.

9) Make a note of your deployment's Connection Strings.

Click your deployment, then, on the left, click the Overview tab. Scroll down and make a note of the Connection Strings. We'll need them later when we configure our Bluemix microservices.

Deploy MongoDB

1) On the upper left of the screen click Deployments.

2) Click the Create Deployment button again, but this time choose MongoDB:

compose mongodb create

3) Again, give your deployment a name and choose a suitable data center.

4) Once the cluster is ready, create a database:

Click the Create database link, name it mc (metrics collector), and click Run:

compose create database

5) Create a collection inside your database. Name it mc, and click Run:

compose create database

6) On the left, click Users and click the Add User button.

7) Enter a username/password that has read/write access to the database and click Run:

compose create database

8) Note the Mongo URI.

On the left, click Admin and, within the Admin panel, make a note of the Mongo URI field. You'll need this to configure Bluemix instances later.

Deploy the Metrics Collector Microservice

Now that we have RabbitMQ to deal with the queue, we need a microservice to collect the data and push it into RabbitMQ.

1) Deploy Metrics Collector Microservice.

Visit the Metrics Collector Microservice GitHub page and click the Deploy to Bluemix button. If you have a pre-existing Bluemix account, enter your credentials when prompted; otherwise you will be guided through the process of signing up for a new account (again, a free trial is available).

2) Configure Metrics Collector Microservice.

Now we've got an instance of the Metrics Collector Microservice on Bluemix, but we also need to tell it which type of queue to use and how to connect to our RabbitMQ deployment.

Do so by setting custom environment variables in Bluemix. From your Bluemix dashboard, open your new metrics-collector-microservice app. On the left click Environment Variables, then click User Defined and add the following two variables, QUEUE_TYPE and RABBITMQ_URL:

bluemix env variables
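As a rough illustration of how a Node.js app might pick up these settings, here is a small sketch. QUEUE_TYPE and RABBITMQ_URL are the variable names used in this post; the function names, the 'stdout' fallback, the validation rule and the example URL are all assumptions made for the sketch and are not taken from the microservice's source.

```javascript
// Read the queue settings from an environment-like object.
// Assumption: fall back to 'stdout' when QUEUE_TYPE is not set.
function readQueueConfig(env) {
  const queueType = env.QUEUE_TYPE || 'stdout';
  if (queueType === 'rabbit_queue' && !env.RABBITMQ_URL) {
    throw new Error('RABBITMQ_URL must be set when QUEUE_TYPE is rabbit_queue');
  }
  return { queueType, rabbitUrl: env.RABBITMQ_URL || null };
}

// Hide the username and password before a URL is written to the logs.
function maskCredentials(url) {
  return url.replace(/\/\/[^@/]+@/, '//*****@');
}

// Hypothetical values; in a deployed app you would pass process.env.
const config = readQueueConfig({
  QUEUE_TYPE: 'rabbit_queue',
  RABBITMQ_URL: 'amqps://user:secret@example.com:10222/my-vhost'
});
console.log('Queue mode:', config.queueType);
console.log('Connecting to', maskCredentials(config.rabbitUrl));
```

Failing fast on a missing URL means a misconfigured app stops at startup rather than silently dropping data.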

When you save the values, the app restarts and in the logs you see entries like this:

2016-03-07T11:45:37.031-0000 Queue mode: rabbit_queue  
2016-03-07T11:45:37.138-0000 Connecting to Rabbit MQ server on amqps:*****@sl-us-dal-55-portal.1.dblayer.com:10222/rabbit-mq-metrics  
2016-03-07T11:45:37.343-0000 CDS Labs Metrics Collector Microservice started on port 61596 : Mon Mar 07 2016 11:45:37 GMT+0000 (UTC)  
2016-03-07T11:45:37.472-0000 Connected to RabbitMQ queue 'mcqueue'  

Congratulations! You deployed your first microservice, and it is ready to store its data in a RabbitMQ queue.

Deploy the Metrics Collector Storage Microservice

1) Deploy Metrics Collector Storage Microservice.

Visit the Metrics Collector Storage Microservice GitHub page and click the Deploy to Bluemix button. If you have a pre-existing Bluemix account, enter your credentials when prompted; otherwise you will be guided through the process of signing up for a new account.

2) Configure Metrics Collector Storage.

Once deployed, you need to set up the same environment variables (QUEUE_TYPE, RABBITMQ_URL) as you did for the Metrics Collector Microservice. Open your new metrics-collector-storage-microservice app and add those two variables, just like you did in the preceding section.

When you save, the app restarts and outputs some logs:

2016-03-07T12:40:26.371-0000 Queue mode: rabbit_queue  
2016-03-07T12:40:26.382-0000 Database mode: stdout  
2016-03-07T12:40:26.736-0000 Listening on 61566  
2016-03-07T12:40:27.040-0000 Connected to RabbitMQ - mcqueue  

Notice that it says Database mode: stdout. This is the default behaviour of the Storage Microservice; it writes any data it receives to the console.

Configure the Metrics Collector Storage Microservice to write to MongoDB

Now we tell the microservice to write its data to MongoDB instead of the console:

1) In your metrics-collector-storage-microservice app, add the environment variable DATABASE_MODE and give it the value mongodb.

bluemix env variables2

2) Attach a MongoDB by Compose Bluemix service to your Storage Microservice app.

In Bluemix, open your Storage Microservice app and click Add a Service or API. Pick MongoDB by Compose and add the host, port, username and password options from your Compose MongoDB credentials.

Bluemix has two types of service:

The MongoDB by Compose Bluemix service is only loosely coupled; it serves just as a structured way of defining the credentials of your Compose MongoDB instance.

bluemix mongodb service
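The four credentials entered above are the same pieces that make up a standard MongoDB connection URI. The sketch below shows one way they fit together; the function name and the example values are hypothetical, and how the Storage Microservice actually assembles its connection is not shown in this post.

```javascript
// Assemble a MongoDB connection URI from separate credential fields.
// The username and password are percent-encoded so that characters such
// as '@' or '/' cannot be mistaken for URI delimiters.
function buildMongoUri({ host, port, username, password, database }) {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `mongodb://${user}:${pass}@${host}:${port}/${database}`;
}

// Hypothetical credentials; 'mc' is the database name created earlier.
const uri = buildMongoUri({
  host: 'db.example.com',
  port: 10000,
  username: 'mc_user',
  password: 'p@ss/word',
  database: 'mc'
});
console.log(uri);
```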

Once the application restages, your logs should look like this:

2016-03-07T15:00:32.961-0000 Queue mode: rabbit_queue  
2016-03-07T15:00:32.973-0000 Database mode: mongodb  
2016-03-07T15:00:33.142-0000 Connecting to MongoDB instance on sl-us-dal-9-portal.1.dblayer.com  
2016-03-07T15:00:33.437-0000 Listening on 61909  
2016-03-07T15:00:33.677-0000 Connected to RabbitMQ - mcqueue  
2016-03-07T15:00:33.884-0000 Connected to MongoDB  

Just add data

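Now we have a RabbitMQ queue and a MongoDB database running on Compose, and the Metrics Collector and Storage microservices running on Bluemix.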

All we need now is to add some data! The Metrics Collector Microservice is a web service with a GET /tracker API endpoint. Any parameters that are sent to the /tracker endpoint are combined into an object and become the payload that is sent to the queue.

e.g.

curl "http://MYAPP.mybluemix.net/tracker?a=1&b=2&c=x&c=y"

where MYAPP is the subdomain of your Metrics Collector Microservice instance. It stores an object like this:

{ 
  a: 1, 
  b: 2, 
  c: [ 'x', 'y' ], 
  ip: '212.44.33.22' 
}
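The way repeated parameters collapse into an array can be sketched with Node's built-in URL class. This is not the microservice's own code: paramsToPayload is a hypothetical name, and the sketch leaves every value as a string, whereas the object above shows numbers, so the real service presumably does some extra conversion.

```javascript
// Combine the query-string parameters of a request URL into one object.
// A key that appears once keeps a single value; a repeated key becomes
// an array of its values, in the order they were sent.
function paramsToPayload(requestUrl, ip) {
  const { searchParams } = new URL(requestUrl);
  const payload = {};
  for (const key of new Set(searchParams.keys())) {
    const values = searchParams.getAll(key);
    payload[key] = values.length === 1 ? values[0] : values;
  }
  payload.ip = ip; // the caller's address is added alongside the parameters
  return payload;
}

const payload = paramsToPayload(
  'http://MYAPP.mybluemix.net/tracker?a=1&b=2&c=x&c=y',
  '212.44.33.22'
);
console.log(payload);
```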

If you look at the logs your Storage Microservice is generating in the Bluemix dashboard, you see the data arriving and being saved in batches:

2016-03-07T16:25:58.49+0000 [App/0]   OUT { a: 1, b: 2, c: [ 'x', 'y' ], ip: '109.148.153.111' }  
2016-03-07T16:25:59.27+0000 [App/0]   OUT written { documents: 1, total: 1 }  
2016-03-07T16:26:00.32+0000 [App/0]   OUT { a: 1, b: 2, c: [ 'x', 'y' ], ip: '109.148.153.111' }  
2016-03-07T16:26:02.18+0000 [App/0]   OUT { a: 1, b: 2, c: [ 'x', 'y' ], ip: '109.148.153.111' }  
2016-03-07T16:26:09.25+0000 [App/0]   OUT written { documents: 2, total: 3 }  
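The "written" lines suggest that documents are buffered and saved several at a time. A minimal sketch of that idea follows, assuming a hypothetical BatchWriter class and a caller-supplied writeMany function; the real Storage Microservice may batch its writes quite differently.

```javascript
// Buffer incoming documents and persist them in batches.
class BatchWriter {
  // writeMany: async function that saves an array of documents.
  constructor(writeMany) {
    this.writeMany = writeMany;
    this.buffer = [];
    this.total = 0;
  }

  // Called for every message that arrives from the queue.
  add(doc) {
    this.buffer.push(doc);
  }

  // Save whatever has accumulated; returns null when there is nothing to do.
  async flush() {
    if (this.buffer.length === 0) return null;
    const batch = this.buffer;
    this.buffer = []; // new arrivals go into a fresh buffer while we write
    await this.writeMany(batch);
    this.total += batch.length;
    return { documents: batch.length, total: this.total };
  }
}
```

In a running service, flush() could be called on a timer, for example with setInterval, so that a burst of messages costs one database round trip instead of many.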

Instrumenting a website

Now that our infrastructure is built and tested, we can use the Metrics Collector Microservice for its original purpose: to record website metrics. The microservice hosts a small JavaScript file (piwik.js) which you can use in your own HTML pages:

<html>  
 <body>
   <!-- Piwik -->
   <script type="text/javascript">
     var _paq = _paq || [];
     _paq.push(['trackPageView']);
     _paq.push(['enableLinkTracking']);
     (function() {
       var u="//MYAPP.mybluemix.net/";
       _paq.push(['setTrackerUrl', u+'tracker']);
       _paq.push(['setSiteId', "mysite"]);
       var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
       g.type='text/javascript'; g.async=true; g.defer=true; g.src=u+'piwik.js'; s.parentNode.insertBefore(g,s);
     })();
   </script>
   <!-- End Piwik Code -->
   <h1>Hello World</h1>
 </body>
</html>  

Again, you need to replace MYAPP with the subdomain specific to your Metrics Collector Microservice instance.

Whenever the tracked webpage is viewed, or any links on that page are clicked, the tracker API is called and a document is queued. You can also use the Piwik library to record custom events at any point in the lifecycle of your web page. See the Piwik API for details.
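As an example of a custom event, Piwik's trackEvent call takes a category, an action and an optional name. The sketch below pushes one onto the same _paq queue used in the page snippet; the trackSignup helper and the 'Signup', 'Submit' and 'free' values are made up for illustration.

```javascript
// Reuse the page's _paq queue if it exists, otherwise start a new one.
var _paq = _paq || [];

// Hypothetical helper: record a sign-up as a custom event.
function trackSignup(plan) {
  // trackEvent arguments: category, action, name (optional)
  _paq.push(['trackEvent', 'Signup', 'Submit', plan]);
}

trackSignup('free');
```

Because commands are only queued in _paq, the helper can safely be called before piwik.js has finished loading.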

The queued objects are consumed by the Storage Microservice and saved to MongoDB where they look like this:

{
  _id: ObjectId("56de9aa3206e251d00434e96"), 
  action_name: "",
  idsite: "mysite",
  rec: 1,
  r: 253487,
  h: 9,
  m: 25,
  s: 51,
  url: "http://localhost:8000/glynn.html",
  id: "772aa0d070215d3b",
  idts: 1448553217,
  idvc: 8,
  idn: 0,
  refts: 0,
  viewts: 1457429152,
  cs: "windows-1252",
  send_image: 0,
  pdf: 1,
  qt: 0,
  realp: 0,
  wma: 0,
  dir: 0,
  fla: 1, 
  ava: 0,
  gears: 0,
  ag: 0,
  cookie: 1,
  res: "1440x900",
  gt_ms: 12,
  type: "pageView",
  ip: "109.148.153.111"
}

Scaling

By default, our Bluemix apps have only one virtual server for each Microservice, but we can increase the number of servers by simply altering a number in the Bluemix dashboard.

bluemix instance count

Adding more Metrics Collector Microservice instances lets us add capacity and resilience to our web-facing data collection app. Adding more Storage Microservice instances allows the work of storing the data to be shared among however many servers we are running. RabbitMQ queues automatically share out the workload between connected clients. If a client crashes, then the work is taken up by the other instances.
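The sharing behaviour can be pictured as dealing messages out to workers in turn and skipping any worker that is no longer connected. The simulation below is a deliberate simplification with hypothetical names; real RabbitMQ also relies on acknowledgements and redelivery, which are left out here.

```javascript
// Deal messages out to the live workers in turn (round-robin).
function distribute(messages, workers) {
  const assignments = new Map(workers.map((w) => [w.name, []]));
  const alive = workers.filter((w) => w.alive);
  if (alive.length === 0) throw new Error('no workers available');
  messages.forEach((message, i) => {
    const worker = alive[i % alive.length];
    assignments.get(worker.name).push(message);
  });
  return assignments;
}

const workers = [
  { name: 'storage-0', alive: true },
  { name: 'storage-1', alive: true },
  { name: 'storage-2', alive: false } // simulated crash
];

// Four messages are split evenly between the two surviving workers.
const result = distribute([1, 2, 3, 4], workers);
console.log(result);
```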

Why microservices?

It may seem to be a complicated installation for this relatively simple use-case, but microservices offer several advantages over monolithic applications:

Wrapping it up

We've seen how a microservice approach allows us to decouple the producers of data from the consumers of it. RabbitMQ makes an excellent message hub, storing data in a queue or pubsub channel before consuming microservices deal with it. Each microservice we create is very simple, typically performing one easily-testable task, and the nature of RabbitMQ queues means that work is distributed fairly among the connected consumer microservice servers.

Although the Metrics Collector Microservice is a simple example built to demonstrate the principle of microservices, it points to a flexible, scalable architecture that offers many advantages over monolithic design.