Transporter 0.0.4 - More control

When we released Transporter as open source, we knew it was a work in progress that we'd keep evolving. Well, the first evolutionary step has just happened with version 0.0.4 of the Transporter. This update is more of a preview than a release - if you need binaries, we still recommend you get the 0.0.3 release. We're letting this preview out now because we're making some breaking changes and Transporter adopters will need to be prepared. But first, what's new and cool in 0.0.4.

RethinkDB enhancements

There's a new RethinkDB source adaptor, which means you can use RethinkDB as the starting point for your data's journey. Thanks to Andy Lindeman, who contributed that and other fixes! The RethinkDB adaptor has also been updated for the new RethinkDB driver, and you can now set a timeout on its connections.

SSL comes to the MongoDB Adaptor

Another adaptor that got a timeout option is the MongoDB adaptor. It also got an updated driver library but most importantly, the MongoDB adaptor now has support for SSL connections. The SSL support can only be activated by adding "ssl: {}" into the JavaScript used to configure the connection like so:

Source({name:"localmongo"}).save({name:"mongodb", ssl:{} });  

With an empty ssl map, the connection's certificates will be unverified, but it will allow you to quickly connect to an SSL-enabled MongoDB like our own MongoDB+. If you really need to put in a certificate to verify against, you'll need to add it as the value of a key within that ssl map like so:

Source({name:"localmongo"}).save({name:"mongodb", ssl:{ first:"---- a very long string ----" } });  

Delete operations have been enabled for the MongoDB write adaptor too. This allows deletions in other databases to be reflected by Transporter in a MongoDB database. Do note that for this to work, you should not modify the _id field of the documents.

Other Changes

We've added a JSONlog to Transporter's logging system - it's currently used to aid integration. We've also disabled the InfluxDB adaptor for now, as the driver changed in InfluxDB 0.9. We'll be bringing it back, but there's no timetable for that yet.

Developers will also find that we now use vendoring with godeps to ensure that the libraries you need are the libraries you get and use. These are all useful changes, but there's one change all Transporter users need to be aware of...

The Breaking Change

The big breaking change in 0.0.4 is that we've changed the message format. That's the unit of data that's passed from the source to the transformers and on to the destination or destinations. It is also the document that a transformer gets to work on.

What the Transporter previously sent as a message included a hidden timestamp and operation (insert, delete, update) which was used by the destinations to decide how to save, or act on, the message. The visible part of the message was a JSON map of key/values which represented the data of the document.

The problem is, as we found, that transformers needed to know what the operation was to make a well-informed decision about how to process the data. This meant we had to make that information visible.

So, from version 0.0.4 onwards, the message is a JSON map with three keys. The data key contains the document data - it's the same as what was previously visible. The op key contains either "insert", "update" or "delete" to represent the operation, and finally ts contains a timestamp representing when that message was introduced into the Transporter.
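
As a rough sketch of that shape, here is what a message handed to a transformer might look like. The field values are made up, and the numeric form of ts is an assumption for illustration rather than something taken from the Transporter source; describe is a hypothetical helper that only reads the three documented keys.

```javascript
// Hypothetical example of a 0.0.4-style message as a transformer would see it.
// The values are invented; the numeric ts is an assumption, not a documented format.
var exampleMsg = {
    data: { id: 42, first_name: "Ada", last_name: "Lovelace", email: "ada@example.com" },
    op: "insert",
    ts: 1433116800
};

// Hypothetical helper: summarises a message using only the data, op and ts keys.
function describe(msg) {
    return msg.op + " of " + Object.keys(msg.data).length + " field(s) at " + msg.ts;
}

var summary = describe(exampleMsg); // "insert of 4 field(s) at 1433116800"
```

The point to take away is that the document now lives one level down, under data, with op and ts alongside it.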

What does that mean in practice? Let's take an example transformer from Transporter Driving Part One:

module.exports = function(doc) {  
    return _.pick(doc, ["id", "first_name", "last_name", "email"])
}  

For clarity, we'll change the name of the variable passed in to msg, like so:

module.exports = function(msg) {  

Where we were previously able to reshape and return the entire document, we now need to be more precise. In this case, we only want to work on msg.data:

    msg.data = _.pick(msg.data, ["id", "first_name", "last_name", "email"]);

And now, rather than returning the record, we return the msg that was passed in.

    return msg;
}

You could also make your own document and assign it to msg.data like so:

module.exports = function(msg) {
    // Declare newdoc with var so it doesn't leak into the global scope
    var newdoc = _.pick(msg.data, ["id", "first_name", "last_name", "email"]);
    msg.data = newdoc;
    return msg;
}
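
If you have several existing transformers to convert, the same pattern can be captured once. What follows is only a sketch: wrapLegacy and keepContactFields are hypothetical names, not part of Transporter, and the old-style transformer is written without underscore so the example stands alone.

```javascript
// Hypothetical helper: adapts a pre-0.0.4 transformer (document in, document out)
// to the 0.0.4 message format (msg in, msg out). Not part of Transporter itself.
function wrapLegacy(legacyTransform) {
    return function(msg) {
        // Only the document changes; op and ts are passed through untouched.
        msg.data = legacyTransform(msg.data);
        return msg;
    };
}

// An old-style transformer that keeps a fixed set of fields.
function keepContactFields(doc) {
    var wanted = ["id", "first_name", "last_name", "email"];
    var out = {};
    for (var i = 0; i < wanted.length; i++) {
        if (wanted[i] in doc) {
            out[wanted[i]] = doc[wanted[i]];
        }
    }
    return out;
}

var transformer = wrapLegacy(keepContactFields);
// In a transformer file you would then assign: module.exports = transformer;
```

This keeps the old logic intact, but it also means the wrapped function never sees op or ts, so it suits transformers that treat every operation the same way.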

The advantage of the new arrangement is that transformer scripts will now be able to see a set of metadata previously invisible to them. For example, if we wanted to only copy a particular field over at insert time and never update it we could do:

module.exports = function(msg)  {  
    var newdoc={};
    if(msg.op=="insert") {
        newdoc=_.pick(msg.data, ["id", "first_name", "last_name", "email","registration-date"]);
    } else if(msg.op=="update") {
        newdoc=_.pick(msg.data, ["id", "first_name", "last_name", "email"]);
    } else {
        newdoc=_.pick(msg.data, ["id"]);
    }
    msg.data=newdoc;
    return msg;
}  
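
One way to check a transformer like this before pointing it at real data is to feed it hand-made messages outside Transporter. The sketch below does that under a few assumptions: pick is a minimal stand-in for underscore's _.pick, sample builds an invented document, and we assume a delete message still carries the document's fields, which may not match what a given source adaptor actually sends.

```javascript
// Minimal stand-in for underscore's _.pick so this runs outside Transporter.
function pick(obj, keys) {
    var out = {};
    keys.forEach(function(k) {
        if (k in obj) {
            out[k] = obj[k];
        }
    });
    return out;
}

// Same logic as the transformer above, using the stand-in pick.
function transform(msg) {
    var fields;
    if (msg.op == "insert") {
        fields = ["id", "first_name", "last_name", "email", "registration-date"];
    } else if (msg.op == "update") {
        fields = ["id", "first_name", "last_name", "email"];
    } else {
        fields = ["id"];
    }
    msg.data = pick(msg.data, fields);
    return msg;
}

// Hypothetical sample message; the document contents and ts value are invented.
function sample(op) {
    return {
        op: op,
        ts: 0,
        data: {
            id: 7,
            first_name: "Grace",
            last_name: "Hopper",
            email: "grace@example.com",
            "registration-date": "2015-06-01",
            notes: "internal"
        }
    };
}

// Collect the surviving field names for each operation.
var results = ["insert", "update", "delete"].map(function(op) {
    return Object.keys(transform(sample(op)).data);
});
```

Here results holds three lists of field names: five for the insert, four for the update, and just id for the delete.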

But you will have to change existing scripts. That's why we've only released source code for 0.0.4, so that developers and system integrators can get an early handle on where we are going with Transporter.