Building better database bridges with the new Compose Transporter

TL;DR: The updated Transporter tool has a quicker way to get started, a completely rebuilt Elasticsearch adaptor, an updated MongoDB adaptor, a new PostgreSQL adaptor, and lots of internal engineering improvements.

The Compose Transporter is an open-source project from Compose, designed to bring some agility to the world of heavy-set ETL tools. It is a compact and focused tool for moving records and documents from a data source, processing them, and delivering them to a destination. The first editions of Transporter powered users' data transfers and our own backend import functions for some databases. Now, we move on to the next phase...

There's a new release of Compose's open-source Transporter project, 0.2.1, which sets out to strengthen the connections between databases and improve the command-line user experience. In this release, the work has focussed on getting the MongoDB and Elasticsearch adaptors to work better across all versions of MongoDB and Elasticsearch. There's also a new adaptor for PostgreSQL, so you can now read data from, or write data to, the first Transporter-supported SQL database.

The important thing to know about this update is that it is the first fruit of the active redevelopment of the Transporter. In the not-too-distant future, there will be a range of internal and external changes as the Transporter platform is refined to offer an all-round better experience. If you want to help shape that future, there's never been a better time to start working with the Transporter.

A Quicker Starter

With the new init command, you no longer need to find and copy the appropriate parts of configuration files. Run transporter init source-adaptor sink-adaptor and the Transporter will generate the config files for the named adaptors. Don't know what the adaptors are called? Running transporter about will list the available ones for you. With this addition, the Transporter's executable becomes more self-sufficient, with no collateral files needed.

Enhancing Elasticsearch

The combination of MongoDB and Elasticsearch has proved to be, by far, one of the most popular use cases for the Transporter. People like the idea of moving their document data from MongoDB to Elasticsearch for search, of course, and for reporting and analysis. With the Transporter, you can get a continuously updated stream of changes in MongoDB and turn those changes into Elasticsearch updates.
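As a rough illustration of what "turning changes into Elasticsearch updates" can look like, here is a minimal Python sketch. The change-event shape (op, id, doc) and the function name to_bulk_lines are assumptions made for this example and are not the Transporter's own message format. The output follows the action-line plus source-line layout of Elasticsearch's bulk API, with the document type left out for brevity.

```python
import json


def to_bulk_lines(change: dict, index: str) -> list:
    """Translate one change event into bulk API lines (hypothetical event shape)."""
    # Metadata shared by every bulk action for this document.
    meta = {"_index": index, "_id": str(change["id"])}
    if change["op"] in ("insert", "update"):
        # An index action is followed by the document source on the next line.
        return [json.dumps({"index": meta}), json.dumps(change["doc"])]
    if change["op"] == "delete":
        # A delete action has no source line.
        return [json.dumps({"delete": meta})]
    raise ValueError("unsupported operation: %r" % change["op"])


if __name__ == "__main__":
    event = {"op": "insert", "id": 1, "doc": {"name": "Ada"}}
    print("\n".join(to_bulk_lines(event, "users")))
```

Re-indexing the whole document on every update keeps the sketch simple; a real pipeline might prefer partial updates.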

But since the last major Transporter release, Elasticsearch has changed its API in a number of ways, and this made it necessary for the developers at Compose to take a long look at how the adaptor was implemented. The challenge was to make it work not just for the current Elasticsearch but for all the previous versions too. Part of that work involved rewriting the Elasticsearch adaptor and switching to the elastic library, which has multiple versions that each target a specific version of Elasticsearch. The developers have made this transparent to Transporter users by automatically selecting the correct Elasticsearch client on connecting to the database.
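The idea of picking a client based on the server's version can be sketched in a few lines of Python. This is only an illustration of version-based dispatch, not the adaptor's actual code: the client labels, the supported versions in the table, and the function names are all hypothetical.

```python
import re

# Hypothetical registry: Elasticsearch major version -> client label.
# The labels are placeholders, not real Transporter or elastic identifiers.
CLIENTS_BY_MAJOR = {1: "client-v1", 2: "client-v2", 5: "client-v5"}


def major_version(version: str) -> int:
    """Extract the major version number from a string such as '2.4.1'."""
    match = re.match(r"\s*(\d+)\.", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1))


def select_client(version: str) -> str:
    """Return the client label registered for the server's major version."""
    major = major_version(version)
    if major not in CLIENTS_BY_MAJOR:
        raise ValueError("no client registered for Elasticsearch %d.x" % major)
    return CLIENTS_BY_MAJOR[major]


if __name__ == "__main__":
    print(select_client("2.4.1"))
```

In this sketch the version string would come from asking the server on connection, so the user never has to name a client themselves.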

Managing MongoDB

The MongoDB adaptor has also seen improvements and fixes. Using the Bulk operator no longer keeps the Transporter running idle, and MongoDB ids are tracked correctly so failed queries can be reissued. When using the oplog, a contributed fix has got around the problem of all database changes being sent through the Transporter, a problem that only manifested itself in an unexpected edge case: two databases, each with a collection of the same name.
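To see why same-named collections in different databases can cause trouble, consider this small Python sketch. It is not the contributed fix itself, whose details are not covered here. It only shows the general principle of matching oplog entries on their full namespace (the ns field, which holds database.collection) rather than on the collection name alone. The function names and sample entries are made up for the example.

```python
def matches_namespace(entry: dict, database: str, collection: str) -> bool:
    """True only when the entry's full namespace is database.collection."""
    return entry.get("ns") == "%s.%s" % (database, collection)


def filter_oplog(entries: list, database: str, collection: str) -> list:
    """Keep only the oplog entries that belong to the requested namespace."""
    return [e for e in entries if matches_namespace(e, database, collection)]


if __name__ == "__main__":
    # Hypothetical oplog entries: two databases share a "users" collection.
    entries = [
        {"ns": "shop.users", "op": "i", "o": {"_id": 1}},
        {"ns": "blog.users", "op": "i", "o": {"_id": 2}},
        {"ns": "shop.orders", "op": "i", "o": {"_id": 3}},
    ]
    print(filter_oplog(entries, "shop", "users"))
```

A filter that compared only the collection name would let the blog.users entry through as well.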

The most visible change, which is also a breaking change, is that the MongoDB adaptor name has changed from mongo to mongodb in the configuration files. This change is, though, accompanied by a new, simpler setup for SSL connections (ssl: true and an optional cacerts: with an array of filenames for certificates).

These changes come about as a result of work done to restructure the internals of the Transporter, uncoupling the adaptors from the database client implementations below them and making it easier to add future adaptors.

Powering PostgreSQL

The newest arrival in the Transporter is the PostgreSQL adaptor. The adaptor can read from or write to a PostgreSQL database, translating the messages in the Transporter into a template for the columns and values of an SQL table. It sticks to the Transporter philosophy of keeping the adaptors simple. To manipulate or reorganize fields from, for example, a complex JSON document, you can use the Transporter's Transformers, JavaScript adaptors which plug into the Transporter pipeline and give you a whole language to lean on for your data manipulation.
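The two steps described above, reshaping a nested document and then mapping it onto columns and values, can be sketched in Python. This is an illustration under stated assumptions rather than the adaptor's implementation: real Transporter transformers are written in JavaScript, and the flatten and build_insert helpers here, along with the underscore naming of flattened fields, are hypothetical. The statement uses PostgreSQL's $1, $2 positional parameters so values are never spliced into the SQL text.

```python
def flatten(doc: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level, joining keys with underscores."""
    flat = {}
    for key, value in doc.items():
        name = "%s_%s" % (prefix, key) if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_insert(table: str, message: dict):
    """Build a parameterised INSERT and its values from a flat message."""
    columns = sorted(message)  # sorted for a deterministic column order
    column_list = ", ".join(quote_ident(c) for c in columns)
    placeholders = ", ".join("$%d" % i for i in range(1, len(columns) + 1))
    sql = "INSERT INTO %s (%s) VALUES (%s)" % (
        quote_ident(table), column_list, placeholders)
    return sql, [message[c] for c in columns]


if __name__ == "__main__":
    doc = {"id": 1, "name": {"first": "Ada", "last": "Lovelace"}}
    sql, values = build_insert("users", flatten(doc))
    print(sql)
    print(values)
```

Keeping the reshaping separate from the SQL generation mirrors the split between a transformer and a simple adaptor.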

Also...

Other changes include a new standing documentation site, being built in the Wiki at the Transporter GitHub repository, which will actively track Transporter developments as they arrive. There are also new tutorials incoming to get new users up and running as smoothly as possible.

You'll find the latest releases and documentation at the GitHub repository for Transporter. Keep reading Compose Articles for the new tutorial series.


Attribution: Teresita Garit

Dj Walker-Morgan
Dj Walker-Morgan is Compose's resident Content Curator, and has been both a developer and writer since Apples came in II flavors and Commodores had Pets.
