How to get backups with the Compose API and Node.js

Automatic backup retrieval is now possible with Compose's API and here we'll show you how to use it from Node.js. If you want to archive your own backups, start here.

The latest addition to the Compose API is new endpoints to list backups for deployments, view details about backups, start on-demand backups and get the information you need to retrieve them. To understand the endpoints, it's probably easiest to show them in use, so we're going to create a command-line Node.js application, backupomat. To make it an easier read, we'll also be doing it in ES6 JavaScript, with promises and some handy modules like node-fetch and yargs.

JavaScript preambles

Let's kick off with the inevitable preamble. Our app, backupomat.js, starts with this:

#!/usr/bin/env node
'use strict';  
const yargs = require('yargs');  
const fetch = require('node-fetch');  
const fs = require('fs');  

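That will bring in the yargs command-line parsing package, the node-fetch package and Node's built-in fs module. You'll want to npm install the first two. Note that require() works with node-fetch 2.x; version 3 and later are published as ES modules only.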

API essentials

To make use of the Compose API, you'll need two things: the base URL for all the endpoints and an API token. The first one is easy: it's https://api.compose.io/2016-07. The 2016-07 is the version number. The second one you get by going to your Compose console, selecting Accounts, then API Tokens, and generating a token there. Save it somewhere safe, as that token grants the holder the same privileges as your Compose account has. We recommend you at least keep it out of your code in, say, an environment variable called COMPOSEAPITOKEN. Let's add a reference to that into our code:

let apibase = "https://api.compose.io/2016-07";  
let apitoken = process.env.COMPOSEAPITOKEN;  
let apiheaders = {  
                "Authorization": "Bearer " + apitoken,
                "Content-Type": "application/json"
            };

The last step here is preparing some headers for the HTTP requests which pass the apitoken to the system and tell the API to deal in JSON results. Now we're ready to start accessing the API.
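If the environment variable is not set, apitoken is undefined and every request would carry the header "Bearer undefined". Here is a minimal sketch of a guard against that, assuming a hypothetical helper called buildHeaders that is not part of the original backupomat.js:

```javascript
// Hypothetical helper (not in the original program): build the request
// headers, refusing to continue if no API token was supplied.
function buildHeaders(token) {
    if (typeof token !== 'string' || token.trim() === '') {
        throw new Error('COMPOSEAPITOKEN is not set');
    }
    return {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/json"
    };
}
```

With that in place, the headers could be created with buildHeaders(process.env.COMPOSEAPITOKEN), so a missing token fails at startup rather than as a rejected request later on.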

It starts with a Deployment ID

The key to accessing Compose deployments through the API is knowing the deployment id of the database you want to access. You can find that in a deployment's Settings view, but we're at the API and there is an endpoint just for that: GET 2016-07/deployments. This returns information about all your deployments; the API documentation describes the response in full. We'll start building a function to retrieve that here:

let showDeployments = () => {  
    fetch(`${apibase}/deployments/`, { headers: apiheaders })

There's a lot packed into two lines here. The let assigns an arrow function to a variable name. The function doesn't take any arguments in this case. The actual function starts by invoking fetch from node-fetch. This package uses promises and takes care of a lot of the fiddly bits around retrieving and posting to APIs and websites. Here, the fetch call is being given the "deployments" endpoint to GET in the first argument. The back-quoted string enables embedded variables in strings, so the apibase variable will be expanded into that URL. Then the second argument passes the headers we defined earlier as one of the options for the fetch call. The fetch call returns a promise, so we add our next step, once we have results, as a .then() function:

        .then(function (res) {
            return res.json();
        })

This bit of code waits for the result to come back and then calls another handy fetch feature that decodes the result into JSON. We can then pick up that result in another .then() wrapped function:

         .then(function (json) {
            let deployments = json["_embedded"].deployments;
            for (let deployment of deployments) {
                console.log(`${deployment.id} ${deployment.type} ${deployment.name}`);
            }
        })

Our function here is being passed the results of the deployments API call. If you look at the documentation you'll see the results are wrapped in an "_embedded" element as part of complying with JSON+HAL specifications. The first thing we do, therefore, is skip right through that and down to the array with the key "deployments".

When we have that, we then iterate through the deployments and print out the id, database type and name for each. And we're done. Well, almost done...

        .catch(function (err) {
            console.log(err);
        });
}

Adding this to the end will catch errors and log them. Now we're done. If we ran this function we would get output something like this:

55f694344d847d005d000009 rethink Rethinkery  
56aa59a641380c0010000000 mongodb WiredForTigers  
56b9c68525c90d001e000000 mongodb Exemplum  
56fbb5d69a6ac0001e000004 elastic_search latest-elasticsearch  
5702ad373102eb0017000000 postgresql Sweet95  
569fa2e8f3e11a2b4200000f redis raymond-redisston  

Now we can find deployment ids, and we've pretty much covered the archetype for making an API request, so we can move on to looking for those deployments' backups.
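Since every request follows the same fetch, decode, handle shape, the pattern can be pulled into one place. The following is only a sketch: checkStatus and apiGet are hypothetical names, and the fetch function is passed in as fetchImpl (node-fetch in the real program) so the example does not depend on a particular HTTP library. It also adds one thing the walkthrough leaves out, a check of the response's ok flag, so that an HTTP error such as a 401 from a bad token is reported as an error instead of being read as results.

```javascript
// Hypothetical helper: reject responses whose HTTP status signals an error.
// fetch responses expose `ok` (true for 2xx) and `status`.
function checkStatus(res) {
    if (!res.ok) {
        throw new Error(`API request failed with HTTP ${res.status}`);
    }
    return res;
}

// Hypothetical helper: the GET-then-decode pattern in one function.
// fetchImpl is the fetch function to use, url the full endpoint URL.
function apiGet(fetchImpl, url, headers) {
    return fetchImpl(url, { headers: headers })
        .then(checkStatus)
        .then((res) => res.json());
}
```

A call such as apiGet(fetch, `${apibase}/deployments`, apiheaders) would then return a promise for the decoded JSON, leaving each command to add only its own .then() and .catch().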

There's a lot of backups

You'll find there are quite a few backups associated with any Compose deployment: 7 days of daily backups, 4 weeks of weekly backups and 3 months of monthly backups. To find out what is available, we need to use the GET 2016-07/get-deployment-backups endpoint. This takes a deployment id as a parameter in the path. Here's the start of our function:

let listBackups = (deploymentid, options) => {  
    fetch(`${apibase}/deployments/${deploymentid}/backups`, { headers: apiheaders })
        .then(function (res) {
            return res.json();
        })

This is similar to the previous fetch, but notice we are now embedding a deployment id variable into the URL; that's being passed in to the function. This will return us the JSON document of backups for that deployment which, if you consult the documentation, looks like this:

{
  "_embedded": {
    "backups": [{
      "id":"5854018289d50f424e00030b",
      "deployment_id":"5854017d89d50f424e00002c",
      "name":"test-deployment-2_2017-01-10_19-41-12_utc_daily",
      "type":"daily",
      "status":"complete"
    },
    ...

Here we can see it's another array, with the key backups wrapped in a JSON+HAL _embedded. The fields include the backup id and deployment_id, root name of the backup, type (daily, weekly, monthly or ondemand) and status (either running or complete). We just need to print some of these out so let's do that:

        .then(function (json) {
            let backups = json["_embedded"].backups;
            for (let backup of backups) {
                console.log(`
Backup ID: ${backup.id}  
Type:      ${backup.type}  
Status:    ${backup.status}  
Base Name: ${backup.name}`);  
            }
        })
        .catch(function (err) {
            console.log(err);
        });
}

The back-quoted string in the console.log call is making use of another attribute of ES6 string templates: the ability to have multi-line templates. This code simply loops through so we get a list that looks something like this:

Backup ID: 5892bcd0af0fbe000a000365  
Type:      daily  
Status:    complete  
Base Name: sweet95_2017-02-02_05-00-00_utc_daily

Backup ID: 58941c706d558b000800010f  
Type:      daily  
Status:    complete  
Base Name: sweet95_2017-02-03_06-00-16_utc_daily

Backup ID: 58956e2cebe448000a0003cf  
Type:      daily  
Status:    complete  
Base Name: sweet95_2017-02-04_06-01-16_utc_daily  
...
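With this many backups per deployment, it can help to narrow the list before printing it. As a sketch only, here is a hypothetical filterBackups helper (not part of the original program) that keeps the entries matching a wanted type and/or status. The field names are the ones in the sample response shown earlier; the shape of the wanted argument is an assumption made for this example.

```javascript
// Hypothetical helper: narrow a backups response to the entries that match
// the requested type and/or status. A criterion left undefined matches all.
function filterBackups(json, wanted) {
    const embedded = json["_embedded"] || {};
    const backups = embedded.backups || [];
    return backups.filter((backup) =>
        (wanted.type === undefined || backup.type === wanted.type) &&
        (wanted.status === undefined || backup.status === wanted.status));
}
```

For example, filterBackups(json, { type: "weekly", status: "complete" }) would return only the finished weekly backups, which could then be printed with the same loop as before.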

More Information

There's one thing you can't see in the list of backups and that's the download link for each one. To get that, you have to query a single backup. This is done with the 2016-07/get-deployment-backups-id endpoint, which adds the backup id to the path to get the singular backup data. Let's make an about function for backups that uses this:

let aboutBackup = (deploymentid, backupid, options) => {  
    fetch(`${apibase}/deployments/${deploymentid}/backups/${backupid}`, { headers: apiheaders })
        .then(function (res) {
            return res.json();
        })
        .then(function (backup) {
            console.log(`Backup ID: ${backup.id}
Type:      ${backup.type}  
Status:    ${backup.status}  
Base Name: ${backup.name}  
Download:  ${backup.download_link}`);  
        })
        .catch(function (err) {
            console.log(err);
        });
}

If we run this we could get something like:

Backup ID: 587af3e260bc1a0008000c42  
Type:      weekly  
Status:    complete  
Base Name: sweet95_2017-01-15_04-00-34_utc_daily  
Download:  https://dblayer-backups-postgresql.s3.amazonaws.com/compose-3/5702ad373102eb0017000000/sweet95_2017-01-15_04-00-34_utc_daily.tar.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256...  

The real download link is much longer, as it contains credentials and other information that allow the download within a particular time window. We've shortened it here as the rest isn't useful or pretty. This command is purely for information. Let's put the endpoint to use in a download command.
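Because the link carries credentials, you may not want it printed in full to a terminal or a log. Here is a small sketch of one way to shorten it, assuming (as in the sample output above) that the sensitive parameters live in the query string. The redactLink name is hypothetical.

```javascript
// Hypothetical helper: keep the host and path of the download link but
// replace the query string (the X-Amz-... parameters) with "...".
function redactLink(link) {
    const url = new URL(link);
    return `${url.origin}${url.pathname}?...`;
}
```

Printing redactLink(backup.download_link) instead of the raw value would give output in the shortened form shown above.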

Get Back Up

The first part of getting the backup is to get that download link, so the code looks very similar to what we just did:

let getBackup = (deploymentid, backupid, options) => {  
    fetch(`${apibase}/deployments/${deploymentid}/backups/${backupid}`, { headers: apiheaders })
        .then(function (res) {
            return res.json();
        })

What happens next is important though. First up is checking that we did get a download_link.

        .then(function (backup) {
            if (backup.download_link == null) {
                console.log(`API returned no download link for deployment ${deploymentid}, backup ${backupid}`);
                process.exit(1);
            }

A missing download link is worth checking for. It doesn't mean something is wrong. For example, you may be trying to retrieve an Elasticsearch backup and currently we don't support downloading them. If we do have a download_link though, it's time to retrieve it.

            console.log(`Going to download ${backup.name}`);
            // Returning the inner promise lets the final catch see download errors too
            return fetch(backup.download_link)
                .then((res) => {
                    var dest = fs.createWriteStream(`${backup.name}.tar.gz`, {
                        highWaterMark: Math.pow(2, 16)
                    });
                    res.body.pipe(dest);
                    dest.on('finish', () => {
                        console.log("Done");
                        process.exit(0);
                    })
                });
        })
        .catch(function (err) {
            console.log(err);
        });
}

There's plenty going on here. After printing out a message that we're going to download, we use fetch to open a connection and start that download. We get the result in the then and at that point create a file to save this new stream of data into. Now we could extract the full name from the download_link URL, but we'll take a short cut. We append .tar.gz to the end of the backup's base name. We also bump the highWaterMark option up so that things stream smoothly. All that's left is to take the result's body and pipe it to the destination file... res.body.pipe(dest).

We could be done at that point, but the program wouldn't exit. So we set up a listener that waits for the destination file to emit the "finish" event; when that happens, we print "Done" and exit.
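For completeness, here is a sketch of the route we skipped: taking the file name from the download_link URL instead of appending .tar.gz to the base name. The backupFileName helper is hypothetical, and it assumes that the last path segment of the link is the archive's file name, as it is in the sample link shown earlier. If the link is missing or cannot be parsed, it falls back to the shortcut.

```javascript
const path = require('path');

// Hypothetical helper: use the last path segment of the download link as
// the file name; fall back to "<base name>.tar.gz" if that is not possible.
function backupFileName(backup) {
    try {
        const pathname = new URL(backup.download_link).pathname;
        const name = path.posix.basename(pathname);
        if (name) {
            return decodeURIComponent(name);
        }
    } catch (err) {
        // No link, or not a valid URL: fall through to the shortcut below.
    }
    return `${backup.name}.tar.gz`;
}
```

The result could be passed to fs.createWriteStream in place of the template string, which would keep the saved file's extension in step with whatever the link actually points at.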

Taking Command

Well, we now have some functions and we really want to make them into a useful set of commands. No problem as we already included the yargs package. Yargs is a super useful command line parser. All we need to do is this:

yargs.version("0.0.1")  
    .command("deployments", "List deployments", {}, (argv) => showDeployments())
    .command("list <deploymentid>", "List deployment backups", {}, (argv) => listBackups(argv.deploymentid))
    .command("get <deploymentid> <backupid>", "Get specific backup", {}, (argv) => getBackup(argv.deploymentid, argv.backupid))
    .command("about <deploymentid> <backupid>", "Get specific backup information", {}, (argv) => aboutBackup(argv.deploymentid, argv.backupid))
    .help()
    .argv;

A quick explanation. Each .command() function defines a command which can have required arguments. Yargs parses the command line looking for a match and then builds a hash of the command line, giving the positional arguments their names from the argument list. All that's left to do is call a function which carries out the command and pass it the arguments you want. There are descriptions for commands too, which let the .help() function build a help page:

$  ./backupomat.js start --help                         
Commands:  
  deployments                      List deployments
  list <deploymentid>              List deployment backups
  get <deploymentid> <backupid>    Get specific backup
  about <deploymentid> <backupid>  Get specific backup information

Options:  
  --version  Show version number                                       [boolean]
  --help     Show help                                                 [boolean]

And yes we can run this as a script because we put a #! at the start to run it with the local Node.js runtime. All we had to do was chmod u+x it to make it executable.

So, there we have it, for now. A working command for finding and retrieving backups. In the next part, we'll look at creating on-demand backups and downloading them.

You'll find a repository for this code in compose-ex/backupomat on GitHub.


If you have any feedback about this or any other Compose article, drop the Compose Articles team a line at articles@compose.com. We're happy to hear from you.

Dj Walker-Morgan
Dj Walker-Morgan is Compose's resident Content Curator, and has been both a developer and writer since Apples came in II flavors and Commodores had Pets.