Backups, etcd and etcdtool

Etcd is a fine database for coordination and configuration data but, at least for backup and restore, it makes some design assumptions that aren't a good fit for a hosted service.

That's why we were rather pleased to come across the etcdtool utility. It isn't part of the etcd distribution; it is the work of Michael Persson and can be found at https://github.com/mickep76/etcdtool.

The prime motivation behind etcdtool was to create a version-independent backup and restore utility for etcd, but it has many other capabilities which could make it essential for etcd users who want to automate and validate the population of their etcd deployments. We'll start by moving a tree from one etcd deployment on Compose to another using etcdtool.

Installing

Before we can do anything, we'll need to install it. The README has details on how to build etcdtool from source and how to install it on Mac OS X using Homebrew, and the Releases page offers builds for Linux and Mac OS X, an RPM, and source code. As we're running Mac OS X with Homebrew, we went for the simplest install:

brew tap mickep76/funk-gnarge  
brew install etcdtool  

Commanding the line

And we are ready to go. Before we back up, let's use etcdtool to take a look at our data:

$  etcdtool --ca ./etcdcert.crt --peers https://host.1.dblayer.com:10068,https://host.2.dblayer.com:10051 -u root tree /
Password:  
/
└── config/
    └── server-15/
        └── database

2 directories, 1 dirs  

Let's step through the command and its parameters at this point. The --ca ./etcdcert.crt parameter tells the tool to use the self-signed certificate file that Compose makes available on the etcd console; see this article for more about that. The next parameter, --peers https://host.1.dblayer.com:10068,https://host.2.dblayer.com:10051, tells etcdtool where to find the cluster's nodes, in the same style as the etcdctl command does. Compose users will find that information on the Overview tab of their etcd deployment's console. They'll also find the username (usually root) and password for the etcd user on that page; the username goes into the user parameter next, -u root. That covers the boilerplate parameters for connecting. The last part of our command is the subcommand to run. In this case, tree / generates a tree view of the etcd hierarchy of keys, but doesn't show values.

When we hit return, we're prompted for the password that goes with the "root" user, which was displayed on the Compose console. Once that's entered, we get our tree view. The command is pretty clumsy to work with, so let's make things more readable and alias it.

$ alias etcdtool-s1="etcdtool --ca ./etcdcert.crt --peers https://host.1.dblayer.com:10068,https://host.2.dblayer.com:10051 -u root"

Now we can just say etcdtool-s1 for that server.
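
In bash, aliases are not expanded in non-interactive scripts by default, so if you plan to call the tool from a script, a shell function is a sturdier equivalent. The following is a minimal sketch along those lines; the function name etcdtool_s1 is our own invention, while the certificate path, peers and user are the ones from the command above.

```shell
# Hypothetical wrapper function, equivalent to the etcdtool-s1 alias.
# The name etcdtool_s1 is an assumption; the connection flags are the
# same ones used in the command above.
etcdtool_s1() {
    etcdtool --ca ./etcdcert.crt \
        --peers https://host.1.dblayer.com:10068,https://host.2.dblayer.com:10051 \
        -u root "$@"
}
```

Anything passed to the function is appended after the connection flags, so etcdtool_s1 tree / behaves like the aliased command.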

Rare exports

To export the contents of etcd, we use the export command and tell it what directory we want exported:

$ etcdtool-s1 export /                                                          
Password:  
{
  "config": {
    "server-15": {
      "database": "mongodb"
    }
  }
}

Sending this to stdout isn't very useful, but it does show us what a version-independent export looks like. Don't try to redirect the output; that password prompt will just confuse things. We need an option to send the output to a file. All the subcommands support --help to display flags and options and don't need a connection, so we can do this:

$ etcdtool export --help                                                        

NAME:  
   etcdtool export - export a directory

USAGE:  
   etcdtool export [command options] [arguments...]

OPTIONS:  
   --sort, -s       returns result in sorted order
   --format, -f "JSON"  Data serialization format YAML, TOML or JSON [$ETCDTOOL_FORMAT]
   --output, -o     Output file
$

We can see that we can specify an output file and a format. We'll stick with JSON for now, but we'll send the output to export.json by running:

etcdtool-s1 export / -o export.json
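
Before handing an exported file to an import, it can be worth confirming that it really is well-formed JSON, for example to catch an empty or truncated file. This is a small sketch of one way to do that, not part of etcdtool: is_valid_json is a hypothetical helper, and it assumes python3 is available on the PATH.

```shell
# Hypothetical helper: succeed only if the given file parses as JSON.
# Assumes python3 is on the PATH; json.tool is part of its standard library.
is_valid_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}
```

Running is_valid_json export.json returns a zero exit status for a parseable file and a non-zero one otherwise, so it slots neatly into an if or && chain.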

As we have a JSON file exported, we can now look at importing the data into another etcd deployment. And here's one I prepared earlier.

alias etcdtool-s2="etcdtool --ca ./server2-etcdcert.crt --peers https://host.1.dblayer.com:10181,https://host.2.dblayer.com:101422 -u root"  

As you can see, this other server has a different certificate and different ports. We’ll proceed by supplying the import command with the JSON data file and the destination in the new etcd hierarchy.

$ etcdtool-s2 import / export.json                                              
Password:  
Do you want to overwrite data in directory: / [yes/no]? yes  
$ etcdtool-s2 tree /                                                            
Password:  
/
└── config/
    └── server-15/
        └── database

2 directories, 1 dirs  
$

And we've rebuilt the tree on a new server.
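
The tree view only shows keys, so to check that the values made it across too, one option is to export the same directory from both deployments and compare the two files. The sketch below assumes python3 is on the PATH and normalises key order before comparing; normalize_json and exports_match are hypothetical helper names, not etcdtool features.

```shell
# Hypothetical helpers for comparing two etcdtool JSON exports.
# Assumes python3 is on the PATH; json.tool and its --sort-keys flag
# ship with the Python standard library (3.5 or later).

# Print the file as JSON with object keys in sorted order.
normalize_json() {
    python3 -m json.tool --sort-keys "$1"
}

# Return 0 if both files hold the same JSON data, 1 if they differ,
# and 2 if either file could not be parsed.
exports_match() {
    left="$(normalize_json "$1")" || return 2
    right="$(normalize_json "$2")" || return 2
    [ "$left" = "$right" ]
}
```

With an export from each deployment saved to, say, s1.json and s2.json (file names chosen here purely for illustration), exports_match s1.json s2.json succeeds only when the two hold the same data.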

Recovering Compose backups

Compose backups are snapshots of the data in etcd. Currently, from outside the cluster, they can't be restored directly. That's because the etcd developers expect restoration to be carried out by moving files around on the file system.

If you download an etcd backup, you will find all your data is there. We've just downloaded a backup and extracted its contents:

$ tar xvf 2016-04-17_10-52-03_utc.tar                                             
x etcd/  
x etcd/member/  
x etcd/member/snap/  
x etcd/member/snap/0000000000000003-00000000009e394f.snap  
x etcd/member/wal/  
x etcd/member/wal/0000000000000000-0000000000000000.wal  
$   

We can bring this data to life by running a local copy of etcd. Do ensure you are running the same version of etcd as your Compose deployment, or a later one. We run etcd in the directory where we extracted the backup like so:

etcd --data-dir=./etcd --force-new-cluster

This starts a single instance of the database locally, which we can now talk to with etcdtool. No parameters are needed for etcdtool as this is a local instance:

$ etcdtool tree /                                                               
/
└── config/
    ├── server-15/
    │   └── database
    └── server15/
        └── test

3 directories, 2 dirs  

We can now export that data as JSON using etcdtool export / -o backupexport.json and use etcdtool import to load it into any other etcd server or deployment.
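
The advice above about running the same or a later etcd version locally can be turned into a quick guard in a restore script. This is a sketch only: version_at_least is a hypothetical helper, it assumes plain dotted numeric versions such as 2.3.1, and both version strings are supplied by you (the local one can be read from the output of etcd --version).

```shell
# Hypothetical helper: succeed if version $1 is the same as or later
# than version $2. Assumes dotted, purely numeric versions (e.g. 2.3.1);
# a missing component is treated as 0.
version_at_least() {
    have="$1"
    need="$2"
    while [ -n "$have" ] || [ -n "$need" ]; do
        h="${have%%.*}"   # leading component of each version
        n="${need%%.*}"
        [ "${h:-0}" -gt "${n:-0}" ] && return 0
        [ "${h:-0}" -lt "${n:-0}" ] && return 1
        # Drop the component just compared and move on to the next one.
        case "$have" in *.*) have="${have#*.}" ;; *) have="" ;; esac
        case "$need" in *.*) need="${need#*.}" ;; *) need="" ;; esac
    done
    return 0
}
```

For example, version_at_least 2.3.1 2.2.5 succeeds, while version_at_least 2.2.5 2.3.0 fails, which a script could use to stop before starting etcd against the extracted data.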

Editing with...

We've been focussed on the export and import functionality of etcdtool in this article but there's more to etcdtool than that. For example, to quickly edit your etcd data as JSON (or YAML or TOML, the two other supported formats), try etcdtool edit. This takes a directory in the store as a parameter. It then exports that to the required format and starts up an editor where we can make changes. On exiting the editor we are offered the chance to overwrite that directory with our changes. This is very handy for the trickier changes and a good alternative to firing off multiple etcdctl commands.

Finally, it's worth mentioning that etcdtool can validate data against JSON schemas on import, using a configuration file to map schemas to the tree. You'll find examples of this feature's use in the GitHub repository but, as yet, little documentation.

Wrapping up

etcdtool is simple but powerful and works well with Compose to solve the problem of converting snapshots into re-usable data.