We do backups differently at Compose and we'd like to explain why. Our preferred approach is to take a rapid snapshot of a database's data on disk. With the range of databases we host, though, there are exceptions to that preference, and that means different ways to load a local database with that data.
Why would you want to load a local database with the contents of your production database? Well, you may want to test new code in an isolated environment, do some analysis without affecting your application's response times, or just explore your own data set for anomalies. For some databases, it's also the springboard for rapid export or import into other databases or applications. So we're going to look at all our current databases...
Snapshots and MongoDB
With MongoDB, we use snapshots to create our backups. We do that for a few reasons. Firstly, there's no downtime in taking a snapshot. The image is swiftly grabbed at a suitable moment and the production databases know nothing about what has happened. The actual backup creation takes place in the background. The last thing you want is for a backup to interrupt the very service it's trying to protect.
Secondly, these snapshots make for a complete image of your database, and that means we don't have dependencies on previous copies of the data or other silos of information. That makes for a more reliable and coherent backup.
Thirdly, because we have a complete image, it's quick to restore the whole database into a new database deployment. We don't even think about trying to reimport data into an existing deployment, as that could complicate things to the point of corrupting data. Instead, we create a clean new database deployment with the snapshot data in place and start it up. And you're back.
So that's why we take snapshots, but we want to talk about how you can use those snapshots – and other backups – yourself, outside of Compose's restoration options and for that, you'll need a copy of the backup file.
Getting your backup
This process is the same for most of the databases you use on Compose. Head to the Compose Console and select Backups from the sidebar.
This takes us to a screen like this:
You can restore or download from a daily, weekly or monthly backup by selecting one of the rows in the Daily Backups, Weekly Backups or, you guessed it, Monthly Backups section of the table. If you want an up-to-the-minute backup, you can click the Back up now button and it'll create an on-demand backup which will, when complete, be displayed at the top of the table.
On the right-hand side of the table, you'll find two icon buttons in each row. The arrow into the box is the one we're interested in here, as that's the download button; the other button, the circular arrow, is the restore button, which we don't need for this process.
On everything but our original older MongoDB, clicking on the Download button will start the download immediately. The file being downloaded will be a gzipped tar archive, though the actual file name may be just tar depending on your browser's interpretation of the download. On Unix-like systems, you can confirm the file type with the file command (file downloadedfile). The file will either be named with the timestamp it was created or as on_demand. To make life more readable from this point on, we're going to call our downloaded file backup.tar.gz. RethinkDB users won't need to do this extraction, but for other databases we now extract the contents of the backup into its own directory. It's best to extract any backup into its own directory as it may contain many, many files.
$ mkdir db
$ tar xvCzf db backup.tar.gz
The C in the tar command makes tar change into the specified directory before extracting. Now that we've downloaded and extracted the backup into a directory called db, let's bring up a local instance of the database. That, of course, depends on which database the backup came from. We will assume that we've also downloaded and installed a version of the database that matches (or is later than) the version in use on Compose.
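To see those flags in action end to end, here's a self-contained sketch: we fabricate a tiny archive standing in for a real downloaded backup (yours comes from the Compose console), then extract it into its own directory.

```shell
# Fabricate a tiny stand-in for a downloaded backup; a real one comes
# from the Compose console and is much larger.
mkdir -p sample
echo "hello" > sample/data.txt
tar czf backup.tar.gz -C sample data.txt

# Extract into its own directory, as above (z = gunzip, C = change
# into the directory before extracting).
mkdir -p db
tar xzf backup.tar.gz -C db    # equivalent to: tar xCzf db backup.tar.gz
cat db/data.txt                # prints: hello
```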
Starting MongoDB Locally
To bring up a local MongoDB database with our data, we just have to run mongod and set the dbpath option to point at our data directory:
$ mongod --dbpath ./db
...
That's it. The database should be running on localhost:27017 and we should be able to run mongo to get a shell session with it.
A local PostgreSQL
PostgreSQL backups are built to allow many backups to be restored into the same directory. If we extract a typical backup as above and look in the db directory, we have to go down a few levels to get to the data. In db we find a data directory:
[db] ls
data
[db] cd data
and inside there a backup directory:
[data] ls
backup
[data] cd backup
which leads to a timestamped directory for when the backup was taken:
[backup] ls
20160120121608
[backup] cd 20160120121608
and within there a snapshot directory:
[backup] ls
snapshot
[backup] cd snapshot
[snapshot] ls
README conf data
The snapshot directory is our destination for database data, and that README file contains everything we need to unlock it. If we look in there, we find this:
[snapshot] more README
This snapshot is meant to be run with the same minor version of Postgresql,
which is postgres (PostgreSQL) 9.4.5.

To startup a Postgresql environment with this snapshot, run:

`postgres -D conf`

You can then connect to the db by running:

`psql postgres -U focker`

To list all your databases run `\list` and to connect to another database,
run `\connect compose`.

The configuration in the conf has very promiscuous settings, and will allow
connections from any user. You will want to modify the configuration files
before allowing world access to the db.
There's not a lot to add to that...
postgres -D conf
LOG:  database system was interrupted; last known up at 2016-01-20 12:16:08 GMT
LOG:  redo starts at 0/7000028
LOG:  consistent recovery state reached at 0/70000F0
LOG:  redo done at 0/70000F0
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
And you are up and running.
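Incidentally, rather than descending level by level as we just did, find can locate the snapshot directory in one go. Here's a self-contained sketch; we fabricate the directory layout shown above, so point find at your real db directory instead (and your timestamp will differ):

```shell
# Stand-in for an extracted PostgreSQL backup; in real use, db is the
# directory you extracted the downloaded backup into.
mkdir -p db/data/backup/20160120121608/snapshot

# Locate the snapshot directory wherever it landed.
find db -type d -name snapshot
# prints: db/data/backup/20160120121608/snapshot
```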
A RethinkDB of your own
RethinkDB's backups are created with the rethinkdb dump command, and the tools know how to work with gzipped tar files, which is why you could skip to here; you don't need to unpack the backup. To restore them, you need a running RethinkDB instance, which is simple enough:
$ rethinkdb
Recursively removing directory /Users/dj/rethinkdb_data/tmp
Initializing directory /Users/dj/rethinkdb_data
Running rethinkdb 2.2.3-1 (CLANG 7.0.2 (clang-700.1.81))...
Running on Darwin 15.3.0 x86_64
Loading data from directory /Users/dj/rethinkdb_data
Listening for intracluster connections on port 29015
Listening for client driver connections on port 28015
Listening for administrative HTTP connections on port 8080
Listening on addresses: 127.0.0.1, ::1
Now go to another terminal session and run rethinkdb restore backup.tar.gz:
$ rethinkdb restore backup.tar.gz
Unzipping archive file...
Done (0 seconds)
Importing from directory...
[========================================] 100%
If we bring up a browser and look at the RethinkDB console on localhost:8080 we should find all our data there.
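Because a rethinkdb dump archive is an ordinary gzipped tar, you can also peek inside one without restoring it. A self-contained sketch follows; we fabricate a minimal archive, and the database and table names here are purely illustrative:

```shell
# Fabricate a minimal dump-shaped archive; a real one is produced by
# `rethinkdb dump` and holds a directory per database.
mkdir -p rethinkdb_dump_demo/test
echo '[]' > rethinkdb_dump_demo/test/people.json
tar czf backup.tar.gz rethinkdb_dump_demo

# List the contents without unpacking anything.
tar tzf backup.tar.gz
```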
Redis to go
Redis backups are a copy of the dump.rdb file created by the server. Bringing them to life locally is a little more complex, as the redis-server is configured entirely from a .conf file. We'll give an example here of starting the server using a brew-installed Redis. We've got our data extracted into the db directory, so first we'll cd db into that directory. Then we want a simple configuration file. The easiest source for that is the one installed with our Redis, in /usr/local/etc/redis.conf, so we'll copy that to the db directory and edit it:
$ cp /usr/local/etc/redis.conf .
$ vi redis.conf
You now need to change the entry in the config file for dir. It'll typically be pointing at a directory; in our case it reads dir /usr/local/var/db/redis/. We'll change that to dir . and save the file. Once saved, we can run the server with our config file.
$ redis-server redis.conf
To check it's running, you should be able to log in with just redis-cli in another window and enter scan 0 to get a quick view of some of the keys in the Redis database.
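If you'd rather make that dir edit non-interactively, sed can rewrite the entry in place. Here's a self-contained sketch using a fabricated two-line redis.conf; run the sed line against the full config you copied from /usr/local/etc/redis.conf:

```shell
# Fabricate a minimal redis.conf standing in for the full copied file.
cat > redis.conf <<'EOF'
port 6379
dir /usr/local/var/db/redis/
EOF

# Point dir at the current directory; -i.bak keeps a backup of the
# original alongside the edited file.
sed -i.bak 's|^dir .*|dir .|' redis.conf
grep '^dir' redis.conf         # prints: dir .
```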
Currently, Elasticsearch is an exception to Compose's downloadable backups feature. Contact firstname.lastname@example.org if you want assistance with Elasticsearch backups; they are still restorable within the Compose system.
We've covered how to use Compose's downloadable backups to bring up a local database with the same data as your production database. You can use this for local testing, performance runs or just as part of your restoration process.