Getting Started with Compose & RethinkDB

With the launch of RethinkDB on Compose, you may wonder where to start learning about RethinkDB. The RethinkDB creators have a wealth of tutorials and example content available, and it's a great resource to learn from. There are, though, one or two things that differ between running RethinkDB locally and running it hosted on Compose, and we'd like to pick up on those differences so you can get up to speed with RethinkDB as quickly as possible. We're going to start with one of the more compelling RethinkDB tutorials: Analyzing presidential elections.

Connecting up

The first thing the tutorial asks you to do is install RethinkDB. Well, on Compose, once you've created a deployment, that's already been done, but you'll still need to connect to RethinkDB. When you run locally, there's no control over who connects to the database's admin console or driver ports. That makes it very easy to get going, but it's the last thing you'd want to run in production, let alone put up in the cloud. At Compose, your RethinkDB cluster runs on a dedicated virtual private network and the only way to get at that is through an SSH access capsule. We've got a separate article on how to configure SSH access to your database and you'll need to go through those steps – creating a key, registering the key and creating an SSH tunnel – before you can go on. We'll just wait here while...

Importing the dataset

Ah, excellent. You're back and you should have the RethinkDB user interface running in your browser now. The first part of the tutorial talks about importing some raw data into the "test" database of RethinkDB. We don't put a "test" database into the default Compose RethinkDB deployment, so we'll have to create that first. Click on Tables in the top menu bar of the RethinkDB UI and at the top of the table (of tables) display will be buttons to Add Database and Add Table. Click on Add Database and enter test as your new database name. Now we have a database, let's move on.

The tutorial assumes you are running locally and have access to the rethinkdb command, specifically the rethinkdb import command, and that you can download some files from GitHub. Well, we don't have access to the rethinkdb command (unless you install RethinkDB locally, which could end up being confusing), so we'll do all the importing through the Data Explorer, an interactive console into the RethinkDB system.

Click on the Data Explorer tab at the top of the RethinkDB interface and you'll see the console. Now, if we look at the commands invoked in the tutorial, we have one to download the file:

wget https://raw.github.com/rethinkdb/rethinkdb/next/demos/election/input_polls.json  

and another to import the data:

rethinkdb import -c localhost:28015 --table test.input_polls --pkey uuid -f input_polls.json --format json  

Now, the import command creates its own table, in this case with a primary key (--pkey) of uuid. We need to create that table ourselves. In the Data Explorer, type the command:

r.tableCreate("input_polls", { primaryKey:"uuid" })  

so it looks like this:

Create Table

and then hit the Run button. The r. is the anchor for ReQL commands and the tableCreate function is hopefully obvious in its purpose. It takes a new table name and then a set of parameters; in this case we set primaryKey to "uuid".

Having taken care of creating the table, we now need to import the data from that URL. It so happens that ReQL has a rather useful function http which can retrieve data from a URL. So

r.http("https://raw.github.com/rethinkdb/rethinkdb/next/demos/election/input_polls.json")  

would retrieve that data as a string. The r.http function can reformat that data for us, though, so if we add {resultFormat:"json"} as a second parameter, we'll get a JSON array of values. You can try it in the console without affecting anything; just enter

r.http("https://raw.github.com/rethinkdb/rethinkdb/next/demos/election/input_polls.json", {resultFormat:"json"})  

Click on Run and you'll get a view of the JSON data. If you select Table View you can also see it formatted as a table.

Table View
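To make the difference between a string result and a JSON result concrete, here is a plain JavaScript analogy. It is not ReQL and touches nothing in the database; the sample body and its field names are made up for illustration and are not the fields of the real file.

```javascript
// A stand-in for the raw body of a JSON file; the fields are hypothetical.
const rawBody = '[{"uuid": "a1", "value": 1}, {"uuid": "b2", "value": 2}]';

// Returned as text, the response is one long string.
const asText = rawBody;

// Returned as JSON, it has already been parsed into an array of objects.
const asJson = JSON.parse(rawBody);

console.log(typeof asText);         // string
console.log(Array.isArray(asJson)); // true
console.log(asJson.length);         // 2
```

Only the parsed form has separate elements, which is what makes it useful as input for the next step.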

OK, but that's not going into the database yet. For that we need to take the result from the r.http() function and pass it to the ReQL insert function, which will happily insert the elements of a JSON array as separate documents. The full command is, therefore...

r.table("input_polls").insert(r.http("https://raw.github.com/rethinkdb/rethinkdb/next/demos/election/input_polls.json", {resultFormat:"json"}))  

If you want to see the data you just imported in tabular form, use r.table("input_polls") and click the Table View tab. This time the data's in the database:

Data at rest
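As a rough mental model of what insert does with an array, here is a plain JavaScript sketch. It is not RethinkDB code: the insertAll helper and the sample documents are hypothetical, every document is assumed to carry the primary key field, and the sketch assumes the default conflict behavior, where a document whose primary key already exists is counted as an error rather than overwritten.

```javascript
// A toy "table": documents stored by their primary key value.
function insertAll(table, docs, primaryKey) {
  const result = { inserted: 0, errors: 0 };
  for (const doc of docs) {
    const key = doc[primaryKey];
    if (table.has(key)) {
      result.errors += 1;   // duplicate key: existing document is left untouched
    } else {
      table.set(key, doc);  // each array element becomes its own document
      result.inserted += 1;
    }
  }
  return result;
}

const table = new Map();
const docs = [{ uuid: "a1", value: 1 }, { uuid: "b2", value: 2 }];

const first = insertAll(table, docs, "uuid");
const second = insertAll(table, docs, "uuid");
console.log(first, second);
```

Under that assumption, running the same import a second time would add nothing new, because every uuid is already taken.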

The tutorial has two data sets to import, though, so we'll need to import the other one too, with some similar commands:

r.tableCreate("county_stats",{ primaryKey: "uuid" })

r.table("county_stats").insert(r.http("https://raw.github.com/rethinkdb/rethinkdb/next/demos/election/county_stats.json", {resultFormat:"json"}))  
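If you would rather script these steps than paste them into the Data Explorer, the same queries can be built from Node.js. The following is a minimal sketch under a few assumptions that are not part of the tutorial: that the rethinkdb npm package is installed, that your SSH tunnel forwards the driver port to localhost:28015, and that the test database already exists. The names DATASETS, buildImportQueries and importAll are made up for this example.

```javascript
// Datasets from the tutorial: table name -> source URL.
const DATASETS = {
  input_polls: "https://raw.github.com/rethinkdb/rethinkdb/next/demos/election/input_polls.json",
  county_stats: "https://raw.github.com/rethinkdb/rethinkdb/next/demos/election/county_stats.json",
};

// Build the same two queries used in the Data Explorer.
// `r` is passed in so this function does not care how the driver was loaded.
function buildImportQueries(r, table, url) {
  return [
    r.tableCreate(table, { primaryKey: "uuid" }),
    r.table(table).insert(r.http(url, { resultFormat: "json" })),
  ];
}

// Run the queries for every dataset, one after the other.
// Assumption: the tunnel exposes the driver port on localhost:28015.
async function importAll() {
  const r = require("rethinkdb"); // loaded here so the builder above stays driver-free
  const conn = await r.connect({ host: "localhost", port: 28015, db: "test" });
  try {
    for (const [table, url] of Object.entries(DATASETS)) {
      for (const query of buildImportQueries(r, table, url)) {
        console.log(await query.run(conn));
      }
    }
  } finally {
    await conn.close();
  }
}
```

Calling importAll() from a script would run both imports. Running it a second time should fail at the tableCreate step, since the tables would already exist.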

And back to the tutorial

And with that data imported, you can move on to the rest of the tutorial, where you can discover RethinkDB's powerful joins, function chaining, projections, grouping and more.