Your SQL Schema Is Your JSON API With PostgREST

As front end devs continue to push for simpler data access tooling, a class of API servers is emerging. All of them are trying to serve the front end dev with an integrated query and interaction model versus the piecemeal implementations devs currently wire together themselves. Facebook's GraphQL and Netflix's Falcor are the best known of these new style API servers. There are multiple implementations of GraphQL and Falcor from Facebook, Netflix and the Javascript community. They all leave the data modelling and full implementation to something else. (Read: there is still a lot for devs to do here). But there is one that stands off by itself: PostgREST.

PostgREST is an HTTP server that sits in front of the Postgres database and effectively turns the database into an API. It differs from the other emerging API servers by taking an opinionated stance on the full implementation of your API by being operationally sound and simple to deploy, by delivering built-in security with JSON Web Tokens plus database roles, and by fully embracing the relational model.

Simple to Deploy

While PostgREST isn't a small unix tool, it does embrace the philosophy of DOTADIW (Do One Thing and Do It Well). It's a single executable binary so you don't have to install a runtime or an interpreter. You can just download the executable for your type of machine here. After you unpack it, just run it. It doesn't even have configuration files since it only needs a few command line parameters:

bash-3.2$ ./postgrest

Missing: DB_URL (-a|--anonymous ROLE)

Usage: postgrest DB_URL (-a|--anonymous ROLE) [-s|--schema NAME]  
                 [-p|--port PORT] [-j|--jwt-secret SECRET] [-o|--pool COUNT]
                 [-m|--max-rows COUNT]
  PostgREST 0.3.0.4 / create a REST API to an existing Postgres database

Available options:  
  -h,--help                Show this help text
  DB_URL                   (REQUIRED) database connection string, e.g.
                           postgres://user:pass@host:port/db
  -a,--anonymous ROLE      (REQUIRED) postgres role to use for non-authenticated
                           requests
  -s,--schema NAME         schema to use for API routes (default: "public")
  -p,--port PORT           port number on which to run HTTP
                           server (default: 3000)
  -j,--jwt-secret SECRET   secret used to encrypt and decrypt JWT
                           tokens (default: "secret")
  -o,--pool COUNT          max connections in database pool (default: 10)
  -m,--max-rows COUNT      max rows in response (default: "infinity")

A few of these are pretty straightforward and expected for an API server. The -p for the port configures where the HTTP server listens. The -o configures the connection pool size (yep, it does that for you too).

And, as it tells us, there are only two requirements: the DB_URL and the -a for the "anonymous" role. While the DB_URL is a typical Postgres connection string, it does configure the default role, or user, which has to do with security as do the -a and the -j which we will review next.

JWTs plus Postgres Equals Security Already Done

PostgREST manages client authentication with JSON Web Tokens (JWTs). It securely embeds a database role name in the JWT and on each http connection it sets the Postgres connection to that database role. It uses the secret from the -j command line switch to both encode and decode the JWT.

An example token with the default secret of "secret" contains the following (from jwt.io):

jwt decoded

To utilize this JWT role appropriately the default connection role, which was set in the Postgres DB_URL, uses NOINHERIT plus a SET ROLE cmd to effectively "switch" to the named role in the JWT. If the HTTP connection's JWT isn't authenticated or if there isn't a token presented then this default connection "authenticator" role switches to the "anonymous" role. This is the same "anonymous" role we set with the -a cmd switch on PostgREST startup. So, in the above example JWT the payload has the role "music_lover". In the following code this music_lover role has already been GRANTed the ability to query multiple tables and fully manage ratings:

CREATE ROLE authenticator NOINHERIT LOGIN PASSWORD 'change_this';  
CREATE ROLE music_lover;  
CREATE ROLE anonymous;

GRANT anonymous, music_lover TO authenticator;

GRANT SELECT ON TABLE artists TO music_lover;  
GRANT SELECT ON TABLE sort_types TO music_lover;  
GRANT SELECT ON TABLE sorts TO music_lover;  
GRANT SELECT ON TABLE rating_types TO music_lover;  
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE ratings TO music_lover;

GRANT SELECT, INSERT ON TABLE users TO anon;  
GRANT EXECUTE ON FUNCTION  
  login(text,text),
  signup(text, text)
  TO anonymous;

This "music_lover" role JWT is returned from a plpgsql function call that performs login which returns a special type of jwt_claims. When PostgREST sees this type it returns the data as a JSON Web Token. The execution of this function and a signup function are typically given to the "anonymous" role. This way the entire signup to login to authentication and authorization process can be accomplished:

CREATE OR REPLACE FUNCTION  
signup(email text, pass text) RETURNS VOID  
AS $$  
  INSERT INTO users (email, pass, role) VALUES
    (signup.email, signup.pass, 'music_lover');
$$ LANGUAGE sql;

DROP TYPE IF EXISTS jwt_claims CASCADE;  
CREATE TYPE jwt_claims AS (role TEXT, email TEXT);

CREATE OR REPLACE FUNCTION  
login(email TEXT, pass TEXT) RETURNS jwt_claims  
  LANGUAGE plpgsql
  AS $$
DECLARE  
  _role NAME;
  result JWT_CLAIMS; 
BEGIN  
  SELECT user_role(email, pass) INTO _role;
  IF _role IS NULL THEN 
    RAISE invalid_password USING message = 'invalid user or password';
  END IF;
  SELECT _role AS role, login.email AS email INTO result;
  RETURN result;
END;  
$$;

All available functions are exposed by PostgREST under the /rpc path. And the "anonymous" role has been given permission to execute the following:

POST /rpc/signup HTTP/1.1  
Host: localhost:3000  
Content-Type: application/json

{"email": "you@example.com",
 "pass": "super_secret"
}

Then you can actually login:

POST /rpc/login HTTP/1.1  
Host: localhost:3000  
Content-Type: application/json

{"email": "you@example.com",
 "pass": "super_secret"
}

The above responds with the JWT:

{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJlbWFpbCI6InlvdUBleGFtcGxlLmNvbSIsInJvbGUiOiJtdXNpY19sb3ZlciJ9.R-n2C0unp7qJM0GO2zH9CdR0POU98itI8FZPbHrW-5c"
}

The token is then used for all of the other requests. At this point, the "user" is logged in. Which really means they possess a valid token with their "correct" database role name embedded in it.

GET /artists HTTP/1.1  
Host: localhost:3000  
Content-Type: application/json  
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJlbWFpbCI6InlvdUBleGFtcGxlLmNvbSIsInJvbGUiOiJtdXNpY19sb3ZlciJ9.R-n2C0unp7qJM0GO2zH9CdR0POU98itI8FZPbHrW-5c  
Cache-Control: no-cache

Now that we have a token and are effectively "logged in", we can get to working with our API and data.

Embrace the Relational by Starting With Your Schema

Bad programmers worry about the code. Good programmers worry about data structures and their relationships. -- Linus Torvalds

Maybe the most beautiful thing about PostgREST is that your API begins and ends with the SQL you write. Since PostgREST embraces the relational model so fully, we really need a full schema to show the depth of this. So, we will work with a schema which models a music festival's lineup with the ability to sort said lineup in various ways from alphabetical to the festival's lineup order to ordering by user ratings. All of this data will be stored in a schema instead of being programmatically generated. This means every Artist will be presorted in the data on these multiple dimensions of alpha, lineup, and rating.

Here is the schema:

A Simple Schema

There is also a gist here which contains the sql for this schema too. (Although it has a few more things in it which are referenced previously).

So, now that we have a schema we can take a look. PostgREST transforms HTTP calls into corresponding SQL calls and responds with JSON:

1. Client calls HTTP api endpoint:

    GET /artists HTTP/1.1
    Host: localhost:3000
    Authorization: Bearer <JSON Web Token with client's database role> 

2. The PostgREST server verifies the token, sets the database client role, and translates GET to a SQL SELECT:

    SELECT row_to_json(t)
      FROM (SELECT *
              FROM artists) t

3. The JSON response from Postgres is proxied back to the client :

[
  {
    "name": "Pearl Jam",
    "mb_id": "83b9cbe7-9857-49e2-ab8e-b57b01038103",
    "date_formed": "1990-12-31"
  },
  {
    "name": "Dead & Company (2 Sets)",
    "mb_id": "94f8947c-2d9c-4519-bcf9-6d11a24ad006",
    "date_formed": "2015-01-01"
  },
  {
    "name": "LCD Soundsystem",
    "mb_id": "2aaf7396-6ab8-40f3-9776-a41c42c8e26b",
    "date_formed": "2001-12-31"
  },
  ....

We can also view all of the resources available by querying just the root view:

    GET / HTTP/1.1
    Host: localhost:3000
    Authorization: Bearer <JSON Web Token with client's database role> 

This will return what our currently logged in user can access:

[
  {
    "schema": "public",
    "name": "artists",
    "insertable": true
  },
  {
    "schema": "public",
    "name": "rating_types",
    "insertable": true
  },
  {
    "schema": "public",
    "name": "ratings",
    "insertable": true
  },
  {
    "schema": "public",
    "name": "sort_types",
    "insertable": true
  },
  {
    "schema": "public",
    "name": "sorts",
    "insertable": true
  }
]

The full schema is available. Each one of those "names" above is a table in our schema. We can even review in more detail with an HTTP OPTIONS to get details:

OPTIONS /sorts HTTP/1.1  
Host: localhost:3000  
Authorization: Bearer  <JSON Web Token with client's database role> 

Which responds:

{
  "pkey": [
    "name"
  ],
  "columns": [
    {
      "references": null,
      "default": null,
      "precision": null,
      "updatable": true,
      "schema": "public",
      "name": "name",
      "type": "text",
      "maxLen": null,
      "enum": [],
      "nullable": false,
      "position": 1
    },
    {
      "references": null,
      "default": null,
      "precision": null,
      "updatable": true,
      "schema": "public",
      "name": "mb_id",
      "type": "uuid",
      "maxLen": null,
      "enum": [],
      "nullable": false,
      "position": 2
    },
    {
      "references": null,
      "default": null,
      "precision": null,
      "updatable": true,
      "schema": "public",
      "name": "date_formed",
      "type": "date",
      "maxLen": null,
      "enum": [],
      "nullable": false,
      "position": 3
    }
  ]
}

The full details of our table. Already exposed as metadata for our api.

Let The Client Query The Data It Needs

One of the big "draws" of GraphQL is that the front end developer can query for whatever is needed. Instead of having to consume a canned "resource" and pull out what is wanted which may even take multiple round trips, GraphQL lets a developer declare just the data needed:

{
  artist(id: 3) {
    name,
    date_formed
  }
}

PostgREST lets the developer query too. However, PostgREST's query is a thinly veiled SQL which can be used to "walk" relations and "project" only the fields needed:

GET /sorts?sort_type_name=eq.lineup&select=artists{name, date_formed} HTTP/1.1  
Host: localhost:3000  
Authorization: Bearer  <JSON Web Token with client's database role>  
Content-Type: application/json  
Range-Unit: items  
Range: 0-4  
Cache-Control: no-cache

From the query above, the return values are Artists in the festival's lineup order with only the attributes "name" and "date_formed":

[
  {
    "artists": {
      "name": "Pearl Jam",
      "date_formed": "1990-12-31"
    }
  },
  {
    "artists": {
      "name": "Dead & Company (2 Sets)",
      "date_formed": null
    }
  },
  {
    "artists": {
      "name": "LCD Soundsystem",
      "date_formed": "2001-12-31"
    }
  },
  {
    "artists": {
      "name": "J. Cole",
      "date_formed": "1985-01-28"
    }
  },
  {
    "artists": {
      "name": "Ellie Goulding",
      "date_formed": "1986-12-30"
    }
  }
]

What is nice about the above is that PostgREST automatically uses the schema to join the tables Sorts and Artists. Plus, it only returns the fields that we asked for. Well done!

Learn You A PostgREST For Great Good

With PostgREST and Postgres, almost all of the functionality today's front end developers need is delivered in a smartly designed and well engineered package.

Instead of spending time developing a middle tier, you can focus on designing your data structures. Instead of duplicating code that is a mere shadow of the many, many developer years of code that is an RDBMS such as Postgres, you can spend your time building the best data structures.

You can stop duplicating functionality that is already written and tested.

Go ahead and learn you a PostgREST for great good.
Cheers.