How PostgreSQL Fits Compose

A fundamental principle of database selection at Compose is that a database is only as good as the queries it excels at. Now that we've launched PostgreSQL, you may wonder why we've added it to our platform – what does it excel at? Given our history, you might also think we're new to relational databases. Nothing could be further from the truth.

We know that if you are tied to one database technology, you lose the ability to select a database that could be better at running your queries. That's why we offer a range of databases at Compose, so you can quickly select and deploy the right database for the job. With PostgreSQL we've added another database – one with powerful SQL queries and structurally rigid schemas – to the selection we offer our customers. We've done that because we know how useful SQL can be.

The database selection process might run like this: if we want to query the fields of large JSON documents to dig through a corpus of information, then MongoDB or RethinkDB could be appropriate. If we actually want to run search-engine-like queries on those documents, then Elasticsearch comes to the fore. If we want to aggregate data from many fast-changing systems, then Redis's in-memory database could well serve as the core of an architecture...

To see where PostgreSQL is a good fit, it's worth knowing that we already ran PostgreSQL in-house at Compose. Some years back, when we looked at the queries involved in our financial analysis, we knew we wanted good, documentable schemas and the ability to join, filter and extract data from the many records that a business like Compose creates every minute of the day – so we went with PostgreSQL.

We ship the billing data over from our billing system into the relational database and restructure it so that it's more amenable to SQL queries. Our finance people access it through scripts that give them account-centric and customer-centric views of the data. That's supplemented with Chart.io, which gives us visualisation tools for the database through its SQL query mode.
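As a rough illustration – the view and the exact columns here are hypothetical, not our actual schema – an account-centric view over restructured billing data might look something like this:

-- Hypothetical sketch: one summary row per account, built from billing records.
CREATE VIEW account_activity AS
    SELECT
        accounts.id AS account_id,
        accounts.name AS account_name,
        SUM(billables.amount) AS total_billed,
        MIN(billables.start_at) AS first_billed_at,
        MAX(billables.end_at) AS last_billed_at
    FROM accounts
    INNER JOIN billables ON billables.account_id = accounts.id
    GROUP BY accounts.id, accounts.name;

Once a view like that exists, the reporting scripts can query it as if it were a plain table.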

The finance project began in the summer of 2013, starting with deep data discovery to model and understand the data that had already accumulated. Then, with the first models in hand, we explored various alternative approaches to the problem, such as MongoDB aggregations and map-reduces. For this project, though, PostgreSQL gave the developers a chance to exploit the flexible combination of relational data and powerful queries without demanding they turn the data model on its head or sideways.

With finance reporting demands ever evolving, having the flexibility to create new complex queries is essential if development is to respond nimbly to those demands. For example, PostgreSQL functions like generate_series make it easy to extract the total number of databases created on any particular day, or at any other granularity, ready for further processing and analysis. That means complex questions can often be answered with a single query like this:

-- For each day in the last 90 days, classify active accounts as
-- returning, newly converted, or churned, based on their billing dates.
SELECT
    to_char(query_month, 'YYYY-MM-DD') AS "Month",
    SUM(CASE WHEN convert_at < query_month - interval '30 days' THEN 1 ELSE 0 END) AS "Returning",
    SUM(CASE WHEN convert_at > query_month - interval '30 days' THEN 1 ELSE 0 END) AS "Converted",
    SUM(CASE WHEN churn_at < query_month THEN -1 ELSE 0 END) AS "Churned"
FROM (
    -- One row per account per day, with that account's first and last billing dates.
    SELECT
        accounts.id AS account_id,
        accounts.name AS account_name,
        query_month,
        MAX(billables.end_at) AS churn_at,
        MIN(billables.start_at) AS convert_at
    FROM generate_series(to_char(now(), 'YYYY-mm-dd')::date - interval '90 days', now(), interval '1 day') AS query_month
    INNER JOIN billables ON billables.end_at > query_month - interval '30 days' AND billables.start_at < query_month
    INNER JOIN accounts ON billables.account_id = accounts.id
    WHERE billables.amount > 0
    GROUP BY accounts.id, accounts.name, query_month
) AS customer_month_attr
GROUP BY query_month
ORDER BY query_month ASC;

That query lets us know how many accounts have returned, converted or churned on each day of the period, so we can track our growth precisely.
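To see the generate_series pattern in isolation – a minimal sketch, with a made-up deployments table standing in for wherever database creations are recorded – counting the databases created on each of the last seven days might look like this:

-- Hypothetical sketch: generate_series produces one row per day, and the
-- LEFT JOIN keeps days with zero creations in the result.
SELECT
    day::date AS created_on,
    COUNT(deployments.id) AS databases_created
FROM generate_series(now() - interval '7 days', now(), interval '1 day') AS day
LEFT JOIN deployments ON deployments.created_at::date = day::date
GROUP BY day::date
ORDER BY day::date;

The same shape scales to weekly or monthly granularity just by changing the step interval.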

As we expanded the range of databases we offered, we knew we had to bring the same flexibility in choosing the appropriate database to our Compose users. PostgreSQL version 9.4 made it even more compelling as our go-to relational database. It brought new replication options which make it easier and more reliable to deploy as a highly available database – something we do with all Compose databases. PostgreSQL 9.4 also introduced JSONB, a binary JSON type which opens up a range of possibilities for the Compose platform and Transporter.
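For a taste of what JSONB enables – a minimal sketch, with a made-up events table – you can store documents in a relational table and query inside them with indexable operators:

-- Hypothetical sketch: a jsonb column holding arbitrary documents.
CREATE TABLE events (
    id serial PRIMARY KEY,
    payload jsonb
);

INSERT INTO events (payload)
    VALUES ('{"type": "deploy", "database": "postgresql"}');

-- ->> extracts a field as text; @> tests containment and can use a GIN index.
CREATE INDEX ON events USING GIN (payload);

SELECT payload->>'database'
FROM events
WHERE payload @> '{"type": "deploy"}';

That mix of document storage and SQL querying in one engine is what makes JSONB interesting for moving data between Compose databases.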

We started work on Compose PostgreSQL at the start of the 9.4 development cycle as part of our plan to expand the range of databases available on the Compose platform. As we've developed it, we've tested out the hosted version with a number of internal projects, including running the back end of the next version of this blog.

Now that we've delivered the beta release, we're moving on to the next phase: delivering a rich environment for PostgreSQL users on Compose through the Dashboard, and connecting PostgreSQL to our other database offerings through Transporter. We know that with a relational database to hand, a whole set of problems becomes easier to solve, and we're going to make it as easy as possible for you to plug PostgreSQL into your database stack like we have. That's how PostgreSQL fits in at Compose.