TL;DR: We use proxies to enable SSL, High Availability and Whitelisting. Because we care.
Observant users of Compose will have noted that we make extensive use of HAProxy to enable users to connect to their databases. Curious observant users have wondered why we use proxies on all our services, so allow us to explain.
The Compose architecture for database deployments sees every database running on an isolated private network with no external visibility for any of the database or service nodes in the deployment. Only one class of node can be seen by the outside world: the portals, which are dedicated to managing the transit of traffic from outside to inside the network and keeping your database private. The most common of all the portals is the TCP portal, a proxy based on HAProxy.
The first reason we run a proxy is to supplement any high availability options on the database behind the proxy by getting the proxy to independently follow the leader in a cluster. There are some databases where this doesn't apply, such as ScyllaDB, which exposes all its nodes and gets client drivers to work out which node is leader, but in general, the task is managed by the proxy.
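To make "following the leader" concrete, here is a minimal Python sketch of the selection step. It is an illustration only, not how the Compose portals are implemented: the `Node` type, its fields and `pick_backend` are hypothetical names, and a real proxy would get this information from health checks rather than a hard-coded list.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical view of one database node as reported by a health check."""
    name: str
    healthy: bool
    is_leader: bool


def pick_backend(nodes: Iterable[Node]) -> Optional[Node]:
    """Return the first healthy node that reports itself as leader, or None."""
    for node in nodes:
        if node.healthy and node.is_leader:
            return node
    return None


# Before a failover: node "a" is the leader.
cluster = [Node("a", True, True), Node("b", True, False), Node("c", True, False)]
before = pick_backend(cluster)

# After a failover: "a" is unreachable and "b" has taken over as leader.
cluster = [Node("a", False, False), Node("b", True, True), Node("c", True, False)]
after = pick_backend(cluster)

print(before.name, "->", after.name)
```

The point of the sketch is that clients keep using one address while the choice of backend changes behind it.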
When a database node becomes unresponsive, or a monitoring task tells the proxy that a node is unavailable, the proxy looks for a new leader and sends new requests to that node. Ongoing connections will be broken, which is why we advise developers to program defensively when writing their database calls and be prepared to retry when a request times out. The database is still available in this scenario, but, in a similar fashion to a temporary connectivity issue on the network, some connections have gone astray.
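As a rough illustration of that defensive style, the sketch below wraps a database call in a retry loop with exponential backoff. It assumes the driver surfaces dropped connections as `ConnectionError` or `TimeoutError`; real drivers often raise their own exception types, so `retry_on` would need adjusting. The helper name `with_retries` and its defaults are ours, not part of any Compose or driver API.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.2,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on connection errors with backoff and jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each time (0.2s, 0.4s, 0.8s, ...) plus random jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

A call such as `with_retries(lambda: run_query(conn))`, where `run_query` stands in for your own code, would then ride out a short failover. Retrying blindly is only safe for operations that can be repeated without side effects, so writes may need extra care.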
There are databases that do SSL natively and there are databases that don't. Let's deal with the latter group first. At Compose, we know SSL is important for developers and businesses to maintain their data privacy. But where a database has no native SSL support, we have to do something, and that's where the proxy comes in. Clients can make an SSL connection to the proxy and we terminate the SSL part of the connection at the proxy. The connection will continue, unencrypted, to the database nodes from that point on. Remember though that each Compose database deployment is on its own private, isolated, SDN-managed VLAN, so that unencrypted traffic never leaves the private network.
But there's another benefit: isolation. With all the SSL work being done in the proxy, we can easily change security certificates or upgrade the proxy without touching the running database. If an SSL vulnerability is found and patches are available, all we need to do is upgrade the proxy. That means no database downtime, just a brief pause while the proxy updates before the connections flow again.
This type of SSL proxy is used with MongoDB, Elasticsearch, RethinkDB, ScyllaDB and RabbitMQ. As you can tell, that's not the whole list of Compose databases; there are exceptions.
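From the client's side, talking to an SSL-terminating proxy looks like any other verified SSL connection. The sketch below uses only Python's standard `ssl` and `socket` modules. The host, port and CA file path are placeholders you would take from your own deployment, and in practice your database driver normally handles this step through its own SSL options.

```python
import socket
import ssl
from typing import Optional


def make_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client context that verifies the server certificate and hostname."""
    # create_default_context turns on certificate and hostname verification.
    # ca_file is an optional path to a CA certificate in PEM format.
    return ssl.create_default_context(cafile=ca_file)


def open_tls_connection(
    host: str, port: int, ca_file: Optional[str] = None
) -> ssl.SSLSocket:
    """Open a TCP connection to the proxy and wrap it in SSL/TLS."""
    context = make_tls_context(ca_file)
    raw = socket.create_connection((host, port), timeout=10)
    try:
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()  # do not leak the plain socket if the handshake fails
        raise
```

Encryption here ends at the proxy, so what the client verifies is the proxy's certificate, not anything presented by the database nodes behind it.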
For PostgreSQL and MySQL, the SSL termination takes place in the database server. SSL is managed as part of the wire protocol, so it's not easy to extract the SSL element from the connection at the proxy; instead, we route the connection through to the database. This does mean that applying patches for SSL vulnerabilities and updating certificates will require database server restarts. We are, though, always on the lookout for ways to manage these so that we can minimise interruptions to services. These databases still use proxies, but just for the high availability functionality.
Finally, there's Redis, which has no SSL support. For that, we have the proxy for high availability, but there's no SSL endpoint available. If you want to secure your Redis connections, check out the SSH portal, which will let you connect over a secure tunnel. Your traffic still goes through the proxy to make leader following happen. When an SSL solution for Redis appears that doesn't break existing drivers, we'll look at using it.
The last feature the proxies give us is one of our most accessible security features. Whitelisting allows you, the user, to configure which IP addresses or ranges get access to your database. Whitelists aren't usually a feature of database networking, so by putting the capability in the proxy, we can give every database that has a proxy portal a whitelist controlling who can access it. You'll find the whitelist controls under the Security view for any Compose database.
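The check a whitelist performs is simple to sketch. The example below uses Python's standard `ipaddress` module to test a client address against a list of addresses and CIDR ranges. It illustrates the idea only: the function name and sample entries are made up (the addresses come from the reserved documentation ranges), and it does not show how the Compose proxies are configured.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any whitelisted address or range."""
    address = ipaddress.ip_address(client_ip)
    for entry in whitelist:
        # strict=False masks any host bits set in a range such as
        # "203.0.113.7/24"; a bare address becomes a single-address network.
        network = ipaddress.ip_network(entry, strict=False)
        if address.version == network.version and address in network:
            return True
    return False


# One CIDR range and one individual address.
whitelist = ["203.0.113.0/24", "198.51.100.17"]

print(is_allowed("203.0.113.42", whitelist))
print(is_allowed("198.51.100.18", whitelist))
```

The first call prints `True` because the address sits inside the /24 range; the second prints `False` because only 198.51.100.17 itself is listed.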
We've had to craft how you connect to each of the Compose databases with an eye to their strengths and weaknesses, and we've leveraged proxies to give a consistent experience of availability and encryption where possible. We are always looking to improve how we do this and hope this article has given you some insight into the design decisions we make so that you can just click to create Compose databases and be sure you are getting a production-ready database.
If you have any feedback about this or any other Compose article, drop the Compose Articles team a line at firstname.lastname@example.org. We're happy to hear from you.