We’ve had a lot of interest in our new auto-scaling PostgreSQL offering and lots of queries about various aspects of the service. We’ll be sharing more detailed information in the coming weeks, but we thought we’d address some of those initial queries here. Enjoy.
1: What does auto-scaling mean?
We use the storage size as the basis for allocating resources. This means that as storage size goes up, we increase the number of IOPS, the amount of RAM and the CPU capacity in tandem with it. If scaled to a terabyte of storage, you would have access to 100GB of RAM, 60,000 random IOPS and 12 full CPU cores.
Our RAM-to-storage ratio is pretty simple – 1:10. So with 10GB of storage you’d get 1GB of RAM for your deployment. The auto-scaling criteria we use are based on our experience, but if you need more control over how you scale, we can arrange manual scaling for you.
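To make the arithmetic concrete, here is a minimal sketch of how an allocation could be estimated from storage size. Only the 1:10 RAM ratio and the 1TB figures come from the text above. The function name `estimate_allocation`, the treatment of 1TB as 1000GB, and the linear scaling of IOPS and CPU from the 1TB reference point are all assumptions made for illustration, not a description of how the service actually allocates resources.

```python
from dataclasses import dataclass

# Figures quoted above for a deployment scaled to 1 TB of storage.
# Treating 1 TB as 1000 GB is an assumption that matches the 1:10 RAM ratio.
REFERENCE_STORAGE_GB = 1000
REFERENCE_IOPS = 60_000
REFERENCE_CPU_CORES = 12


@dataclass
class Allocation:
    storage_gb: float
    ram_gb: float
    iops: float
    cpu_cores: float


def estimate_allocation(storage_gb: float) -> Allocation:
    """Estimate resources for a given storage size (hypothetical helper).

    RAM follows the stated 1:10 RAM-to-storage ratio. IOPS and CPU are
    ASSUMED to scale linearly from the 1 TB reference point; the text
    only says they increase in tandem with storage.
    """
    if storage_gb <= 0:
        raise ValueError("storage_gb must be positive")
    scale = storage_gb / REFERENCE_STORAGE_GB
    return Allocation(
        storage_gb=storage_gb,
        ram_gb=storage_gb / 10,
        iops=REFERENCE_IOPS * scale,
        cpu_cores=REFERENCE_CPU_CORES * scale,
    )


if __name__ == "__main__":
    # Print the estimate for a few storage sizes.
    for size_gb in (10, 100, 1000):
        print(estimate_allocation(size_gb))
```

Under these assumptions, 10GB of storage maps to 1GB of RAM and 1000GB maps to 100GB of RAM, matching the figures quoted above.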
2: How do you scale?
Quite simply, we’re focused on vertical scaling. We take a vanilla PostgreSQL setup on big, well-specified servers and use that as a platform for scaling resources vertically. We haven’t discussed how we do this for PostgreSQL, but we have talked about the principles behind it in How We Scale MongoDB.
3: Other services seem to offer more storage for less money. Are they offering the same thing?
No. We use enterprise-grade SSDs with more raw IO capacity than you’d get on an AWS EC2 instance backed by AWS EBS. We’ve built our system for high-performance PostgreSQL users, rather than users who have databases full of cold data and a much smaller set of hot data.
Also, remember that Compose pricing includes failover, backups, upgrades, monitoring and all the other services that many platforms offer as separately priced add-ons.
4: Do you limit connections to the database?
We’re tuning the connection limits while we are in beta, but we don’t scale those limits. We prefer to set that as high as we can without impacting stability, then let people shoot themselves in the foot if they want. If you do run into connection limit issues, let us know and we’ll happily adjust them for you or show you other ways of getting the number of connections you need.
5: Don’t you suffer from Amazon Web Services’ poor storage performance?
No, because we don’t use AWS’ EBS storage. We run our own hardware and use high-performance ephemeral SSD arrays on our i2 AWS instances. Our benchmarking of the i2s shows their IO performance to be quite good, which is what you’d expect of local SSDs. The physical servers are “old-school” redundant, with RAID-10 and dual power supplies.
6: How do you configure the systems for high availability and replication?
Deployments are spread over two physical servers. The write-ahead log (WAL) is then streamed to the slave database and to offsite secondary storage. Replication is currently asynchronous, which means there’s a potentially small window of data loss if a whole server goes down. In the future, we’re considering letting users opt in to synchronous replication, but we don’t have that option available yet. A future article will have an in-depth look at our configuration.
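With asynchronous replication, the size of that window is roughly the amount of WAL the primary has written that the replica has not yet received. As a rough illustration, the sketch below measures that gap in bytes from two WAL positions in PostgreSQL's textual LSN format (two hexadecimal numbers separated by a slash, such as `0/3000060`). The helper names `parse_lsn` and `replication_lag_bytes` are hypothetical, and how you obtain the two positions from your deployment is left out. This is not a description of how Compose monitors replication.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' into a byte position.

    The part before the slash is the high 32 bits and the part after
    is the low 32 bits, both written in hexadecimal.
    """
    high, low = lsn.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Return how many bytes of WAL the replica is behind the primary.

    A result of 0 means the replica has caught up to the position given
    for the primary. Negative differences are clamped to 0.
    """
    return max(0, parse_lsn(primary_lsn) - parse_lsn(replica_lsn))


if __name__ == "__main__":
    # Example positions, made up for illustration.
    print(replication_lag_bytes("0/3000060", "0/3000000"))
```

Anything written to the primary beyond the replica's position at the moment of a failure is what could be lost, which is why the window is described as small rather than zero.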
7: From which datacenters is PostgreSQL on Compose currently available?
We know, and recommend, that people select datacenters as close to their applications as possible. That said, PostgreSQL is currently available in Ashburn, Virginia, near AWS us-east-1, and Dublin, Ireland, near AWS eu-west-1. We’ll be making it available in more locations in the future based on demand.