PostgreSQL and per-connection write consistency settingsPublished
At Compose, we work with many databases and that gives us the opportunity to be exposed to the functionality and the failings of those databases. As we learn to love the features of newer databases, we also sometime wish that those features would find acceptance in the more mature databases we work with. One particular feature from those newer databases that we've wished for in PostgreSQL was per-connection tunable write consistency over a cluster. It turns out we already have it and we're here to show you how you too can make use of it.
What does this do for me?
How you want your cluster to behave can often depend on your application. Take a billing application. You'll want 100% consistency across the cluster, so you'll want to enable synchronous commits across the cluster so that your database waits for everyone to have written the changes. If, though, your application is a fast-moving social application, response time is probably all you care about so you'll prefer high performance over 100% consistency. For that, you can use asynchronous commits across your cluster.
Welcome to the trade-off
Welcome to the trade-off, where you get to decide if you want consistency of data or consistent performance. PostgreSQL starts in the consistency corner because that makes a good predictable, unsurprising default. Now let's look at the tradeoffs...
Tradeoff #1 - Performance
When a PostgreSQL cluster does not require cluster consistency it can work asynchronously. The write is made at the cluster leader and its followers will eventually, usually within a few milliseconds, be sent the updates.
When a PostgreSQL cluster requires consistency then it should work synchronously. The write will be made at the cluster leader which will then send the update on to followers and confirm each has made the write before confirming to the client which initiated the write that it had been successful. The practical difference between the two is that the asynchronous method uses two network hops while the synchronous method needs four network hops.
Tradeoff #2 - Consistency
If a leader fails suddenly, the two approaches give different results. If working asynchronously, then when that failure occurs not all the writes will have been picked up and committed by the followers. How many? Depends on the application and the replication efficiency. Compose's replication will prevent a follower from taking over as leader if it's more than 1MB behind where the leader was, so potentially up to 1MB of writes could be lost if working asynchronously.
In synchronous mode, that doesn't happen. If the leader fails, all the followers are up to date because any confirmed write on the leader has also been confirmed on the followers. Boom. Consistency.
Synchronous behavior makes sense on a billing application where consistency easily wins the tradeoff between performance and consistency. The most important thing for that application is that the data is right. Compare this with a social networking application where the task at hand is to maintain the attention of the user by being as responsive as possible. There the performance wins out with less network hops and less waiting on commits. But those are not the only tradeoffs to think about...
Tradeoff #3 - Outages
How the cluster performs when there's an outage is important. Consider when one or more of the followers goes down for whatever reason. When the commits are being handled asynchronously then the leader will carry on functioning; it doesn't need to wait on the missing followers and can carry on accepting and processing writes. When the followers return to the cluster then they can catch up with the leader. With synchronous replication, if no followers are responding then the leader will have no choice but to hold on to a commit request until a follower returns to the cluster and is able to take and confirm those writes.
Applications typically require their own particular mix of consistency and performance. Unless they are our solidly consistent billing application or our near ephemeral social networking applications, there will be times when some operations need to be synchronous while at other times some operations can be asynchronous. You may not want the system to wait until a social chat message can be committed to all servers but if that social message was triggering a payment, then you'd be prepared to wait.
Those are, of course, decisions for the application developer. Well made decisions about when each is appropriate can really bring out the best in the cluster. For us, the important thing is to let those choices be made by allowing the developer to switch between the synchronous/asynchronous behaviour at an SQL level, either for a connection or a transaction.
In practice - taking control
By default, PostgreSQL ships for consistency. That's controlled by a server setting
synchronous_commit. By default its set to
on but it has three other settings
off setting stops all synchronous committing, even on the local system.
local keeps the synchronous mode for the local system but writes to followers are asynchronous.
remote_write goes slightly further, but not all the way; writes to followers are asynchronous but return when the follower has accepted the write, but not written it to disk.
Given this range of options we come to a choice of behaviors and, given
on is synchronous writes, we'll choose
local for asynchronous network commits while maintaining local synchronous commits.
Now, we'll tell you how to set that in a moment, but imagine that we have set
local for a server. We wondered if it was possible to change the setting of
synchronous_commit on the fly and it turns out not only that you can, but there are two ways to do it. The first is to do it to your connection's session like so:
SET SESSION synchronous_commit TO ON; // Your writes go here
All future writes on the connection session will confirm writes to followers before returning success to the connected client. Unless, of course, you change the setting of
synchronous_commit again. The
SESSION part of the command can be omitted, if desired, as that is the default for the command.
The second way is when you only want to ensure that you get synchronous replication for a single transaction. In a lot of "NoSQL" generation databases, the concept of transactions doesn't exist, but it does in PostgreSQL. In this case, you start the transaction and then set the
synchronous_commit to on before doing the writes for the transaction. The
COMMIT will commit the transaction using whatever setting
synchronous_commit was set to at that point though it's best practice to precede the writes by the setting of the variable first to ensure that other developers understand the writes are not asynchronous.
BEGIN; SET LOCAL synchronous_commit TO ON; // Your writes go here COMMIT;
All the transaction commits will now be confirmed as written to the followers before the database returns success to the connected client.
We asked you to imagine a PostgreSQL system with
synchronous_commit set to
local earlier. To make that a reality on the server side, you actually need to set two server configuration parameters. The other setting is
synchronous_standby_names which comes into its own when the
synchronous_commit is turned on. It determines which followers are eligible for synchronous commits and we are going to set it to
*, a special case which covers all followers. These values would typically be set in the configuration file by adding:
synchronous_commit = local synchronous_standby_names='*'
synchronous_commit to local, we set up a system where local disk commits are still synchronous but commits to networked followers are, by default, asynchronous. Unless, that is, we decide to make those commits synchronous, as we show above.
If you follow the Governor Project, you'll have noticed some recent changes (1, 2) that enable Governor users to test these settings and take control of their consistency.
A week ago, I would have said that consistency tuning to this level was not possible with PostgreSQL. That was when Kurt, a member of the Compose platform team, insisted that PostgreSQL should be able to enable it. He rejected my naysaying and found this in the PostgreSQL documentation:
With the sweet little modification on the configuration file, this has allowed us to give customers the ability to control their consistency or trade it for performance.