How to Pool PostgreSQL Connections with PgBouncer

Published

The secret to a robust application is knowing how to manage resources, and your database connections are a precious resource. With PostgreSQL there are no exceptions; the more connections used, the more RAM that's consumed. This is where connection pooling can help. In this article, we'll look at pooling and give you a head start using PgBouncer, a popular connection pooling middleware for PostgreSQL.

Connection pooling is not a new concept to someone who manages a robust application where a multitude of database connections are used at any given time. However, if you're new to connection pooling and new to PostgreSQL for that matter, you might be considering how to effectively manage your database connections. Ultimately, managing connections will increase the performance of your applications. This is where connection pooling comes in.

We'll give you a brief overview of connection pooling and PostgreSQL, and then introduce you to PgBouncer, a popular connection pooling middleware that's designed specifically for PostgreSQL. Then, we'll show you how to configure it and how to start using it with your Compose for PostgreSQL database.

Connection Pooling and PostgreSQL

PostgreSQL connections are forked processes that are created each time a client needs a connection. These are expensive operations because the client has to perform a handshake with your database, which basically comprises opening a connection, performing authentication tasks - where passwords are verified and TLS is established and configuration settings are shared - then closing the connection after it's done.

With each connection, there is a memory cost of around 10 MB so let's say we have 100 or 300 connections, that's 1 to 3 GB of RAM allocated just for connections alone. This doesn't include the amount of working memory your database needs to run memory intensive queries. So, the memory needed to create and run connections and other processes, as you might see, can get out of hand quickly when scaling up. Overall, this will probably degrade your application and database performance by orders of magnitude due to the increased workload and even crash your database.

Pooling takes care of this issue by sharing a pool of connections, or a finite set of connections set by the administrator that can use used and reused by any number of applications. A pool is essentially a wrapper around a database connection, with the internals of that connection between the database and the pool hidden from your applications.

Therefore, your applications don't need to manage the actual connection to the database, they only need to connect to the pool and then close their connection once a task is done; therefore, returning it to the pool and freeing up a connection. You can think of it as a cached pool of connections. However, if all the connections have been used, the pool will either block the connection or return an error if it's programmed to do so until a connection is freed up in the pool.

It's preferable to set limits on the number of connections allowed in a pool. PostgreSQL's default connection limit is set to 100 concurrent connections, which is also the default on Compose for PostgreSQL. Many connection pooling libraries and tools also set connections to 100 by default. Nonetheless, PostgreSQL recommends that you set the pool to a little lower than the actual max connection limit so that you still have a couple direct connections open for database maintenance and monitoring.

Dockerizing and Configuring PgBouncer

PgBouncer is a connection pooling middleware for PostgreSQL. Unlike other connection pooling middleware that exists for PostgreSQL, PgBouncer serves only as a pooling and proxy layer between PostgreSQL and your application. You can download and install it on various platforms - there's even a Homebrew package for macOS - or you can install by creating your own Docker container.

For this example, we're going to create a Docker image and run a container, which has been modified from the version here. That repo uses an older version of PgBouncer so we've updated it. You can get the docker container over on our Compose examples Github repository for Compose examples. Download the Dockerfile and the pgbouncer folder.

However, before building this container, let's look some of the initial set up that we'll have to do first.

Configuring

Make sure that you have downloaded the pgbouncer folder in the Github repository. The folder contains two files that you should be aware of: pgbouncer.ini, userlist.txt. These are configuration files that PgBouncer will use to configure the pool and to authenticate users.

The pgbouncer.ini file is the configuration file where you alias your database connections, set up the connection limits, and set up the SSL/TLS configuration. Since we're using Compose, we can configure PgBouncer using the require SSL mode from the server. We'll come back to that.

In this configuration file, there are a number of settings that have been commented out. Many of these we will not use, but you can try them out to see if you want some of the extra features turned on or fine-tuned. PgBouncer has extensive documentation on each of these settings. A basic configuration that you might use looks similar to this:

[databases]
compose = host=sl-us-south-1-portal.39.dblayer.com port=54289 dbname=compose user=manager password=AAAAAAAAAAAAA  
dev = host=sl-us-south-1-portal.39.dblayer.com port=54289 dbname=myapp user=dev password=BBBBBBBBBBB

[pgbouncer]
listen_addr = *  
listen_port = 6432  
auth_type = md5  
auth_file = /etc/pgbouncer/userlist.txt  
admin_users = app_admin, app_admin2  
server_tls_sslmode = require  
pool_mode = session  
max_client_conn = 100  
default_pool_size = 20  
logfile = /var/log/pgbouncer/pgbouncer.log  
pidfile = /var/log/pgbouncer/pgbouncer.pid  

Let's take a look at what's under [databases] and [pgbouncer] in turn.

PgBouncer Database Settings

We'll assume you have a Compose for PostgreSQL database up and ready to work with.

Under [databases] that's where you create aliases for the databases you've set up, or the databases you want PgBouncer to manage.

[databases]
compose = host=sl-us-south-7-portal.99.dblayer.com port=99999 dbname=compose user=manager password=AAAAAAAAAAAAA  
dev = host=sl-us-south-7-portal.99.dblayer.com port=99999 dbname=myapp user=dev password=BBBBBBBBBBB  

As you can see, we have two databases in our Compose for PostgreSQL deployment. We've created users and passwords for these in our deployment because you don't want to give your administrator password out to this file. It's a lot safer to designate individual usernames and passwords for databases.

With two databases set up, let's look at PgBouncer settings.

PgBouncer Pool Settings

Under [pgbouncer], are the settings that PgBouncer uses for configuring access to the pool, the number of connections allowed, ports to listen to, etc. The settings we've provided are very basic and can be modified at your leisure.

[pgbouncer]
listen_addr = *  
listen_port = 6432  
auth_type = md5  
auth_file = /some/location/userlist.txt  
admin_users = app_admin, app_admin2  
server_tls_sslmode = require  
pool_mode = session  
max_client_conn = 100  
default_pool_size = 20  
logfile = /some/location/pgbouncer.log  
pidfile = /some/location/pgbouncer.pid  

listen_addr is the address where you want to listen for TCP connections. The * indicates all addresses. listen_port refers to the port of the TCP connection; therefore, when you're connecting up your client, it will use that port, and when we connect to the Docker container we need to expose this port.

auth_type and auth_file really go together. auth_type refers to how you want PgBouncer to authenticate users - we don't authenticate against PostgreSQL here, only PgBouncer, using the userlist.txt file. This file contains a username and the MD5 hash of a password and that username. Again, do not put your administrator password here. The format of the text file looks like:

"app1" "md5<MD5 hash password+username>"
"app2" "md5<MD5 hash password+username>"
"app_admin" "md5<MD5 hash password+username>"
"app_admin2" "md5<MD5 hash password+username>"

For the admin_users setting, you can assign users from your userlist.txt as admins who will have access to a special pgbouncer database that provides performance-related information about PgBouncer.

Since you connect over TLS to Compose for PostgreSQL, you'll also need to set server_tls_sslmode to require, which is the same as you have in your connection string.

For the pool_mode, you have three options: transaction, session, and statement. "Session" mode is the default option, for when an application connects to PgBouncer, it gets a connection from the pool, connects to PostgreSQL using PgBouncer's connection, and then the application returns the connection when it's done. "Transaction" mode is not a constant connection between the client and the server, depending on the release of a connection by the application, but it releases the connection once a transaction by the application is finished. "Statement" mode means that the connection is returned after any query is finished, basically, it requires you to auto-commit. Each one of these approaches has its benefits and tradeoffs, and you can read about each one here. Usually, it's preferred to run in session or transaction mode, with transaction mode being preferable in many cases.

For the settings max_client_conn and default_pool_size, the former refers to the number of applications that will make connections and the latter is how many server connections per database. The defaults are set at 100 and 20, respectively. PgBouncer has a formula for determining the pool size and the number of clients that you should set, but the default is usually more than enough.

Building PgBouncer

With the configuration set up, we're ready to build a Docker container image that will run PgBouncer. From the same directory where you downloaded the example Dockerfile and pgbouncer folder, run:

docker build . -t pgbouncer  

This will create an image with PgBouncer installed with the appropriate pgbouncer.ini and userlist.txt files added for us.

Next, let's run the image:

docker run -d --name pgbouncer -p 6432:6432 --mount source=pgbouncer,target=/etc/pgbouncer pgbouncer  

Here, we run the image in the background -d, or you can use -it to view the logs interactively, open port 6432 up which we'll use to connect to PgBouncer, mount the pgbouncer volume to the container, and use the pgbouncer image we created.

Once it's running, you can start making connections to PgBouncer which will interact with PostgreSQL. For example, when PgBouncer is running with the above configuration, we can log in using:

psql -p 6432 -d dev -h 0.0.0.0 -U app_admin  

This will prompt you for the non-hashed password user "app_admin" who will connect to the dev database or the alias we configured in the [databases] section above. We don't need to specify the host in this example since we're running this locally. However, for applications, you'll need to specify the host to connect to since you won't host your apps locally, we hope.

That's it for connecting to the database ... let's look at PgBouncer stats, briefly.

Getting PgBouncer Stats

If you want to view the stats that PgBouncer collects to determine how your application's connections are running, who's connecting, and any other information, then PgBouncer has it's own database that it creates to provide you with this information. Using one of the admins for PgBouncer, you'd connect like:

psql -p 6432 -U app_admin -h 0.0.0.0 pgbouncer  

This will give you a prompt like:

psql (9.6.9, server 1.9.0/bouncer)  
Type "help" for help.

pgbouncer=#  

From here, you can get all of the commands by running SHOW HELP;.

NOTICE:  Console usage  
DETAIL:  
    SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|VERSION
    SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM
    SHOW DNS_HOSTS|DNS_ZONES
    SHOW STATS|STATS_TOTALS|STATS_AVERAGES
    SET key = arg
    RELOAD
    PAUSE [<db>]
    RESUME [<db>]
    DISABLE <db>
    ENABLE <db>
    RECONNECT [<db>]
    KILL <db>
    SUSPEND
    SHUTDOWN

SHOW  

Here's the pools look like running when connecting the dev and pgbouncer databases SHOW POOLS:

 database  |   user    | cl_active | cl_waiting | sv_active | sv_idle | sv_used | sv_tested | sv_login | maxwait | maxwait_us | pool_mode 
-----------+-----------+-----------+------------+-----------+---------+---------+-----------+----------+---------+------------+-----------
 dev       | dev       |         1 |          0 |         0 |       0 |       1 |         0 |        0 |       0 |          0 | session
 pgbouncer | pgbouncer |         1 |          0 |         0 |       0 |       0 |         0 |        0 |       0 |          0 | statement
(2 rows)

And the connected clients SHOW CLIENTS:

type | user  | database  | state  | addr | port | local_addr | local_port |    connect_time     |    request_time     | wait | wait_us | close_needed |      ptr       | link | remote_pid | tls  
------+-------+-----------+--------+------+------+------------+------------+---------------------+---------------------+------+---------+--------------+----------------+------+------------+-----
 C    |    app1   | dev       | active | unix | 6432 | unix       |       6432 | 2018-08-30 07:08:02 | 2018-08-30 07:08:07 |    0 |       0 |            0 | 0x7ffba90035a0 |      |          0 | 
 C    | app_admin | pgbouncer | active | unix | 6432 | unix       |       6432 | 2018-08-30 07:09:22 | 2018-08-30 07:20:26 |  662 |  104416 |            0 | 0x7ffba9003410 |      |          0 | 

Time to Bounce ...

This was an introduction to getting set up with PgBouncer and Compose for PostgreSQL. PgBouncer allows you to easily set up a connection pool that keeps your database performative, while also managing the connections between your applications and PostgreSQL. The overview provided is only an example of what you can do with the default setting so if your applications are connection hungry, or you have a lot of applications using PostgreSQL, you might consider making adjustments, or exploring other options that PgBouncer has to offer.

attribution Luana Azevedo via Unsplash

Abdullah Alger
Abdullah Alger is a former University lecturer who likes to dig into code, show people how to use and abuse technology, talk about GIS, and fish when the conditions are right. Coffee is in his DNA. Love this article? Head over to Abdullah Alger’s author page to keep reading.

Conquer the Data Layer

Spend your time developing apps, not managing databases.