PostgreSQL Tips: Documenting the Database


PostgreSQL comes with a command - COMMENT - that's not widely used but is useful if you want to document what's in your database. In this PostgreSQL Tips we'll show how to get started using it.

As your database grows and data relations become more complex, keeping track of everything you've added in your database can get difficult. To record how data is organized and what components might have been added or changed over time, it's necessary to add some kind of documentation.

Documentation could be written in an external file, for example, but that can quickly become outdated. PostgreSQL has a solution to this problem: the COMMENT command. With it you can add descriptions to various database objects like columns, indices, tables, and functions that can be updated whenever needed.

In this article, we'll show you how to use this command by documenting a database of United States placenames and their geographic coordinates that we've put together. Let's get started ...

Viewing Data and Adding COMMENTs

PostgreSQL's psql interative shell contains a lot of powerful commands to view and manipulate data). \d is one you should get comfortable with since it will show you a list of all visible tables, views, materialized views, sequences and foreign tables. There are also several combinations of \d that you can use to specify whether you want to look at indices, mappings, constraints, etc. In combination with + (e.g. \d+), the command will provide you with and extended view of your objects that include a Description column, which is where the documentation or COMMENT gets written.

The COMMENT command is how we add descriptions of our data to database objects. Don't confuse COMMENT with \* *\ or -- in SQL since these are written inside an SQL file and not visible in the database. COMMENT, on the other hand, is not standard SQL and exclusive to PostgreSQL.

There are a plethora of database objects that we can use the COMMENT command with. The most obvious of them are tables, indices, and columns, but the PostgreSQL documentation has a growing list of 41 object types you can add descriptions to.

A comment, or description, can be added using the following format - here's the PostgreSQL documentation as a reference:

COMMENT ON [DATABASE OBJECT] [object name] IS 'your description'  

You must be the owner of the object or the administrator to use COMMENT, however. Here's what it looks like in action ...

Descriptions of Tables

To view the Description column for the placenames table, run the \d+ command in psql (here's a list of \d commands in psql). Simply running the \d command alone will not show this column so you'll need to add the + to make it visible.

Running \d+ to show the table and its description gives us:

                     List of relations
 Schema |    Name    | Type  | Owner |  Size  | Description
 public | placenames | table | admin | 364 MB |

Since this is a new table, the Description column is empty. Let's sort that out by adding a COMMENT.

COMMENT ON TABLE placenames IS 'A table of placenames with geospatial coordinates';  

Now running \d+ again, we'll see that the Description column is populated with the new description.

                                        List of relations
 Schema |    Name    | Type  | Owner |  Size  |                    Description
 public | placenames | table | admin | 364 MB | A table of placenames with geospatial coordinates

That's basically all it takes to add information describing your table. So, let's also move on to describing the table's columns ...

Description of Columns

Likewise, we can update the description column of each of the table's columns. To view that, we run \d+ but add the table name to it like \d+ placenames giving us:

                                         Table "public.placenames"
      Column      |       Type       | Collation | Nullable | Default | Storage  | Stats target | Description
 feature_id       | bigint           |           |          |         | plain    |              |
 feature_name     | text             |           |          |         | extended |              |
 feature_class    | text             |           |          |         | extended |              |
 state_alpha      | text             |           |          |         | extended |              |
 state_numeric    | integer          |           |          |         | plain    |              |
 county_name      | text             |           |          |         | extended |              |
 county_numeric   | integer          |           |          |         | plain    |              |
 primary_lat_dms  | text             |           |          |         | extended |              |
 primary_long_dms | text             |           |          |         | extended |              |

Extending the table view here, you can see that we also get other columns as well. Adding descriptions for each column is similar to how we added one to the table. But, now we use the following syntax:

COMMENT ON COLUMN placenames.feature_id IS 'ID of the placename';  
COMMENT ON COLUMN placenames.feature_name IS 'The placename';  
COMMENT ON COLUMN placenames.feature_class IS 'The description of the place i.e. lake, house, etc.';  
COMMENT ON COLUMN placenames.state_alpha IS 'Abbreviation of the state';  
COMMENT ON COLUMN placenames.state_numeric IS 'ID of the state';  
COMMENT ON COLUMN placenames.county_name IS 'Name of the county where the placename is located';  
COMMENT ON COLUMN placenames.county_numeric IS 'ID of the county';  
COMMENT ON COLUMN placenames.primary_lat_dms IS 'Latitude Degrees Minutes Seconds';  
COMMENT ON COLUMN placenames.primary_long_dms IS 'Longitude Degrees Minutes Seconds';  

After adding the descriptions, we'll see something like the following running \d+ placenames:

                                               Table "public.placenames"
      Column      |       Type       | Collation | Nullable | Default | Storage  | Stats target |                     Description
 feature_id       | bigint           |           |          |         | plain    |              | ID of the placename
 feature_name     | text             |           |          |         | extended |              | The placename
 feature_class    | text             |           |          |         | extended |              | The description of the place i.e. lake, house, etc.
 state_alpha      | text             |           |          |         | extended |              | Abbreviation of the state
 state_numeric    | integer          |           |          |         | plain    |              | ID of the state
 county_name      | text             |           |          |         | extended |              | Name of the county where the placename is located
 county_numeric   | integer          |           |          |         | plain    |              | ID of the county
 primary_lat_dms  | text             |           |          |         | extended |              | Latitude Degrees Minutes Seconds
 primary_long_dms | text             |           |          |         | extended |              | Longitude Degrees Minutes Seconds

As you can see, all of the columns have a description now attached to them. So unclear column names like primary_lat_dms can be explained to the end user without ambiguity.

Let's now create an index and describe what it's for.

Description of Indices

Indexing the feature_id column and adding a comment to it, unfortunately, will not be shown by just running \d+ placenames. You'll have to use another command \di+ that will show you all the indices of your tables and their descriptions.

To view what the index descriptions look like, let's first create an index and add a comment:

CREATE INDEX placenames_idx ON placenames (feature_id);  
COMMENT ON INDEX placenames_idx IS 'Index on the placenames table feature_id column';  

Now, using the \di+ command, we can view all the indexes for all the tables in the database. Since we only have one index right now, we'll see:

                                               List of relations
 Schema |      Name      | Type  | Owner |   Table    | Size  |                   Description
 public | placenames_idx | index | admin | placenames | 49 MB | Index on the placenames table feature_id column

As you add more indices to your tables, this list will obviously grow. Nonetheless, consult the psql documentation for more information about the various views that are provided by the \d command.

Dumping and Restoring COMMENTs

If you're migrating over your PostgreSQL database using pg_dump, any descriptions made using COMMENT will be safe. They are stored in the dump file as:

-- Name: TABLE placenames; Type: COMMENT; Schema: public; Owner: admin

COMMENT ON TABLE public.placenames IS 'A table of placenames with geospatial coordinates';  

So once you restore your database, you don't have to worry about whether you'll have to rewrite all those descriptions. PostgreSQL has you covered.

Final Comment

Adding descriptions or comments to your database objects will help you immensely as your database grows. If you have a well-documented database, users will be able to understand how you've structured your data, and the margin of error for forgetting why objects have been added, what they are for, and how the data is structured will surely diminish.

attribution Joanna Kosinska

Abdullah Alger
Abdullah Alger is a former University lecturer who likes to dig into code, show people how to use and abuse technology, talk about GIS, and fish when the conditions are right. Coffee is in his DNA. Love this article? Head over to Abdullah Alger’s author page to keep reading.

Conquer the Data Layer

Spend your time developing apps, not managing databases.