PostgreSQL Tips: Documenting the Database
PublishedPostgreSQL comes with a command - COMMENT
- that's not widely used but is useful if you want to document what's in your database. In this PostgreSQL Tips we'll show how to get started using it.
As your database grows and data relations become more complex, keeping track of everything you've added in your database can get difficult. To record how data is organized and what components might have been added or changed over time, it's necessary to add some kind of documentation.
Documentation could be written in an external file, for example, but that can quickly become outdated. PostgreSQL has a solution to this problem: the COMMENT
command. With it you can add descriptions to various database objects like columns, indices, tables, and functions that can be updated whenever needed.
In this article, we'll show you how to use this command by documenting a database of United States placenames and their geographic coordinates that we've put together. Let's get started ...
Viewing Data and Adding COMMENT
s
PostgreSQL's psql interative shell contains a lot of powerful commands to view and manipulate data). \d
is one you should get comfortable with since it will show you a list of all visible tables, views, materialized views, sequences and foreign tables. There are also several combinations of \d
that you can use to specify whether you want to look at indices, mappings, constraints, etc. In combination with +
(e.g. \d+
), the command will provide you with and extended view of your objects that include a Description
column, which is where the documentation or COMMENT
gets written.
The COMMENT
command is how we add descriptions of our data to database objects. Don't confuse COMMENT
with \* *\
or --
in SQL since these are written inside an SQL file and not visible in the database. COMMENT
, on the other hand, is not standard SQL and exclusive to PostgreSQL.
There are a plethora of database objects that we can use the COMMENT
command with. The most obvious of them are tables, indices, and columns, but the PostgreSQL documentation has a growing list of 41 object types you can add descriptions to.
A comment, or description, can be added using the following format - here's the PostgreSQL documentation as a reference:
COMMENT ON [DATABASE OBJECT] [object name] IS 'your description'
You must be the owner of the object or the administrator to use COMMENT
, however. Here's what it looks like in action ...
Descriptions of Tables
To view the Description
column for the placenames table, run the \d+
command in psql (here's a list of \d
commands in psql). Simply running the \d
command alone will not show this column so you'll need to add the +
to make it visible.
Running \d+
to show the table and its description gives us:
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------+-------+-------+--------+-------------
public | placenames | table | admin | 364 MB |
Since this is a new table, the Description
column is empty. Let's sort that out by adding a COMMENT
.
COMMENT ON TABLE placenames IS 'A table of placenames with geospatial coordinates';
Now running \d+
again, we'll see that the Description
column is populated with the new description.
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------+-------+-------+--------+---------------------------------------------------
public | placenames | table | admin | 364 MB | A table of placenames with geospatial coordinates
That's basically all it takes to add information describing your table. So, let's also move on to describing the table's columns ...
Description of Columns
Likewise, we can update the description column of each of the table's columns. To view that, we run \d+
but add the table name to it like \d+ placenames
giving us:
Table "public.placenames"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
------------------+------------------+-----------+----------+---------+----------+--------------+-------------
feature_id | bigint | | | | plain | |
feature_name | text | | | | extended | |
feature_class | text | | | | extended | |
state_alpha | text | | | | extended | |
state_numeric | integer | | | | plain | |
county_name | text | | | | extended | |
county_numeric | integer | | | | plain | |
primary_lat_dms | text | | | | extended | |
primary_long_dms | text | | | | extended | |
...
Extending the table view here, you can see that we also get other columns as well. Adding descriptions for each column is similar to how we added one to the table. But, now we use the following syntax:
COMMENT ON COLUMN placenames.feature_id IS 'ID of the placename';
COMMENT ON COLUMN placenames.feature_name IS 'The placename';
COMMENT ON COLUMN placenames.feature_class IS 'The description of the place i.e. lake, house, etc.';
COMMENT ON COLUMN placenames.state_alpha IS 'Abbreviation of the state';
COMMENT ON COLUMN placenames.state_numeric IS 'ID of the state';
COMMENT ON COLUMN placenames.county_name IS 'Name of the county where the placename is located';
COMMENT ON COLUMN placenames.county_numeric IS 'ID of the county';
COMMENT ON COLUMN placenames.primary_lat_dms IS 'Latitude Degrees Minutes Seconds';
COMMENT ON COLUMN placenames.primary_long_dms IS 'Longitude Degrees Minutes Seconds';
...
After adding the descriptions, we'll see something like the following running \d+ placenames
:
Table "public.placenames"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
------------------+------------------+-----------+----------+---------+----------+--------------+-----------------------------------------------------
feature_id | bigint | | | | plain | | ID of the placename
feature_name | text | | | | extended | | The placename
feature_class | text | | | | extended | | The description of the place i.e. lake, house, etc.
state_alpha | text | | | | extended | | Abbreviation of the state
state_numeric | integer | | | | plain | | ID of the state
county_name | text | | | | extended | | Name of the county where the placename is located
county_numeric | integer | | | | plain | | ID of the county
primary_lat_dms | text | | | | extended | | Latitude Degrees Minutes Seconds
primary_long_dms | text | | | | extended | | Longitude Degrees Minutes Seconds
...
As you can see, all of the columns have a description now attached to them. So unclear column names like primary_lat_dms
can be explained to the end user without ambiguity.
Let's now create an index and describe what it's for.
Description of Indices
Indexing the feature_id
column and adding a comment to it, unfortunately, will not be shown by just running \d+ placenames
. You'll have to use another command \di+
that will show you all the indices of your tables and their descriptions.
To view what the index descriptions look like, let's first create an index and add a comment:
CREATE INDEX placenames_idx ON placenames (feature_id);
COMMENT ON INDEX placenames_idx IS 'Index on the placenames table feature_id column';
Now, using the \di+
command, we can view all the indexes for all the tables in the database. Since we only have one index right now, we'll see:
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+----------------+-------+-------+------------+-------+-------------------------------------------------
public | placenames_idx | index | admin | placenames | 49 MB | Index on the placenames table feature_id column
As you add more indices to your tables, this list will obviously grow. Nonetheless, consult the psql documentation for more information about the various views that are provided by the \d
command.
Dumping and Restoring COMMENT
s
If you're migrating over your PostgreSQL database using pg_dump
, any descriptions made using COMMENT
will be safe. They are stored in the dump file as:
--
-- Name: TABLE placenames; Type: COMMENT; Schema: public; Owner: admin
--
COMMENT ON TABLE public.placenames IS 'A table of placenames with geospatial coordinates';
...
So once you restore your database, you don't have to worry about whether you'll have to rewrite all those descriptions. PostgreSQL has you covered.
Final Comment
Adding descriptions or comments to your database objects will help you immensely as your database grows. If you have a well-documented database, users will be able to understand how you've structured your data, and the margin of error for forgetting why objects have been added, what they are for, and how the data is structured will surely diminish.
attribution Joanna Kosinska