Embrace SQL with HugSQL, Clojure and PostgreSQL

By using annotated SQL to auto generate Clojure functions to access PostgreSQL, HugSQL embraces a SQL first approach to development. Instead of relying on an ORM or some other data abstraction for your relational model, with the SQL first approach you work straight from SQL code by annotating SQL statements with function names, inputs, and returns types.

What's great is that you can load your SQL files from anywhere in your project which lets you organize things your way. Then, once you write your SQL, you can call HugSQL's simple macro def-db-fns which will parse the SQL in order to auto generate Clojure functions in whichever namespace you choose. We can best see how this works by building an example app around some existing data.

Setup

To get started we use the ubiquitous Clojure build tool: leiningen. We'll call our app ted in honor of Ted Codd, the father of the relational model:

bash-3.2$ lein new app ted  
Generating a project called ted based on the 'app' template.  
bash-3.2$ tree  
.
└── ted
    ├── CHANGELOG.md
    ├── LICENSE
    ├── README.md
    ├── doc
    │   └── intro.md
    ├── project.clj
    ├── resources
    ├── src
    │   └── ted
    │       └── core.clj
    └── test
        └── ted
            └── core_test.clj

This just created the basic skeleton of a Clojure app. From here we need to pull in some basic dependencies by adding some lines to our ted/project.clj file:

(defproject ted "0.1.0"
  :description "ted -- exploring hugsql"
  :url "http://www.hugsql.org/"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [org.postgresql/postgresql "9.4.1207"]
                 [com.layerware/hugsql "0.4.5"]]
  :main ^:skip-aot ted.core
  :target-path "target/%s"
  :profiles {:uberjar {:aot :all}})

The most important things added above are in the :dependencies key to include both PostgreSQL and HugSQL. After running lein deps, we have the dependencies needed. Next we create the last setup piece which is the PostgreSQL database.

Creating a deployment of PostgreSQL is straightforward enough. You can create a trial account and then choose to create a new PostgreSQL deployment from Compose.

After that we can seed a database with a schema and data from a recent PostgreSQL article. While HugSQL certainly supports DDL (Data Definition Language) statements we won't use them in this example app for brevity. To download the example output from pg_dump which can be then loaded into your database use the following:

wget https://raw.githubusercontent.com/compose-ex/festivaldata/master/seed_festival.sql

or you can just clone the repo here.

Then load it into your database by sourcing it from within psql:

psql (9.4.4, server 9.4.6)  
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)  
Type "help" for help.

compose=> \i seed_festival.sql 

Assuming you were in the same directory as the file when you started up psql then you should be ready to finally get to the good part of looking at the code.

Create Code

To start, we will create a ted/src/ted/db.clj file for the database config information. This info is passed off to clojure.java.jdbc to handle connecting to PostgreSQL. It is in its own file to namespace it and keep things "clean". It contains:

(ns ted.db)

(def db
  {:classname "org.postgresql.Driver"
   :subprotocol "postgresql"
   :subname "//<db_host_name>:<db_port>/<db_name>"
   :user "music_lover"
   :password "<user_password>"
   :sslmode "require"
   })

Next in the same mindset of "breaking things up and namespacing them" we add ted/src/ted/sql.clj and it contains:

(ns ted.sql
  (:require [hugsql.core :as hugsql]))

(hugsql/def-db-fns "ted/sql/festival.sql")

Basically, it just "wires up" the def-db-fns macro so that when we call it later it will parse the "festival.sql" file and create the fns for us in this ted.sql namespace (it could be another namespace like ted.festival too). So, finally we are to the SQL file:

-- :name artists-all
-- :command :query
-- :result n 
-- :doc select all the artists with all the attributes
SELECT *  
  FROM artists

-- :name artists
-- :command :query
-- :result n
-- :doc select all with cols defined by {:cols [<col_name>...]}
SELECT :i*:cols  
  FROM artists

-- Note the terse style below
-- ":command :query" -> ":?"
-- ":result n" -> ":*" 

-- :name artist-by-id :? :*
-- :doc get artist info by music_brainz id
SELECT *  
  FROM artists
 WHERE mb_id = :id

-- :name artists-by-name-like :? :*
-- :doc use {:name-like "P%"} as the option param to get the P's
SELECT *  
  FROM artists
 WHERE name 
  LIKE :name-like

-- :name new-rating :! :n
-- :doc insert new ratings
INSERT INTO ratings (artist_name, email, rating_type_name, rating)  
VALUES (:artist_name, :email, :rating_type_name, :rating)

-- :name new-rating-type :! :n
-- :doc insert new rating-types
INSERT INTO rating_types (name, description)  
VALUES (:name, :description)

This is where the real "magic" occurs. The SQL comment lines are parsed and transformed into actual Clojure functions by the def-db-fns macro we looked at earlier. Once we execute that, it loads this SQL file and creates all of the following functions ["artists-all", "artists", "artists-by-name-like", "artist-by-id", "new-rating", "new-rating-type"]. What is nice about this is when we require all of this together in our ted.core it will be ready to use. To look at some queries we can use the Clojure REPL(Read Eval Print Loop) which Lisps do very well. Here follows the last code file ted/src/ted/core.clj which really does little more than pull all of the namespaces together:

(ns ted.core
  (:require [ted.db :refer [db]]
            [ted.sql :as sql]
            [clojure.pprint :as pprint]))

(defn -main
  [& args]
  (println "Hello, World!"))

Now that we are done creating code, we can take one last look at the project structure. We created ted/src/ted/db.clj, ted/src/ted/sql.clj, and ted/src/ted/sql/festival.sql. Plus, we changed a little in ted/project.clj and ted/src/ted/core.clj. Really, not that much:

bash-3.2$ tree  
.
├── CHANGELOG.md
├── LICENSE
├── README.md
├── doc
│   └── intro.md
├── project.clj
├── resources
├── src
│   └── ted
│       ├── core.clj
│       ├── db.clj
│       ├── sql
│       │   └── festival.sql
│       └── sql.clj
└── test
    └── ted
        └── core_test.clj

After all of that, once we start the REPL everything will be wired up since the ted/project.clj points at ted/src/ted/core.clj as the main entry point. We are now good to go with an interactive REPL session:

sh-3.2$ lein repl  
nREPL server started on port 51020 on host 127.0.0.1 - nrepl://127.0.0.1:51020  
REPL-y 0.3.7, nREPL 0.2.12  
Clojure 1.8.0  
Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14  
    Docs: (doc function-name-here)
          (find-doc "part-of-name-here")
  Source: (source function-name-here)
 Javadoc: (javadoc java-object-or-class-here)
    Exit: Control+D or (exit) or (quit)
 Results: Stored in vars *1, *2, *3, an exception in *e

ted.core=> (pprint (sql/artists db {:cols ["name"]}))  
({:name "Pearl Jam"}
 {:name "Dead & Company (2 Sets)"}
 {:name "LCD Soundsystem"}
 {:name "J. Cole"}
 {:name "Ellie Goulding"}
 {:name "Macklemore & Ryan Lewis"}
 {:name "Tame Impala (Late Night)"}
 {:name "Death Cab for Cutie"}
 ... elided ...
 {:name "Roman GianArthur Irvin"}
 {:name "LANY"}
 {:name "Con Brio"}
 {:name "Flux Capacitor"})
nil

This first function sql/artists took a map db of PostgreSQL connection data plus a map of options which used the :cols key to pick the columns for the query. In SQL terms this is a projection of "name" from Artists. Note that it returned this as a list of maps with the :name keyword. Already usable!

Next let's look at something a little more complicated:

ted.core=> (pprint (sql/artists-by-name-like db {:name-like "L%"}))  
({:name "LCD Soundsystem",
  :mb_id #uuid "2aaf7396-6ab8-40f3-9776-a41c42c8e26b",
  :date_formed #inst "2001-12-31T06:00:00.000-00:00"}
 {:name "Leon Bridges",
  :mb_id #uuid "69d9dfd7-19b7-4a75-8a53-9f733fb5d774",
  :date_formed #inst "1989-07-13T05:00:00.000-00:00"}
 {:name "Lamb of God",
  :mb_id #uuid "298909e4-ebcb-47b8-95e9-cc53b087fc65",
  :date_formed #inst "1998-12-31T06:00:00.000-00:00"}
 {:name "Lord Huron",
  :mb_id #uuid "51c76d97-0635-433c-b26d-66beb81657f8",
  :date_formed #inst "2030-12-31T06:00:00.000-00:00"}
 {:name "Lucius",
  :mb_id #uuid "9e233b0e-9e9d-4865-9d66-18fbcaf02bd8",
  :date_formed #inst "1996-02-29T06:00:00.000-00:00"}
 {:name "Lettuce",
  :mb_id #uuid "e88313e2-22f6-4f6d-9656-6d2ad20ea415",
  :date_formed #inst "1992-12-31T06:00:00.000-00:00"}
 {:name "Lizzo",
  :mb_id #uuid "8fb5370b-9568-4b61-9da5-2aa12c9928db",
  :date_formed #inst "1988-04-27T05:00:00.000-00:00"}
 {:name "Lolawolf",
  :mb_id #uuid "a6307c9f-69c9-4bc7-b59b-dff485376a0a",
  :date_formed #inst "2030-12-31T06:00:00.000-00:00"}
 {:name "LANY",
  :mb_id #uuid "f1cc066a-6dd1-4b18-ae68-3708472ab4b1",
  :date_formed #inst "2030-12-31T06:00:00.000-00:00"})
nil

Once again we get a list of maps which are very easy to work with especially in Clojure. artists-by-name-like is just another straightforward function wrapping SQL. Look at the original SQL annotation:

-- :name artists-by-name-like :? :*
SELECT *  
  FROM artists
 WHERE name 
  LIKE :name-like

and compare it to the function call:

(sql/artists-by-name-like db {:name-like "L%"})

Then look again at its output above. Pretty easy.

How about something even more complicated? What about INSERT and transactions? Let's look. First the SQL:

-- :name new-rating :! :n
-- :doc insert new rating
INSERT INTO ratings (artist_name, email, rating_type_name, rating)  
VALUES (:artist_name, :email, :rating_type_name, :rating)

-- :name new-rating-type :! :n
-- :doc insert new rating-type
INSERT INTO rating_types (name, description)  
VALUES (:name, :description)

Then from the REPL:

ted.core=> (clojure.java.jdbc/with-db-transaction [tx db]  
      #_=>   (sql/new-rating-type tx {:name "hays ratings" :description "ratings by hays"})
      #_=>   (sql/new-rating tx {:email "hays@compose.io" :rating 5 :artist_name "Pearl Jam" :rating_type_name "hays ratings"})
      #_=> )
1

Nicely done. The functions took tx instead of a db like before. So, you can use transactions if you need to (there are connections and pooling and snippets too).

Now that we have you started you can check out HugSQL's docs to get an even more in depth look at things like DDL, connection pooling, and other functionality that may be useful in a project of yours. And, as with all SQL generation tools you should always me mindful of SQL injection attacks. Do be sure to at least look at this at some point too.

Better to Have the One Hundred Functions with One Data Structure

"It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." —Alan Perlis

With HugSQL it is easy to manage your application's data with just a few very common structures like lists and maps. It is also easy to generate a bunch of functions that integrate with your database. HugSQL abstracts at an interesting and sensible layer of your code. It is pragmatic and functional in a very Clojure way. Not too big so it doesn't overload your thinking. Not too small so it actually provides a lot of functionality. If you are already in the Clojure world then adding PostgreSQL with HugSQL to your toolkit is a good choice. Cheers.