In this Write Stuff article, Gareth Dwyer writes about using SQLAlchemy, a Python SQL toolkit and ORM, discussing the advantages of using it while performing database operations. He shows the differences between using raw SQL and using an ORM, and gives examples that perform CRUD operations on a PostgreSQL database.
If you build a database application using Python, you have a few options on how to interact with your database from your Python code. Many developers who are already familiar with Python and SQL will simply install a thin Python wrapper to the database driver and write all the database logic using Python strings to execute raw SQL against the database. This article is aimed at those who have some experience with Python and SQL, but who are not familiar with SQLAlchemy.
SQLAlchemy offers several benefits over the raw SQL approach, including:
- Cleaner code: Having SQL code as Python strings gets messy pretty quickly,
- More secure code: Using SQLAlchemy's ORM functionalities can help mitigate against vulnerabilities such as SQL injection,
- Simpler logic: SQLAlchemy allows us to abstract all of our database logic into Python objects. Instead of having to think on a table, row, and column level, we can consider everything on a class, instance, and attribute level.
Getting started with using SQLAlchemy can seem pretty daunting, as a lot of the documentation and tutorials assume a fairly high level of familiarity with what SQLAlchemy is, what it does, and how it works. In this article, I'll show how you can use SQLAlchemy at three different layers of abstraction. The lowest layer is using only SQLAlchemy's engine component to execute raw SQL. The middle layer is using SQLAlchemy's expression language to build SQL statements in a more Pythonic way than using raw SQL strings. The highest extraction layer is using SQLAlchemy's full Object Relational Mapping (ORM) capabilities which allows one to think in terms of Python classes and objects instead of database tables and connections.
In this article, we'll:
- give a brief introduction to SQLAlchemy and outline the three layers of abstraction at which it can be used,
- give practical examples of Create, Read, Update, and Delete (CRUD) operations for each layer of abstraction
- present some discussion on when each of the abstraction layers might be most suitable.
The example code uses Python 3.5 and SQLAlchemy 1.1.3, but it should be almost identical for other versions of Python and SQLAlchemy.
SQLAlchemy is best understood by thinking of it as two distinct components, SQLAlchemy Core and SQLAlchemy ORM. The latter is built on top of the former, but you can use either component exclusively to build your application, depending on which level of abstraction you prefer. Let's take a quick look at some comparable insert statements, using raw SQL, the SQLAlchemy Expression Language, and the SQLAlchemy ORM.
To insert a new film into a film database, you might write a raw SQL statement that looks as follows:
statement = "INSERT INTO films (title, director, year) VALUES ('Doctor Strange', 'Scott Derrickson', '2016')" conn.execute(statement)
Using the SQLAlchemy expression language, we would create the same statement using more Pythonic code that looks similar to this (we'll discuss later exactly what the
films variable is doing here and how to set it up).
statement = films.insert().values(title="Doctor Strange", director="Scott Derrickson", year="2016") conn.execute(statement)
Once you get used to this expression language, it's a lot cleaner than the raw SQL. We don't have to worry about apostrophes and string formatting so much, and we can probably start using more of our IDE's code linting and autocompletion functions. But we're still essentially thinking about our database logic on a table, column, and row level —
films is still our table, which we call
insert() on, and pass values for each column in order to form a new row.
If we use the full ORM from SQLAlchemy, we'll get to the highest level of abstraction, and instead of creating an insert statement to add a new film to our database, we could do everything on a Python Object level, and our code would look something like:
doctor_strange = Film("Doctor Strange", "Scott Derrickson", "2016") db_session.add(doctor_strange)
In this example, we simply create a Film object, and "add" it to our database session. The last example has even more going on in the background that we haven't shown here: we need to set up a
Film class and the
db_session correctly for this to work. Again, we'll show this in more detail below, but the important thing to understand for now is that these are three distinct ways to use SQLAlchemy, and when building a database application you'll need to choose which of the three best suits your needs.
Create, Read, Update, and Delete using Raw SQL
In this section, we'll show how to run each of the CRUD operations against PostgreSQL through SQLAlchemy, using only raw SQL statements. Using this method, we get none of the advantages used above, and we may as well use only a basic Python database driver (e.g.
psycopg, which we need to install to use SQLAlchemy with PostgreSQL in any case).
The following assumes that you have access to a PostgreSQL database and that you have installed both the
psycopg2 python packages (both available through pip).
from sqlalchemy import create_engine db_string = "postgres://admin:firstname.lastname@example.org:15813/compose" db = create_engine(db_string) # Create db.execute("CREATE TABLE IF NOT EXISTS films (title text, director text, year text)") db.execute("INSERT INTO films (title, director, year) VALUES ('Doctor Strange', 'Scott Derrickson', '2016')") # Read result_set = db.execute("SELECT * FROM films") for r in result_set: print(r) # Update db.execute("UPDATE films SET title='Some2016Film' WHERE year='2016'") # Delete db.execute("DELETE FROM films WHERE year='2016'")
The above code assumes that you have already created a database (called
test) and that you have a database user with a password set up. It shows the simplest possible examples of the four main database operations (create, read, update, and delete). We create a table and insert the film then we get the data we just inserted. After that we modify the data, and, finally, we delete it. Even with such a simple case, our database code is already getting messy. The strings are long, and we'd have issues if we needed to insert double or single quotation marks in any of the fields, as we are already using both (double for the Python strings and single for the SQL strings). This code would become even messier if we were to start using dynamic values, if we needed to run more complicated queries, or if we were dealing with anything more complicated than our single-table toy database. Let's take a look at the same example using the SQLAlchemy's SQL Expression Language.
Create, Read, Update, and Delete using the SQL Expression Language
In this section, we'll show how to achieve exactly the same as above, but using the SQL Expression Language that SQLAlchemy provides, instead of using raw SQL strings. We'll see that this code is a little bit more verbose, and also fairly complicated and difficult to read, so we don't gain that much over the raw SQL queries above. But we can see that we're already starting to get closer to interacting with our database in a pythonic way.
from sqlalchemy import create_engine from sqlalchemy import Table, Column, String, MetaData db_string = "postgres://admin:email@example.com:15813/compose" db = create_engine(db_string) meta = MetaData(db) film_table = Table('films', meta, Column('title', String), Column('director', String), Column('year', String)) with db.connect() as conn: # Create film_table.create() insert_statement = film_table.insert().values(title="Doctor Strange", director="Scott Derrickson", year="2016") conn.execute(insert_statement) # Read select_statement = film_table.select() result_set = conn.execute(select_statement) for r in result_set: print(r) # Update update_statement = film_table.update().where(film_table.c.year=="2016").values(title = "Some2016Film") conn.execute(update_statement) # Delete delete_statement = film_table.delete().where(film_table.c.year == "2016") conn.execute(delete_statement)
Note that we've had to add a few more imports at the top of this snippet to allow us to talk about the same concepts as before (Table, Column and String) using Python instead of SQL. The biggest change is creating a Table as a Python class, which takes a variable number of Column objects as arguments, and then calling
.create() on this to actually create the table in Postgres.
Our four CRUD operations are a direct parallel of what we did before, but instead of writing the statement as a long string, we can chain together various functions such as
values() that are provided by SQLAlchemy's SQL Expression Language.
Create, Read, Update, and Delete using the SQL ORM
In this last section, we'll see how to do the same thing as in the previous examples using the full ORM. Our code is now longer again in terms of lines, but it feels more concise in terms of line length, and it is more readable (once you're used to the concepts behind the ORM). At this point, we can largely ignore database concepts such as tables, and think only in terms of Python objects.
from sqlalchemy import create_engine from sqlalchemy import Column, String from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.orm import sessionmaker db_string = "postgres://admin:firstname.lastname@example.org:15813/compose" db = create_engine(db_string) base = declarative_base() class Film(base): __tablename__ = 'films' title = Column(String, primary_key=True) director = Column(String) year = Column(String) Session = sessionmaker(db) session = Session() base.metadata.create_all(db) # Create doctor_strange = Film(title="Doctor Strange", director="Scott Derrickson", year="2016") session.add(doctor_strange) session.commit() # Read films = session.query(Film) for film in films: print(film.title) # Update doctor_strange.title = "Some2016Film" session.commit() # Delete session.delete(doctor_strange) session.commit()
The first thing to note is the change in imports again. We no longer need to import the
Table class, but instead we import
sessionmaker. Instead of creating tables, we'll create Python classes that subclass
declarative_base, and instead of making a connection to our database we'll ask for a session. Both of these concepts are a higher layer of abstraction than the ones we used previously.
As before we still create a database engine, and now we also instantiate a
declarative_base. Instead of defining our Film class as a
Table, we create a normal Python object which subclasses
base and which defines
__tablename__. As before, we define the columns and column types for our Film object, but now we can use the attributes
year, instead of using the strings that we had before.
Sessions have some complications and subtleties that we won't cover in detail here. You can think of a session as an intelligent connection to our database that will watch what we're doing in our Python code and modify the database as necessary. Sessions have one more layer of abstraction than you would expect — first we have to instantiate a
sessionmaker and link it to our engine (
Session = sessionmaker(db)) and then we have to instantiate this to open an actual session (
session = Session()). It's well worth reading all the nitty-gritty details of how sessions work in the official documentation.
declarative_base also provides some shortcuts — For example, we create our table by calling
base.metadata.create_all(), which examines the schema that we implicitly created by declaring the Film class, and sends a
CREATE TABLE command to our database.
We can insert data into our database by instantiating Python objects and passing them to
session.add(). In our earlier examples, we used a database connection which defaults to auto commit. Sessions do not have this default, so we need to call
session.commit() after any changes to our database.
Perhaps the starkest difference is in our update example. Here we don't pass anything to the session at all — we simply modify our Python object, and then call
session.commit(), which notices the modification and makes the update call to the database.
We looked at three different ways to use SQLAlchemy in this article and showed the basic database operations for each possibility. The final thing to discuss is how to choose between the three ways of using SQLAlchemy.
You probably never want to use the first method (using raw SQL strings), unless you are writing a throwaway script that you need to run only once and are already using SQLAlchemy for the rest of your code base. This method can also be useful if you have existing complex SQL code that you need to run. However, usually, if you want to run raw SQL from Python, you can use a simpler Python database driver such as
psycopg2 directly, without the need to install SQLAlchemy as well.
The second method (SQL Expression Language) can be useful if you need to do something unusual or to run some SQL code that is specific to a certain database. Because the ORM layer tries to abstract away completely from which database you're using, it is sometimes not possible to use it to run SQL code that is highly customized and which targets non-general features of a specific backend. The Expression Language is designed to map to primitive constructs of each backend, and can, therefore, be useful in advanced scenarios.
The final method we showed (The ORM) is useful to keep your code Pythonic and to abstract away from the database completely. It allows one to deal directly with Python classes instead of Tables, with instantiations of objects instead of rows, and with object attributes instead of columns. It also abstracts away from the specific database flavor that you're using, and thus switching your codebase to a different backend can be as simple as changing the call to
create_engine, if you've only used functionality that is general to all RDBMSs. Using the ORM for general database applications that do not require specializations provided by only a subset of RDBMSs is thus highly advantageous and you should strongly consider using it wherever possible.
attribution Brianna Fairhurst