Python and Compose - MongoDB, Elasticsearch and PostgreSQL


Hello, and welcome to the Python leg of the Compose Grand Tour. The purpose of the tour is to provide the best examples of connecting to all of the databases that are available on Compose, in a wide range of languages. In this first part, we cover Pythonic connections to MongoDB, Elasticsearch, and PostgreSQL.

At the core of the Grand Tour is an example web application with a JavaScript frontend; the backend persistence is handled by a Compose database and the featured language, in this case Python.

If you wish to familiarize yourself with the example code, it is presented in the example folder of the Compose Grand Tour/Python repository without any database connections. This standalone version lets you see the code that is common to all the examples, so you can focus on the database connection code in each one. Wherever possible, the database connections are made in the best possible way: secure TLS/SSL connections with certificates and server verification.

As this is the first of three articles, we'll have a brief overview of the app, and then cover connecting in Python to MongoDB, Elasticsearch, and PostgreSQL. Let's start this part of the tour by looking at the Python example app.

The Example App

The example is a web application where users can enter word/definition pairs and see previously entered pairs. The frontend is client-side JavaScript, and these examples provide the server-side code written in Python.

The application is structured in three tiers: a JavaScript frontend in the browser, a Python server in the middle, and a Compose database behind it. These Python examples use Flask to provide the server framework. All the other information needed to connect to the Compose databases is provided in the Connection Info section of a deployment's overview page.
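To give a sense of that shared scaffolding, here is a minimal sketch of the Flask plumbing the examples build on (the route paths and handler bodies are illustrative assumptions, not copied from the repository):

from flask import Flask, render_template

app = Flask(__name__)

@app.route('/')
def serve_page():
    # serve the frontend; the client-side JavaScript does the display work
    return render_template('index.html')

@app.route('/words', methods=['GET'])
def display_words():
    # each database example replaces this with its own read logic
    return '[]'

@app.route('/words', methods=['POST'])
def add_word():
    # each database example replaces this with its own write logic
    return ('', 204)

if __name__ == '__main__':
    app.run(port=8080)

Now, it's time to move on to the databases!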

MongoDB

The Code: The full example code is in the mongodb-example folder of the Compose Grand Tour/Python repository.

The Driver: PyMongo is MongoDB's recommended Python driver.

The Certificate: The MongoDB example assumes that the deployment was created with SSL/TLS-enabled connections. On Compose, those are backed with a self-signed certificate. To use certificate verification, save a copy of the certificate locally and keep its path handy. The certificate is on the Overview page of the deployment, and the path to the local copy should go in an environment variable PATH_TO_MONGODB_CERT.

The Connection: PyMongo supports parsing your connection information directly from a supplied URL. Copy the connection string containing both endpoints from the Connection Info section of the deployment and put it in an environment variable COMPOSE_MONGODB_URL. After pulling the connection string and the certificate path from the environment, the MongoClient establishes a connection to the deployment. For the purpose of this example (and the MongoDB grand tour examples in other languages), we connect to (or create) a database named 'grand_tour' and a 'words' collection in that database.

import os
from pymongo import MongoClient

# connection string and initialization
mongodb_url = os.environ['COMPOSE_MONGODB_URL']
path_to_cert = os.environ['PATH_TO_MONGODB_CERT']

client = MongoClient(
    mongodb_url,
    ssl=True,
    ssl_ca_certs=path_to_cert
)

# database/collection names
db = client.grand_tour
collection = db.words
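One thing worth knowing: MongoClient connects lazily, so nothing is actually sent to the server until the first operation. If you want to confirm the connection up front, a cheap command forces a round trip (a small sketch using the client created above):

from pymongo.errors import ConnectionFailure

try:
    # any inexpensive command will do; 'ismaster' doesn't require auth
    client.admin.command('ismaster')
except ConnectionFailure:
    print('MongoDB deployment is not reachable')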

The Read: Given that MongoDB stores documents, requesting all the stored words and displaying them takes only a small bit of code.
Using find(), we can get back all the documents in the 'words' collection as BSON data. PyMongo includes the bson package to convert BSON into something more Pythonic, and using its json_util module here gives back a JSON object, which is what the frontend expects to display.

from bson import json_util

# query for all the words in the collection; returns JSON for display on the page
def display_find():
    cursor_obj = collection.find({}, {"_id": 0})
    return json_util.dumps(cursor_obj)
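Why json_util rather than the standard json module? Plain json.dumps raises a TypeError on BSON types such as ObjectId, while json_util knows how to encode them. A quick illustration with a made-up document:

from bson import ObjectId, json_util

doc = {"_id": ObjectId(), "word": "ersatz"}

# json.dumps(doc) would fail here; json_util encodes the ObjectId as extended JSON
print(json_util.dumps(doc))  # {"_id": {"$oid": "..."}, "word": "ersatz"}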

The Write: Adding a new word to the MongoDB 'words' collection is an equally simple process. The new entry is created by pulling the contents of the text fields on the page, placing them into an object to be stored as a document, and inserting the document into the collection.

from flask import request

# triggers on hitting the 'Add' button; inserts a word/definition pair into the collection
def handle_words():
    new_word = {"word": request.form['word'], "definition": request.form['definition']}
    doc_id = collection.insert_one(new_word).inserted_id  # inserted_id is available if needed
    return ('', 204)

The MongoDB case is probably the simplest for both adding words to the database and retrieving them. On to Elasticsearch!

Elasticsearch

The Code: The full example code is in the elasticsearch-example folder of the Compose Grand Tour/Python repository.

The Driver: The official Python driver for Elasticsearch is elasticsearch-py.

The Certificate: Compose Elasticsearch deployments use SSL/TLS-enabled connections that are backed with a Let's Encrypt certificate. Certificate verification will use your local machine's certificate store.

The Connection: Set one of the connection strings from your deployment, including the authentication information, into the COMPOSE_ELASTICSEARCH_URL environment variable. The Elasticsearch driver does not automatically parse the connection string that Compose provides, but the urlparse function from Python's urllib.parse module will break it apart for us, and the pieces can then be supplied to the driver. There are two SSL parameters to set, use_ssl and verify_certs, which, as best practice, are both set to True.

The last piece of the connection checks to see if the 'words' index already exists, and if it does not, creates it. If you explore other grand tour language examples, they will also use this 'words' index in Elasticsearch.

import os
from urllib.parse import urlparse
from elasticsearch import Elasticsearch

# connection string and connection initialization
compose_elasticsearch_url = os.environ['COMPOSE_ELASTICSEARCH_URL']
parsed = urlparse(compose_elasticsearch_url)

es = Elasticsearch(
    [parsed.hostname],
    http_auth=(parsed.username, parsed.password),
    port=parsed.port,
    use_ssl=True,
    verify_certs=True
)

# index access/creation
if not es.indices.exists(index="words"):
    es.indices.create(index="words")
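If you're curious what urlparse actually pulls out of a connection string, here's a quick check against a made-up URL (the host and credentials are fake):

from urllib.parse import urlparse

parsed = urlparse('https://user:secret@portal0.example.composedb.com:10123/')
print(parsed.hostname)  # portal0.example.composedb.com
print(parsed.port)      # 10123
print(parsed.username)  # user
print(parsed.password)  # secret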

The Read: When searching an index in Elasticsearch, the resulting JSON document contains a lot of metadata that isn't needed for display on the web page, so the search result is parsed down to word/definition pairs. Each result matching the search is a "hit", and the actual word/definition object is embedded in the "_source" field of each hit. Each hit's "_source" object is appended to a list, and the list is then returned as a JSON object using the Python json library.

import json

def display_search():
    # Run a search for all existing words
    res = es.search(index="words", doc_type="word", body={})

    # Pull the list of hits from the search results
    hit_list = res['hits']['hits']

    # Collect the word/definition object from each hit's _source field
    words_list = []
    for hit in hit_list:
        words_list.append(hit['_source'])

    # JSON-ify the list of words
    return json.dumps(words_list)
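For reference, the raw response from es.search looks roughly like the structure below (heavily truncated, with made-up values), which is why the code digs into ['hits']['hits'] and then each hit's '_source':

# a truncated sketch of a search response; the values are made up
res = {
    "took": 2,
    "hits": {
        "total": 1,
        "hits": [
            {
                "_index": "words",
                "_type": "word",
                "_id": "AVxyz...",
                "_source": {"word": "ersatz", "definition": "an inferior substitute"}
            }
        ]
    }
}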

The Write: The function to add new words makes an object from the contents of the app's text fields. Then it inserts the object as a document of type "word" in the "words" index. The last bit of code refreshes the index so the word is available for the next display call.

from flask import request

def handle_words():
    new_word = {"word": request.form['word'], "definition": request.form['definition']}

    # Insert the new word into the index
    es.index(index="words", doc_type="word", body=new_word)

    # Refresh the index so the new word is immediately searchable
    es.indices.refresh(index="words")
    return ('', 204)

Having to parse search results will be a recurring theme in many of the examples, including our next and final stop in this article: PostgreSQL.

PostgreSQL

The Code: The full example code is in the postgresql-example folder of the Compose Grand Tour/Python repository.

The Driver: psycopg2 is the most popular Python driver for PostgreSQL.

The Certificate: PostgreSQL deployments on Compose come with SSL/TLS enabled and are backed with self-signed certificates. To use certificate verification, save a copy of the certificate locally and keep its path handy. The certificate is on the Overview page of the deployment, and the path to the local copy should go in an environment variable PATH_TO_POSTGRESQL_CERT.

The Connection: To get the information psycopg2 needs to connect, copy one of the connection strings with authentication from the Connection Info section of your PostgreSQL deployment and put it in an environment variable COMPOSE_POSTGRESQL_URL. Like the previous example, the connection string is parsed and the relevant pieces passed to the driver. To ensure use of SSL and certificate verification, sslrootcert is given the path to the certificate (set in the environment) and sslmode is set to 'require'.

The final piece specifies the database to use: again, for the purpose of this example (and the PostgreSQL grand tour examples in other languages), we connect to a database named 'grand_tour'. Also included here is the code for the top-level page display, which contains the SQL to create the "words" table (should it not already exist) with appropriate columns for word, definition, and an id.

import os
from urllib.parse import urlparse
import psycopg2

# connection string and db connection initialization
compose_postgresql_url = os.environ['COMPOSE_POSTGRESQL_URL']
path_to_postgresql_cert = os.environ['PATH_TO_POSTGRESQL_CERT']
parsed = urlparse(compose_postgresql_url)

conn = psycopg2.connect(
    host=parsed.hostname,
    port=parsed.port,
    user=parsed.username,
    password=parsed.password,
    sslmode='require',
    sslrootcert=path_to_postgresql_cert,
    database='grand_tour')

# top-level page display; creates the table if it doesn't exist
@app.route('/')
def serve_page():
    cur = conn.cursor()
    cur.execute("""CREATE TABLE IF NOT EXISTS words (
        id serial primary key,
        word varchar(256) NOT NULL,
        definition varchar(256) NOT NULL) """)
    conn.commit()  # commit the DDL so the table persists
    return render_template('index.html')

The Read: PostgreSQL being a traditional RDBMS, the words are stored in a table, using a 'word' column and a 'definition' column to relate the two by row. This is handy to keep in mind as we parse and aggregate the results into JSON for our app.

To display all the words currently stored in the database, first select the word and definition columns from the table and use fetchall() to retrieve the rows. The cursor object also keeps information about the query it just executed: the column names "word" and "definition" are accessible from its description attribute.

To make the list that the application uses to display all the current entries, zip each row together with the column names to form a Python dict, add it to a list, and then JSON-ify and return the whole list.

import json

def display_select():
    cur = conn.cursor()

    # SQL query for all the rows in the table; stores the rows in an object
    cur.execute("""SELECT word, definition FROM words""")
    cursor_obj = cur.fetchall()

    # grab the column names from the cursor's description
    labels = [column[0] for column in cur.description]

    # zip each row with the column names to build a dict, and collect them in a list
    results_list = []
    for row in cursor_obj:
        results_list.append(dict(zip(labels, row)))

    # make JSON from the list for display on the page
    return json.dumps(results_list)
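To make the zip step concrete, here's a tiny worked illustration with hypothetical values:

labels = ['word', 'definition']              # from cur.description
row = ('ersatz', 'an inferior substitute')   # one row from fetchall()

print(dict(zip(labels, row)))
# {'word': 'ersatz', 'definition': 'an inferior substitute'}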

The Write: The function to add new words to the table is a pretty straightforward INSERT of the values taken from the text fields on the web form. The transaction is then committed to the database.

from flask import request

def handle_words():
    cur = conn.cursor()

    # SQL query to insert the page form values
    cur.execute("""INSERT INTO words (word, definition)
        VALUES (%s, %s)""", (request.form['word'], request.form['definition']))
    conn.commit()
    return ('', 204)

Part 1 Wrap-up

That concludes the first part of the Grand Tour in Python, featuring connections (and query result parsing!) for MongoDB, Elasticsearch, and PostgreSQL. In Part 2 of the tour, we cover connections to Redis, RethinkDB, and RabbitMQ. That is then followed by the third and final part, covering connections for Scylla and Compose for MySQL.

