From SQL to Scylla - User and Session Management

Published

In this Write Stuff article, Kirk Morales looks at how SQL database techniques can translate over to NoSQL databases like Scylla, starting with user and session management.

The modern advantages of NoSQL databases like ScyllaDB often come with few learning curves and a change in mentality. The basic, widely-used concepts of product architecture remain the same but need to be approached with a new conceptual mindset.

User and session management is the cornerstone of every product. Without it, the application has no idea who's performing what tasks, what roles are to be enforced, and what actions are permitted as a result. Products must be secured and proper user management is the guardian.

Here, we'll look at the comparisons between how basic user management in Scylla differs from the traditional approach in SQL.

Start with the schema

Let's start with the user's table first. This is the simplest as the base tables in both SQL and Scylla are very similar:

SQL

CREATE TABLE users (  
  user_id MEDIUMINT NOT NULL AUTO_INCREMENT,
  email varchar(255),
  password_hash varchar(255),
  PRIMARY KEY (user_id)
);

We create a table that stores a user's ID, email address (which they'll use when logging in), and a hash of their password. The user ID is auto-generated when we do a new insert and is the primary key of the table:

INSERT INTO users (email, password_hash)  
VALUES ('john@doe.com', '5f4dcc3b5aa765d61d8327deb882cf99');  

ScyllaDB

CREATE TABLE users (  
  user_id uuid,
  email text,
  password_hash text,
  PRIMARY KEY (user_id)
);

You'll notice this looks largely similar to the SQL version, except for the different data types. Also, ScyllaDB doesn't support auto-increment columns. Instead, we store the user_id as a uuid, which we supply when we create the entry:

INSERT INTO users (user_id, email, password_hash)  
VALUES (74d6b8a0-fc78-4815-acda-7a0eb076b617, 'john@doe.com', '5f4dcc3b5aa765d61d8327deb882cf99');  

Having our primary key on the user_id column means we must provide it in every query to the table. We'll address how to handle email lookups in a bit.

User lookup

When a user logs in with their email and password, we first need to do a simple lookup to make sure the password provided matches the email address. If it does, the user has successfully authenticated. This works fine in SQL, where we can do a single query to validate:

SELECT user_id, password_hash FROM users WHERE email = 'john@doe.com';  

If we get a result, we can test the password hash and, if it matches, we now have the ID of the user that successfully authenticated.

Validate a user with ScyllaDB

With ScyllaDB, this lookup gets a bit more complex. Since our primary key is on the user_id, we can't do a lookup on that table without it. We need a way to look up users by their email address.

Why not make email the primary key? This may be your knee-jerk thought - why not just lose the user_id altogether, key off of the email, and use that to do the lookup? Sure, this would work fine, however it compromises the long-term maintainability of our data. As we begin to reference this user across other tables, we would then rely on mutable data (the email) as keys. With Scylla, we can not change the primary key of a row once created. So, if the email changed, we'd have to re-write any rows across the entire database where it's used. Add in the user_id, a static value representing the same user, even if their email changes, solves this.

Create email lookup table

To solve this email lookup issue, we simply create a separate table that allows us to look up a user_id by an email:

CREATE TABLE emails (  
  email text PRIMARY KEY,
  user_id uuid
);

When we create a new user, we now make two inserts:

INSERT INTO users (user_id, email, password_hash)  
VALUES (74d6b8a0-fc78-4815-acda-7a0eb076b617, 'john@doe.com', '5f4dcc3b5aa765d61d8327deb882cf99');

INSERT INTO emails (email, user_id)  
VALUES ('john@doe.com', 74d6b8a0-fc78-4815-acda-7a0eb076b617);  

To validate a user, we make two reads. The first lookups up the user_id by email:

SELECT user_id FROM users where email = 'john@doe.com';  

Once we have that, we can get the password and validate it:

SELECT password_hash from users where user_id = 74d6b8a0-fc78-4815-acda-7a0eb076b617;  

Now, you may be thinking that this is terribly inefficient, however with high availability and low latency, Scylla is designed to handle these types of well-indexed requests quickly.

An alternative Scylla schema for email lookups

If you'd prefer not to make two queries for authentication, we can easily reduce this to one by storing the password redundantly in the emails table. In this case, we would simply update both the users and emails table each time either a user's email or password changes and our emails table would look like this:

CREATE TABLE emails (  
  email text PRIMARY KEY,
  user_id uuid,
  password_hash text
);

Tracking user's sessions

Once a user is validated, we may need to track information about their session, such as the expiration time, browser details, etc. To do this, let's define a basic sessions table to store this data in.

CREATE TABLE sessions (  
  session_id uuid,
  user_id uuid,
  expires timestamp,
  PRIMARY KEY (session_id, user_id)
);

Here, we create a compound primary key on both session_id and user_id, requiring both as an extra measure of protection so that would-be assassins can't try to get session data without knowing the if the associated user and vice versa.

After validating a user, we assign a session_id, create a record in the sessions table, and return both the user_id and session_id to the client, likely using some form of signed value that can be validated itself later. In subsequent requests, after validating that the user_id and session_id are trusted, we can make a request from the sessions table to get the session data:

SELECT * FROM sessions WHERE session_id = SESSION_ID and user_id = USER_ID;  

Contrasting with SQL

The method for session management in this way is not so different with SQL. We'd still need a separate sessions table, would need to have that session ID and the user ID pass between the client and server, and would make a single read with each request to get session data. With Scylla, however, the average latency of our reads are likely to be faster with higher availability.

Wrap up

When designing a schema in Scylla, often a good place to start is with how the schema would be designed for SQL. From there, you can determine what data types need to be changed and what lookups may need to be performed on non-primary key data.

Although there are some trade-offs with using Scylla for basic programming concepts like user management, there are many pros to help outweigh the possible inefficiencies:

Kirk is CEO at [Hyver](https://www.hyverlabs.com) and a long-time SaaS engineer who loves working with startups, the outdoors, and relaxing family/dog time.


This article has been published as part of Compose's Write Stuff program - if you want to write about databases or share your experiences with database technology, we invite you to participate.

attribution Joe Hernandez

This article is licensed with CC-BY-NC-SA 4.0 by Compose.

Conquer the Data Layer

Spend your time developing apps, not managing databases.