Document Validation in MongoDB By Example

In this article, we'll explore MongoDB document validation by example using an invoice application for a fictitious cookie company. We'll look at some of the different types of validation available in MongoDB, and provide a practical working example of validations in action.

Document validation was introduced in MongoDB 3.2 and gives developers a way to control the type of data being inserted into their MongoDB instances. We like to show rather than tell, so we'll use a practical example to demonstrate basic validations and the commands used to add them to MongoDB.

Document Validation in a Nutshell

Document databases are a flexible alternative to the pre-defined schemas of relational databases. Each document in a collection can have a unique set of fields, and those fields can be added to or removed from documents after they are inserted. This makes document databases, and MongoDB in particular, an excellent way to prototype applications. However, this flexibility is not without cost, and the most underestimated cost is predictability.

Since the fields stored in a document can differ from one document to the next, developers lose the ability to make assumptions about the data stored in a collection. This can have major implications in your applications: if a transaction in a finance application were inserted with the wrong fields, it could throw off calculations and reports that are vital to the business.

Developers accustomed to relational databases recognize the importance of predictability in data formats, and that's one of the reasons that validation was introduced in MongoDB 3.2. Let's see how document validation works by making an application that uses it.

Creating the Data Models

As a demonstration, we're going to model a fictitious cookie company. We'll use this as an example since the entities in this kind of business can be generalized to apply to other businesses. In this case, we'll simplify to three main data entities:

  1. A Customer, which represents a person making a purchase
  2. A Product, which represents an item being sold
  3. A Transaction, which represents the purchase of a number of products by a customer.

Since this is a trivial example, let's build these out in minimal form. In practice, you can make your data entities as complex as you need.

Customer

The Customer entity represents someone making a purchase, so we'll include the kind of data typically stored about a customer. A typical Customer document might look like the following:

{
  "id": "1",
  "firstName": "Jane",
  "lastName": "Doe",
  "phoneNumber": "555-555-1212",
  "email": "Jane.Doe@compose.io"
}

Once we know what properties we'll want to include, we'll need to determine what types of validations we'd like to do on this entity.

The first step in adding validation is to figure out exactly what we'd like to validate. We can validate any of the fields in a collection and can validate based on the existence of a field, data type and format in that field, values in a field, and correlations between two fields in a document.
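As a rough sketch of what each kind of check can look like, the object below pairs each category with a MongoDB query expression, which is the form a validator takes. The status field and its allowed values are hypothetical and not part of this article's entities, and the $or expression is only one way to read "correlations between two fields".

```javascript
// Each entry is a query expression of the kind a validator can hold.
// The "status" field and its allowed values are hypothetical.
const validatorSketches = {
  // Existence: the field must be present.
  existence: { email: { $exists: true } },
  // Data type: the field must hold a BSON string.
  dataType: { firstName: { $type: "string" } },
  // Format: the value must match a regular expression.
  format: { phoneNumber: { $regex: /^[0-9]{3}-[0-9]{3}-[0-9]{4}$/ } },
  // Values: the value must come from a fixed set.
  values: { status: { $in: ["active", "inactive"] } },
  // Correlation: at least one of two contact fields must be present.
  correlation: {
    $or: [{ phoneNumber: { $exists: true } }, { email: { $exists: true } }]
  }
};

console.log(Object.keys(validatorSketches));
```

Expressions like these can be combined with $and, as the collection definitions later in this article do.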

In the case of the Customer entity, we'd like to validate that every field is present and holds a string, and that the phone number follows the 555-555-1212 format.

We can represent these validations in an intermediate format (before putting them into the database) using the JSONSchema spec. While JSONSchema isn't a necessary step for doing validations in MongoDB, it's helpful to codify our rules in a standard format, and JSONSchema is quickly gaining traction for server-side validations.

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "id": {
      "type": "string"
    },
    "firstName": {
      "type": "string"
    },
    "lastName": {
      "type": "string"
    },
    "phoneNumber": {
      "type": "string",
      "pattern": "^[0-9]{3}-[0-9]{3}-[0-9]{4}$"
    },
    "email": {
      "type": "string"
    }
  },
  "required": [
    "id",
    "firstName",
    "lastName",
    "phoneNumber",
    "email"
  ]
}

Using JSONSchema also allows us to re-use the same validations on the application side, for example with RESTHeart's JSONSchema validation.
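To make the application-side reuse concrete, here is a minimal sketch of a checker that understands only the keywords used in this article's schemas (type, pattern, and required). It is a hand-rolled illustration, not RESTHeart's implementation or a full JSONSchema validator, and checkDocument is a hypothetical name.

```javascript
// Minimal checker for the JSON Schema keywords used in this article:
// "type" (string, number, integer only), "pattern", and "required".
// Any other type name is reported as a mismatch. Not a full implementation.
function checkDocument(schema, doc) {
  const errors = [];

  // Every field listed under "required" must be present.
  for (const field of schema.required || []) {
    if (!(field in doc)) {
      errors.push(field + " is required");
    }
  }

  // Check type and pattern for each declared property that is present.
  for (const [field, rule] of Object.entries(schema.properties || {})) {
    if (!(field in doc)) continue;
    const value = doc[field];

    const typeOk =
      rule.type === undefined ||
      (rule.type === "string" && typeof value === "string") ||
      (rule.type === "number" && typeof value === "number") ||
      (rule.type === "integer" && Number.isInteger(value));
    if (!typeOk) {
      errors.push(field + " must be of type " + rule.type);
      continue;
    }

    if (rule.pattern !== undefined && !new RegExp(rule.pattern).test(value)) {
      errors.push(field + " does not match " + rule.pattern);
    }
  }

  return errors;
}

// The Customer rules from above, trimmed to the keywords the checker supports.
const customerSchema = {
  type: "object",
  properties: {
    id: { type: "string" },
    firstName: { type: "string" },
    lastName: { type: "string" },
    phoneNumber: { type: "string", pattern: "^[0-9]{3}-[0-9]{3}-[0-9]{4}$" },
    email: { type: "string" }
  },
  required: ["id", "firstName", "lastName", "phoneNumber", "email"]
};

// A customer with no email and a phone number missing its dashes.
const errors = checkDocument(customerSchema, {
  id: "1",
  firstName: "Jane",
  lastName: "Doe",
  phoneNumber: "5555551212"
});
console.log(errors);
```

For this sample customer the checker reports two problems: the missing email and the malformed phone number. In a real application, an established JSONSchema library would likely be a better choice than a hand-written checker.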

Product

Just as we did above with the Customer entity, let's take a look at what an example Product entity might contain:

{
  "id": "1",
  "name": "Chocolate Chip Cookie",
  "listPrice": 2.99,
  "sku": 555555555,
  "productId": "123abc"
}

We'll codify our validations in JSONSchema format as well:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "id": {
      "type": "string"
    },
    "name": {
      "type": "string"
    },
    "listPrice": {
      "type": "number"
    },
    "sku": {
      "type": "integer"
    },
    "productId": {
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "listPrice",
    "sku",
    "productId"
  ]
}

Transaction

The last entity we'll use in our fictitious cookie shop is a transaction. A transaction represents a single purchase of one or more products by one customer (many-to-one relationship). An inserted transaction record might look like the following:

{
  "id": "1",
  "productId": "1",
  "customerId": "1",
  "amount": 20.00
}

Lastly, we'll codify the validations we want in JSONSchema format:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "id": {
      "type": "string"
    },
    "productId": {
      "type": "string"
    },
    "customerId": {
      "type": "string"
    },
    "amount": {
      "type": "number"
    }
  },
  "required": [
    "id",
    "productId",
    "customerId",
    "amount"
  ]
}

Now that we have the structure and validation rules for our application, let's add these validation rules to our MongoDB database.

Adding Validation Rules

First, let's spin up a new MongoDB deployment on Compose and create a new database for our cookie shop. Be sure to add a database user so we can connect to the database after this step. We'll create our collections using the mongo command line application, which you can install for your platform.

Once you've installed the mongo command line application, created a new database, and added a database user, it's time to create your collection through the mongo command line tool. Open a terminal and type the following:

mongo mongodb://dbuser:secret@aws-us-east-1-portal.8.dblayer.com:15234/cookieshop  

This will load up the interactive mongo shell. Now, let's create our collections in the database with the validations we determined earlier. We'll start with the Customer collection:

> db.createCollection("customers", {
  validator: {
    $and: [
      {
        "firstName": {$type: "string", $exists: true}
      },
      {
        "lastName": { $type: "string", $exists: true}      
      },
      {
        "phoneNumber": { 
          $type: "string", 
          $exists: true,
          $regex: /^[0-9]{3}-[0-9]{3}-[0-9]{4}$/
        }
      },
      {
        "email": {
          $type: "string",
          $exists: true
        }
      }
    ]
  }
})

We only check that email is present and is a string; validating its format can be a bit complicated for a trivial example, so we'll leave that alone for now. Next, let's add our products collection and validations:

> db.createCollection("products", {
  validator: {
    $and: [
      {
        "name": {$type: "string", $exists: true}
      },
      {
        "listPrice": { $type: "double", $exists: true}      
      },
      {
        "sku": { $type: "int", $exists: true}
      }
    ]
  }
})

Finally, we'll add our transactions collection which contains a reference to documents in the products and customers collections:

db.createCollection("transactions", {  
  validator: {
    $and: [
      {
        "productId": {$type: "objectId", $exists: true}
      },
      {
        "customerId": { $type: "objectId", $exists: true}      
      },
      {
        "amount": { $type: "double", $exists: true}
      }
    ]
  }
})

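The objectId type is the BSON type MongoDB assigns to each document's _id field by default. Storing another document's _id lets us reference documents in other collections, so here we require productId and customerId to be ObjectIds rather than the plain strings used in the JSONSchema sketches. In our case, we'll use them to associate a specific product and customer with a transaction.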
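The collections above are created with their validators already in place. For a collection that already exists, MongoDB 3.2 also provides the collMod command, along with validationLevel and validationAction options that control how strictly the rules are applied. The helper below is only a sketch: buildCollMod is a hypothetical name, and the document it returns is meant to be passed to db.runCommand in the mongo shell.

```javascript
// Hypothetical helper: builds a collMod command document that attaches a
// validator to an existing collection.
function buildCollMod(collectionName, validator) {
  return {
    collMod: collectionName,
    validator: validator,
    // "strict" applies the rules to all inserts and updates;
    // "moderate" skips updates to existing documents that are already invalid.
    validationLevel: "strict",
    // "error" rejects invalid writes; "warn" logs them but lets them through.
    validationAction: "error"
  };
}

const command = buildCollMod("products", {
  $and: [
    { name: { $type: "string", $exists: true } },
    { listPrice: { $type: "double", $exists: true } },
    { sku: { $type: "int", $exists: true } }
  ]
});

// In the mongo shell: db.runCommand(command)
console.log(JSON.stringify(command, null, 2));
```

Switching validationAction to "warn" can be a gentler way to introduce rules on a collection that already holds data, since invalid writes are logged rather than rejected.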

Testing Validations

Now, it's time to test our validations to make sure they worked out. We'll start by adding a new customer:

db.customers.insertOne({  
  firstName: "John",
  lastName: "O'Connor",
  phoneNumber: "555-555-1212"
});

Notice that we've omitted the email field from our customer, which was marked as required when we set up our validations. If we set up the validations correctly, we'd expect the insertion to fail, which it does:

2017-02-09T12:45:36.714-0800 E QUERY    [thread1] uncaught exception: WriteError({  
    "index" : 0,
    "code" : 121,
    "errmsg" : "Document failed validation",
    "op" : {
        "_id" : ObjectId("589cd4f06ca2fef0f7737fb9"),
        "firstName" : "John",
        "lastName" : "O'Connor",
        "phoneNumber" : "555-555-1212"
    }
}) :
undefined  
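Application code can tell a validation failure apart from other write errors by its error code, 121 in the output above. Here is a minimal sketch, assuming the error reaches your code as an object with a numeric code property, as the shell's WriteError does; isValidationFailure is a hypothetical helper name.

```javascript
// Error code reported in the shell output above for "Document failed validation".
const DOCUMENT_VALIDATION_FAILURE = 121;

// Hypothetical helper: true when an error object carries the validation code.
function isValidationFailure(err) {
  return Boolean(err) && err.code === DOCUMENT_VALIDATION_FAILURE;
}

// Stand-in for the error shown above; a real one would come from the driver.
const sampleError = { index: 0, code: 121, errmsg: "Document failed validation" };

console.log(isValidationFailure(sampleError));     // true
console.log(isValidationFailure({ code: 11000 })); // false
```

Note that the 3.2 error message does not say which rule failed, so the application may need to re-check the document itself to produce a useful message.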

Once we add the email field to the customer, the validation passes and the new customer is inserted:

> db.customers.insertOne({ 
    firstName: "John", 
    lastName: "O'Connor", 
    phoneNumber: "555-555-1212", 
    email: "john@compose.io"
    });
{
    "acknowledged" : true,
    "insertedId" : ObjectId("589cd56b6ca2fef0f7737fbc")
}

The acknowledged field in the response lets us know that the customer was inserted correctly. Save the insertedId for later, as we're going to use it when we make a new transaction.

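Now, let's add a product. The mongo shell stores plain numbers as doubles, so we wrap the sku in NumberInt() to satisfy the int check in our validator:

db.products.insertOne({
  name: "Chocolate Chip",
  listPrice: 2.99,
  sku: NumberInt(1)
});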

Again, make sure to keep track of the insertedId so we can use it while making a transaction.

Finally, let's add a transaction in which our new customer purchases our new product:

db.transactions.insertOne({  
  productId: ObjectId("589cd9216ca2fef0f7737fc4"),
  customerId: ObjectId("589cd56b6ca2fef0f7737fbc"),
  amount: 2.99
});

Wrapping Up

While document validations aren't necessarily desirable in all scenarios, they provide developers with a more robust set of options when deciding where they want to place the responsibility for data integrity within their applications. In this article, we demonstrated how to create collections that have validations in MongoDB to ensure our data has a predictable format and set of fields. In the next article, we'll use that predictability with MongoDB aggregations to gain insights into our fictitious business by mining data in our database.


John O'Connor
John O'Connor is a code junky, educator, and amateur dad that loves letting the smoke out of gadgets, turning caffeine into code, and writing about it all.