The Coming of the MongoDB Storage Engines

We're around halfway to MongoDB 2.8, at least in terms of version numbers, with the next development version, 2.7.6, imminent. It's also been some months since MongoDB's Eliot Horowitz outlined what was coming in MongoDB 2.8, so now is a good time to review how some of those features are coming along and what we can expect to be delivered.

The new feature with the most potential is pluggable storage engines. Pluggable storage engines mean that the boundary between the database and the way it stores its files becomes rigorously defined, so that developers can replace the way data is stored with an entirely different system.

In MongoDB's case, it has historically used memory-mapped files for its on-disk storage. This technique hands responsibility for pulling pages of data into memory, and pushing them back out, to whatever operating system is running MongoDB. It's easy to implement, but it does mean that the limitations, nuances and variations of the operating system bubble up into MongoDB. That said, the folks at MongoDB have hammered away at it to get it to run as efficiently as possible, and that memory-mapped storage system will be the default engine loaded by MongoDB 2.8.

In-memory databases have been making their mark, and the first thing that pluggable storage engines allow is for MongoDB to be turned into one. With no disk persistence, the database is freed from the slow I/O of spinning media, or the faster I/O of SSDs, and gets to whiz along in main memory. You do, of course, need sufficient main memory and an architecture built with the transient nature of the database in mind. Although there is an in-memory storage plugin, we were unable to get it up and running for this article, but it is expected to be an option when MongoDB 2.8 arrives.

What's more interesting is the possibility of plugging in a completely different storage system. The one being actively developed for MongoDB 2.8 is RocksDB. You may not know RocksDB, but it has quite the genealogy: it started life at Facebook, borrows code from Google's "leveldb" project and borrows ideas from Apache's HBase. The idea driving it is to create a high-performance key/value store that makes good use of the high read/write rates you can get from SSDs and RAM-based storage. There are lots of configuration settings which let it be tuned to the media it's working with and to whether compression algorithms should be applied to the data. There's a lot of potential there, and it's a logical choice as the first alternative storage engine for MongoDB to implement.

Building with RocksDB

But we are only halfway through development. As of the most recent development release, MongoDB 2.7.5, the RocksDB plugin has not been marked as ready for testing. If you want to build it yourself, here are the steps we had to go through on Mac OS X 10.9. We're assuming you already have the appropriate Xcode and command line tools installed...

First, we needed to clone the GitHub repositories for MongoDB and for RocksDB:

$ git clone https://github.com/mongodb/mongo.git
$ git clone https://github.com/facebook/rocksdb.git

Then we needed to build RocksDB.

$ cd rocksdb
$ make static_lib
$ cp librocksdb.a /usr/local/lib
$ cp -r include/* /usr/local/include
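
Note that, depending on your permissions, those copies into /usr/local may need to be run with sudo. If you want to check that the library and headers have landed where the MongoDB build should look for them:

$ ls /usr/local/lib/librocksdb.a
$ ls /usr/local/include/rocksdb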

With RocksDB built, we can now move on to building MongoDB. This uses the SCons build tool:

$ cd ..
$ cd mongo
$ scons --rocksdb --libc++ --osx-version-min=10.7 mongo mongod

You'll want to go make a cup of coffee while that does its thing. You can speed it up by passing "-j n" as an argument, which kicks off parallel builds using "n" simultaneous jobs at a time. You can set that to slightly more than the number of cores your system has, to ensure a job is always queued up; see the example below. You'll still have plenty of time for that coffee anyway.
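
On a four-core machine, for example, that might look like this (the job count is just our choice):

$ scons -j 6 --rocksdb --libc++ --osx-version-min=10.7 mongo mongod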

When it's completed building, you'll have mongo and mongod binaries in the directory. Remember NOT to install them. They are development builds, and within them lie horrors, bugs and potential brokenness. For your experiments, you should make a local data directory and run the commands from within the mongo directory.

$ mkdir ./data
$ ./mongod --dbpath ./data --storageEngine=rocksExperiment

Just to remind you how experimental this all is, even the name of the storage engine has the word Experiment in it. The server should start running and listening for connections. In another session or window, cd to the same directory and you should be able to run

$ ./mongo

and interact with the database...
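
For a quick smoke test, try inserting and querying a document; the collection and field names here are just ours. Recent development builds should also report which engine is in use via serverStatus(), though exactly where that appears may shift while 2.8 is still in flux:

> db.addresses.insert({ street: "1 Example Way", city: "Exampleville" })
> db.addresses.find()
> db.serverStatus().storageEngine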

So, now you've built it, you'll probably want to benchmark it. We leave it to you to come up with a benchmark that matches your own needs, but don't expect huge performance improvements from Rocks right now. In our "bulk insert a million address records into a completely empty database" test, a very not-a-benchmark test, the task took around 55 seconds with the traditional memory-mapped files. With RocksDB as the storage engine, that went up to 98 seconds. Not unexpected, especially as the RocksDB storage engine hasn't been marked ready for testing and our test is far from comprehensive. In standalone benchmarks and other comparisons, RocksDB shows excellent performance, so there's plenty of power for MongoDB to exploit.
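
A test along those lines can be knocked up in a few lines of the mongo shell, run once against a mongod started with no --storageEngine option and once against the rocksExperiment engine. Here's a rough sketch using the Bulk API that arrived in MongoDB 2.6; the field names and batch size are purely illustrative, not what we actually ran:

// Time the insertion of a million documents, batched 1,000 at a time
var start = new Date();
for (var i = 0; i < 1000; i++) {
    var bulk = db.addresses.initializeUnorderedBulkOp();
    for (var j = 0; j < 1000; j++) {
        var n = i * 1000 + j;
        bulk.insert({ name: "Person " + n, street: n + " Example Street", city: "Sampletown" });
    }
    bulk.execute();
}
print("Elapsed: " + ((new Date() - start) / 1000) + " seconds");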

What the test does show is the progress being made in making pluggable storage engines for MongoDB work. When they do arrive in MongoDB 2.8, we should be able to start profiling them against a full spectrum of tests and see which ones suit which tasks, how resilient they are and what bugs and issues they shake out of the MongoDB woodwork. The in-memory and RocksDB plugins should be just the first in a range of MongoDB-developed and third-party storage plugins - InnoDB is an obvious candidate, as is TokuKV - but there's also the promise of integrating storage with distributed or journaled file systems. MongoDB is going to become a whole lot more flexible in the future.