How We Scale MongoDB

We recently launched Elastic Deployments for MongoDB, which represent the best method we've found for scaling customers' MongoDB datasets appropriately. The word "appropriate" is, well, appropriate when considering scaling. A 10GB MongoDB dataset has different requirements than a 10TB dataset, which has different requirements than a 10PB dataset. There are no hard rules for when to move these datasets to more complex scaling methods.

It can be damaging (and technically and emotionally painful) to build for the wrong scale -- planning too big is the most common mistake, and it means a developer has to give up nice features of the database that can help simplify or speed up development. This is closely related to the "premature optimization" arguments that are made on the application side.

First, we scale vertically

MongoDB, like most databases, craves RAM and IO capacity. It sometimes likes CPU. The simplest conceptual way of scaling performance for a MongoDB dataset is to give it more system resources without worrying about spreading the load across servers. Normally, this is painful for cost or operational reasons. Doubling the capacity of a production MongoDB replica set means swapping larger servers in, figuring out what to do with the old ones, and hoping the new ones are just the right size to keep things healthy for a long period of time.

We tackled this problem by buying the biggest reasonable commodity servers around, running customer databases in isolated containers, and allocating system resources on the fly. When an Elastic Deployment dataset grows, we give it more RAM and IO capacity in proportions that work well for the bulk of the production databases we've managed. This is an incredibly effective way to scale most MongoDB datasets that are less than 250GB in size.
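To make the idea of growing RAM and IO in proportion to the dataset concrete, here is a minimal Python sketch. The ratios, the floor values, and the names (`Allocation`, `allocate`) are made up for illustration; they are not the proportions we actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    ram_mb: int
    iops: int


def allocate(dataset_gb: int,
             ram_mb_per_gb: int = 100,   # hypothetical ratio
             iops_per_gb: int = 5,       # hypothetical ratio
             min_ram_mb: int = 1024,     # hypothetical floor
             min_iops: int = 100) -> Allocation:
    """Scale RAM and IO capacity linearly with dataset size, with a floor."""
    if dataset_gb < 0:
        raise ValueError("dataset_gb must be non-negative")
    return Allocation(
        ram_mb=max(min_ram_mb, dataset_gb * ram_mb_per_gb),
        iops=max(min_iops, dataset_gb * iops_per_gb),
    )
```

Calling `allocate` again as the dataset grows yields a larger allocation without any change of servers, which is the point of doing this inside a container on a big host.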

Then, we scale vertically again

Sadly, MongoDB itself will usually become a bottleneck before the capacity of a server is exhausted. Write lock is almost always the biggest problem (though there are practical limits to how much IO capacity a single MongoDB process can take advantage of). When it is clear that write lock could become a factor, we gracefully convert databases to a setup we call "core sharding" -- running multiple MongoDB shards on the same physical (high-availability replica set) servers, up to one shard per CPU core. This allows MongoDB to better utilize IO, for instance, when otherwise it would be bounded by write lock.
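The "up to one shard per CPU core" rule can be sketched as a small planning function. This is only an illustration: the function name, the placeholder host name, and the port numbering are assumptions, and a real conversion involves config servers, routers, and data migration that are not shown here.

```python
import os
from typing import List, Optional


def plan_core_shards(requested_shards: int,
                     cpu_cores: Optional[int] = None,
                     host: str = "db-host",        # placeholder host name
                     base_port: int = 27018) -> List[str]:
    """Return one address per shard process, capped at one shard per core."""
    if requested_shards < 1:
        raise ValueError("requested_shards must be at least 1")
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    shard_count = min(requested_shards, cores)
    # Each shard is a separate MongoDB process on the same server,
    # distinguished here only by its port.
    return [f"{host}:{base_port + i}" for i in range(shard_count)]
```

For example, asking for 16 shards on an 8-core server yields 8 addresses, because the cap applies.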

Fortunately, we can handle most datasets up to 2.5TB on the same group of physical servers. I say fortunately because keeping a MongoDB setup operationally compact limits the potential failure modes of a production cluster. In particular, networking over the loopback adapter is fast and reliable. Layer 2 networking (even in the same local network) is a bit slower and a bit more susceptible to chaos. Degraded network performance can manifest itself as a heisenbug in a freshly-sharded MongoDB application, so we just avoid the network when it's easy and give customer applications and datasets time to mature before they need to take the next step.

Quite a bit of what we do is try to limit chaos on behalf of our customers, so when we do core shard, we spend as much time as necessary to make sure a customer's shard key decisions are the right ones for a given dataset. (As an aside, we will continue to be skeptical that good shard key decisions can be made when you are just starting out with a data set.)

Choosing the wrong shard keys for a MongoDB cluster is a nicely embossed invitation for chaos to bring a guest to a database party. Core sharding is our best opportunity to intervene and apply our experience in a way that minimizes party fouls and allows our customers to gracefully implement intelligent strategies. Each one is different ... there is no "one way" to do it and the more you take time to understand your data, the better your decisions will be.
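One common way a shard key goes wrong is a steadily increasing key under range-based sharding: every new write lands on the last range. The Python simulation below illustrates that effect next to a hashed key. It is a simplified model, not MongoDB's implementation; the boundaries, the key values, and the modulo-based hashing are assumptions chosen for the example.

```python
import bisect
import hashlib
from collections import Counter
from typing import Callable, Iterable, List


def range_shard(key: int, boundaries: List[int]) -> int:
    """Shard i holds keys below boundaries[i]; the last shard holds the rest."""
    return bisect.bisect_right(boundaries, key)


def hashed_shard(key: int, shard_count: int) -> int:
    """Spread keys by hashing them (simplified: hash modulo shard count)."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def write_distribution(keys: Iterable[int],
                       assign: Callable[[int], int]) -> Counter:
    """Count how many writes each shard receives."""
    return Counter(assign(k) for k in keys)


# Boundaries were set from older data; new writes use ever-larger keys.
new_keys = range(1000, 2000)
by_range = write_distribution(new_keys, lambda k: range_shard(k, [250, 500, 750]))
by_hash = write_distribution(new_keys, lambda k: hashed_shard(k, 4))
```

In this model `by_range` puts all 1,000 writes on the last shard, while `by_hash` spreads them across all four. A hashed key is not automatically the right answer (it gives up efficient range queries on that key), which is why the decision depends on understanding the data.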

Then, we scale out

At this point, scaling out MongoDB is easy. A well-built, sharded MongoDB dataset is easy to reason about and will scale linearly across other servers. For the most part, our customers don't know or care when physical horizontal scale comes into play. When a customer dataset is "mature" and growing, we migrate core shards to other servers behind the scenes as needed to provide the right amount of scale.
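The three stages described so far can be summarized in a rough decision sketch. The 250GB and 2.5TB figures come from the sections above, where they are described as covering "most" datasets, so treat them as rules of thumb rather than hard limits. The names `Stage` and `suggest_stage` are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    VERTICAL = "more RAM and IO for a single MongoDB process"
    CORE_SHARDING = "multiple shards on the same physical servers"
    SCALE_OUT = "shards migrated across additional servers"


def suggest_stage(dataset_gb: float, write_lock_bound: bool = False) -> Stage:
    """Suggest a scaling stage from dataset size and write lock pressure."""
    if dataset_gb > 2500:      # rule of thumb: beyond roughly 2.5TB
        return Stage.SCALE_OUT
    if dataset_gb >= 250 or write_lock_bound:
        return Stage.CORE_SHARDING
    return Stage.VERTICAL
```

Note that write lock pressure can push a small dataset into core sharding early, so size alone does not decide the stage.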

Then, Elastic Deployments make sharding awesome, too

The nice thing about having a way to scale a logical MongoDB process vertically is that we can apply it to the shards in a big cluster. Under extreme load, increasing horizontal capacity in a MongoDB cluster can be difficult because migrating data adds write load. With Elastic Deployments, ramping up the resources on each shard is a really nice way to buy time to get past load spikes (or buy capacity to let MongoDB shard balancing do its thing).

We actually use this ability proactively, when possible. It's common for customers to hear ahead of time that they're going to get a massive increase in activity (an upcoming Google or Apple App Store promotion, for instance) and let us know. Our team simply adjusts resources behind the scenes. Once we've thrown hardware at the problem, we can either use the extra power to migrate data to new shards more quickly, or just use it to absorb the load directly.

Flexibility is a requirement of an evolving data set

MongoDB offers numerous features that make developers' lives easier. It also offers features for scale. Using the scaling features at the wrong time means compromising on developer-friendly features (unique constraints, oplog usefulness, capped collections). There is a great deal of pressure on developers to use the MongoDB sharding features even when they're not necessary, which makes their lives worse in aggregate. The healthiest MongoDB setups started with developers using features that helped them move faster, and evolved as understanding of the problem scope and appropriate scale increased.

We help developers who use MongoDB make smart decisions, and we don't force them down a path before they even have a map. This, we have found, gives them the best likelihood of success.

This post was previously a badly thought-out attempt to talk generically about scaling databases vertically. Based on feedback from customers and smart people on Twitter, we felt it was best to tackle this subject in a much narrower scope rather than presenting badly communicated opinions without much substantiation.