In this new series, we'll look at each of the Compose databases, how they are configured, why we configure them the way we do at Compose and, most importantly, what that means for you. In this first part, we look at MongoDB.
MongoDB has a reputation for being easy for a developer to set up, and that's because you can quickly configure a small database. MongoDB has also been quite complicated to configure for bigger data, with its various scaling options. At Compose, we've been through many of the configurations and have evolved what we offer as a MongoDB database.
To understand the different MongoDB configurations, you have to start small. Specifically, at the smallest configuration.
One for developers
The smallest MongoDB configuration is the single instance of the mongod process. Mongod is the database process and getting one running on a machine is relatively uncomplicated. That makes it quick to install and easy for developers to get started with. All the user management is there and the database's admin data is stored in the database's admin database (this is important later). The thing is, though, that a single instance is also something you would never run in production. That's because when it's down, you're down. What you want are at least two Mongo database servers.
Two for production
The first configuration which can manage two mongod processes is the replica set. The two are bound together, with one as primary and one as secondary. When the primary goes down, the secondary becomes primary and, as long as your clients then reconnect to the new primary, you can carry on running. The admin data is replicated between the two servers like any other data.
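As a toy illustration of that failover, here is a minimal Python sketch. The class and member names are hypothetical, and it deliberately skips MongoDB's real election protocol, which involves voting among the members. It only shows why a client has to find the primary again after a failure.

```python
class ToyReplicaSet:
    """Toy model of a replica set: one primary, the rest secondaries."""

    def __init__(self, members):
        self.up = {name: True for name in members}  # member name -> is it running?
        self.primary = members[0]                   # first member starts as primary

    def fail(self, name):
        """Mark a member as down; promote a surviving member if it was the primary."""
        self.up[name] = False
        if name == self.primary:
            survivors = [m for m, alive in self.up.items() if alive]
            # With no survivors there is no primary at all: when it's down, you're down.
            self.primary = survivors[0] if survivors else None


rs = ToyReplicaSet(["mongod-a", "mongod-b"])  # hypothetical member names
old_primary = rs.primary
rs.fail(old_primary)
new_primary = rs.primary  # clients must reconnect here, not to old_primary
```

A client that keeps talking to `old_primary` after the failure is talking to a dead server, which is the reconnection burden the replica set puts on applications.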
The upside is that it is more resilient; the downside is that you are connecting directly to the database instances and the configuration doesn't support TLS. At Compose, we understand the pros and cons of this configuration because we built what we now call Compose MongoDB Classic with replica sets of two mongod processes. Well, almost; we added in an invisible backup node to handle backups without stopping the other servers.
We used our experience when we rebuilt the Compose platform and created a whole new class of Compose MongoDB.
The next big configuration of MongoDB is the sharded configuration. A shard takes a replica set and wraps it up with some other components. That shard is then plugged into a routing and configuration service. For client connections, the configuration adds one or more mongos (the S is for Shard) processes for the applications to connect to.
Queries go through the mongos process and into the routing and configuration service. That works out which shard should get the query and already knows which mongod instance inside the shard's replica set is primary, so the query goes to the right server process. This configuration relieves clients of the job of selecting which server in a replica set to connect to. The extra layers in the configuration mean it can play host to many, many shards.
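The routing step can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the shard names, the mongod names and the key ranges are invented, and a real cluster keeps this metadata in its configuration service rather than in module-level constants.

```python
from bisect import bisect_right

# Hypothetical cluster metadata, standing in for the routing and configuration service.
# Each shard owns shard-key values from its lower bound up to the next shard's bound.
LOWER_BOUNDS = ["", "h", "p"]
SHARDS = ["shard0", "shard1", "shard2"]
# Which mongod is currently primary inside each shard's replica set (invented names).
PRIMARIES = {"shard0": "mongod-0a", "shard1": "mongod-1b", "shard2": "mongod-2a"}


def route(shard_key):
    """Return the (shard, primary mongod) pair that should receive a query."""
    index = bisect_right(LOWER_BOUNDS, shard_key) - 1  # last bound <= shard_key
    shard = SHARDS[index]
    return shard, PRIMARIES[shard]
```

The client only ever calls something like `route("harry")` through the mongos; it never needs to know which server is primary.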
Or, we could use just one shard with all your data stored in its replica set. This is what we do at Compose. We adopted the architecture to isolate users from the connection switching that comes with a replica set. We went a step further, though, and made sure that we also had multiple mongos services for users to connect to, so that you aren't relying on a single server for access.
This configuration also allows us, if we want to, to support more shards, but sharding data is a very one-way, tricky affair and most users prefer to have their data in one database shard, un-sharded and vertically scaled.
Whatever the number of shards, they are all managed by that routing and configuration server. It also takes care of the database's administration data in its own, smaller, database which is effectively shared with all the shards. This is also why we advise people never to store data in the admin database. It's a bit like keeping all the library's books scribbled in the margins of the visitors' log - the data is recorded but you'll run out of space very quickly and not be able to let new visitors in.
There are some things we do which aren't part of MongoDB's configurations. All the various processes of Compose's MongoDB run inside a virtual private LAN, isolated from all other databases. We do this for all our databases. Our mongos processes are fronted by HAProxy processes which manage SSL termination and whitelisting. This becomes what we call a MongoDB portal. MongoDB portals are the only part of the database visible to the outside world. Behind the scenes, our management platform lets us (or you, via our web console) change various configurable elements, from the whitelists to the number of MongoDB portals, as well as resynchronize the replica set inside the shard or trigger an on-demand backup.
Although Compose for MongoDB has multiple portals, and with that multiple URLs, don't confuse it with a replica set when you are configuring your application. Compose is presenting you with multiple mongos servers. A good driver will be able to take a list of these and fail over between them.
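To make the difference concrete, here is a minimal sketch of building a connection string that lists several portals. The host names, ports, user and database are all hypothetical, and the `ssl=true` option is assumed because the portals terminate SSL. The key point is that the string carries no `replicaSet` option, since the hosts are mongos routers rather than replica set members.

```python
from urllib.parse import quote_plus


def mongos_uri(hosts, user, password, database):
    """Build a connection string that lists every mongos portal as a seed."""
    # Percent-encode credentials so characters like "@" or "/" stay valid in the URI.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    host_list = ",".join(f"{host}:{port}" for host, port in hosts)
    # No replicaSet option here: the portals are mongos routers, not set members.
    return f"mongodb://{credentials}@{host_list}/{database}?ssl=true"


uri = mongos_uri(
    [("portal-a.example.com", 10001), ("portal-b.example.com", 10002)],  # hypothetical
    "appuser",
    "p@ss/word",
    "mydb",
)
```

A driver given a string like this can try the second portal when the first is unreachable.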
One thing to bear in mind about this configuration is that the portal you connect to is, underneath, a mongos proxy for your data. There are various restrictions on what you can do with the proxy. Some commands, like the MongoDB copyDB, aren't supported. Others will need you to ask for discovery to take place inside the database, like the mongostat command, which will by default report on the mongos portal, but with --discover will report on the database servers in the shard's replica set.
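As a rough sketch of the two ways of calling mongostat, the helper below assembles the command line with and without `--discover`. The host, port and user are hypothetical, and using `admin` as the authentication database is an assumption. The password is left out on the expectation that the tool prompts for it.

```python
import shlex


def mongostat_command(host, port, user, discover=False):
    """Build a mongostat command line for a portal; host, port and user are placeholders."""
    args = [
        "mongostat",
        "--host", host,
        "--port", str(port),
        "--ssl",
        "--username", user,
        "--authenticationDatabase", "admin",  # assumption: credentials live in admin
    ]
    if discover:
        # Ask mongostat to report on the servers behind the mongos as well.
        args.append("--discover")
    return shlex.join(args)


portal_only = mongostat_command("portal-a.example.com", 10001, "appuser")
with_discovery = mongostat_command("portal-a.example.com", 10001, "appuser", discover=True)
```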
There are other utilities we've built into Compose MongoDB. Compose hosted databases can have their logs shipped out in real time to log analysis companies. They can also have their metrics shipped to metrics analysis companies. Unique to MongoDB on Compose is an oplog portal which gives access to the change stream between the two database servers in the shard's replica set.
We've configured MongoDB on Compose to be as reliable as possible with as much redundancy as we can fit in. Hopefully, this article will have given you more insight into why and how we do Compose MongoDB.
attribution Vadim Sherbakov