Be careful what you count - Noteworthy at Compose

This is your weekly summary of Compose news for those changes and updates which can make your life easier. In this edition, why counting can be tricky. We also take a look at the past week's Compose Articles.

A reminder about counting with your database

We often come across people reporting performance problems with their database. Frequently, the diagnosis comes down to what the user is doing with their key/value database. Specifically, we find that one of the user's applications is counting every key in the database.

Now, we understand that knowing how many keys are in your Redis or etcd database may be an interesting statistic to have. It can indicate how loaded the system is, because a large number of keys takes time to traverse. But counting them isn't the way to manage key numbers; in fact, regularly counting keys when you have a large number of them can starve your database of resources far quicker than any application could.

It's rare that the total number of keys translates into useful functional data. Consider instead counting the number of keys which belong to a particular business unit, queue or process. That's information that does translate into functional data. It also shows your database at its best, leveraging indexes to locate that information and then counting it.

"Every key in the database", on the other hand, is a pretty arbitrary metric; it could easily include data that doesn't map to any process, as it's just "all the keys". And there's no escaping the fact that counting all the keys means visiting all the keys, and that will take time.

That's why the Redis developers specifically advise against using KEYS *, the get-all-the-keys command we often find being used to count the number of keys in the database. They don't recommend it for counting subsets of the data either, and suggest using SCAN to work through those subsets, or using sets instead. Similarly with etcd, you can request all the keys in the database, but that will take time as it has to traverse every key to count it.
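
As a rough illustration of the SCAN approach, here's a minimal Python sketch that counts only the keys matching a pattern. It assumes a client object with the scan(cursor, match, count) method that the redis-py library provides. The count_matching helper, the "billing:*" key naming and the in-memory FakeScanClient stand-in (there only so the sketch runs without a server) are our own inventions for the example.

```python
from fnmatch import fnmatchcase


def count_matching(client, pattern, batch=1000):
    """Count keys matching `pattern` by walking the keyspace with SCAN.

    Assumes `client.scan(cursor=, match=, count=)` returns a
    (next_cursor, keys) pair, as redis-py does. SCAN may report a key
    more than once while the keyspace is changing, so treat the result
    as an estimate rather than an exact figure.
    """
    total = 0
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=batch)
        total += len(keys)
        if cursor == 0:  # a returned cursor of 0 means the walk is complete
            return total


class FakeScanClient:
    """Hypothetical in-memory stand-in for a Redis client (illustration only)."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def scan(self, cursor=0, match=None, count=10):
        # Hand back one page of keys, filtered after retrieval.
        page = self._keys[cursor:cursor + count]
        next_cursor = cursor + count
        if next_cursor >= len(self._keys):
            next_cursor = 0
        if match is not None:
            page = [key for key in page if fnmatchcase(key, match)]
        return next_cursor, page


if __name__ == "__main__":
    client = FakeScanClient(
        ["billing:1", "billing:2", "queue:a", "billing:3", "session:x"]
    )
    print(count_matching(client, "billing:*", batch=2))  # prints 3
```

Each SCAN call does a small slice of work rather than blocking the server the way KEYS does, but the full walk still visits every key, so a count like this belongs in an admin tool rather than on a hot path.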

Treat counting everything in your database as a "code smell". More often than not it can be the aroma of redundant tests or metric gathering which shouldn't have made it to production. If you really do need the data, pull the tests out of your application code and add them to your admin tools. Make sure you know when and how often they are being run.

If counting everything is part of your database architecture, you may want to step back and reconsider why and where you are doing that counting. You may find it more effective to count the various abstracted structures in the database and then aggregate the results of those counts.
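
One way to picture counting per structure and then aggregating is sketched below in Python. It assumes the application records each item in a Redis set named after its group (a hypothetical convention such as "members:billing"), so that a count is a single SCARD call per set. The count_by_group helper and the FakeSetClient stand-in are made up for the example; only the sadd and scard method names mirror redis-py.

```python
def count_by_group(client, group_names):
    """Return (per-group counts, total) using one SCARD call per group.

    Assumes `client.scard(name)` returns the size of the named set and
    0 for a set that does not exist, as Redis does.
    """
    counts = {name: client.scard(name) for name in group_names}
    return counts, sum(counts.values())


class FakeSetClient:
    """Hypothetical in-memory stand-in for a Redis client (illustration only)."""

    def __init__(self):
        self._sets = {}

    def sadd(self, name, *values):
        # Add members to the named set; return how many were new.
        members = self._sets.setdefault(name, set())
        before = len(members)
        members.update(values)
        return len(members) - before

    def scard(self, name):
        # Size of the named set, 0 if it has never been written.
        return len(self._sets.get(name, ()))


if __name__ == "__main__":
    client = FakeSetClient()
    client.sadd("members:billing", "invoice:1", "invoice:2")
    client.sadd("members:shipping", "parcel:9")
    counts, total = count_by_group(
        client, ["members:billing", "members:shipping", "members:returns"]
    )
    print(counts)
    print(total)  # prints 3
```

The trade-off is that the application has to keep those sets up to date as items come and go, but in return each count reads a size the database already tracks instead of traversing the keyspace.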

Compose Articles

Last week saw the announcement of the retirement of MongoDB Classic at Compose. There'll be no new MongoDB Classic deployments after August 20th. That's the first stage of a 12-month transition to close down MongoDB Classic at Compose. There were also notes on MySQL 5.7.22 and Redis security updates in last week's Noteworthy. Finally, there was NewsBits with news from the database and developer world: new PostgreSQL tools, Go updates, Python extensions and Microsoft's PowerShell for Ubuntu.

That's it for this week's Noteworthy at Compose. Onwards to next week!

The Compose Team