Debunking the Myth of RAID 10 as Best Practice on AWS


For the most amazing performance on AWS, use RAID 10 with provisioned i/ops (henceforth piops). A combination of RAID 10 and piops achieved read performance equaling the sum of all provisioned i/ops. Thus, 4 x 1000 piops disks achieved 4000 read i/ops.
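The arithmetic behind that claim can be sketched in a few lines. The function name is hypothetical, and the write-side halving is the textbook RAID 10 assumption (every write lands on both sides of a mirror), not something measured in this article.

```python
def raid10_iops(disks: int, piops_per_disk: int) -> dict:
    """Rough IOPS ceilings for a RAID 10 array of equally provisioned volumes."""
    if disks < 4 or disks % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    total = disks * piops_per_disk
    return {
        # Reads can be served by any disk, so the ceiling is the sum.
        "read": total,
        # Assumption: each write is mirrored, so only half the total is usable.
        "write": total // 2,
    }


# The 4 x 1000 piops example from the text.
print(raid10_iops(4, 1000))
```

Treat these as ceilings only. Real throughput depends on the workload, as the benchmarks in this article show.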

However, we live in a world that optimizes more than one variable. There are always costs.

On AWS, we find:

Given time, always optimize database usage, and ditch the RAID 10 performance requirement (see /articles/mongodb-performance-schema-design-more-important-than-anything-else/). Every checklist for AWS and MongoDB includes RAID 10. This inclusion is a relic of the past and oversells RAID 10's usefulness on the AWS platform.

We benchmark RAID 10, table scans, and optimization strategies. The optimization strategy returns more than 50x the performance.

Drive Configurations

The Gnarly Test

We are doing some truly gnarly things to MongoDB - like 66% table scans and 33% updates. We use YCSB as our benchmarking tool, and our configuration looks like:


With this test, the limiting factor will be drive performance. To ensure we stretch the disk, we are using a 22 GB dataset, which exceeds the server's 7.5 GB of RAM. Table scans on this test will churn the disk.


These results are awful, awful, awful! We are paying $1,976.48 per month for 38.3 ops/sec? Even on unrestricted, SSD-backed physical hardware, we did not achieve better performance.
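To put that price in perspective, here is a quick sketch of the cost per unit of throughput. It uses only the two figures quoted above; the helper name is made up for illustration.

```python
def monthly_cost_per_op_sec(monthly_cost_usd: float, ops_per_sec: float) -> float:
    """Dollars per month paid for each operation per second of sustained throughput."""
    return monthly_cost_usd / ops_per_sec


# Figures quoted in the text for the Gnarly Test.
cost = monthly_cost_per_op_sec(1976.48, 38.3)
print(f"${cost:.2f} per month for each op/sec")
```

That works out to a bit more than $50 per month for every single operation per second.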

To prove our point that optimization trumps RAID 10, we will run 2 more tests.

A Reasonable Mix

With the "Gnarly Test", table scans dominated the workload. Now we swing in the other direction and perform 50% inserts, 25% updates, 24% optimized reads, and 1% table scans.
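For readers who want to try something similar, the operation mixes quoted in the text map onto YCSB's CoreWorkload proportion properties. The sketch below is an assumption about how such a mix could be written, not the configuration actually used for these benchmarks. The `MIXES` dictionary and `to_properties` helper are hypothetical names, and settings such as record count and operation count are deliberately left out.

```python
# Operation mixes as quoted in the text, keyed by YCSB CoreWorkload property names.
MIXES = {
    "gnarly": {"scanproportion": 0.66, "updateproportion": 0.33},
    "reasonable": {
        "insertproportion": 0.50,
        "updateproportion": 0.25,
        "readproportion": 0.24,
        "scanproportion": 0.01,
    },
}

ALL_OPS = ("readproportion", "updateproportion", "insertproportion", "scanproportion")


def to_properties(mix: dict) -> str:
    """Render a mix as lines suitable for a YCSB workload properties file."""
    # Write every proportion explicitly so nothing is left to YCSB's defaults.
    return "\n".join(f"{op}={mix.get(op, 0.0):g}" for op in ALL_OPS)


print(to_properties(MIXES["reasonable"]))
```

Writing out the zero proportions is a deliberate choice: it makes each mix readable on its own, without needing to know what the tool assumes when a property is missing.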

Reducing table scans to 1% of operations yielded between 9x and 21x the performance.

An Optimized Mix

What difference does that remaining 1% of table scans make?

Removing all table scans yields between 58x and 113x the performance of the Gnarly results. Wow! We knew table scans were significant, but this is amazing.

Straight Optimized Reads

Want to create headline-grabbing benchmarks? Use 100% optimized reads, or 50% optimized reads and 50% optimized updates. Below is 100% reads:

When performing the straight optimized reads, the limitation was still the disk. To get to the next level of performance, order more RAM. By having a dataset fully in RAM, we could hit 15,000 - 25,000 reads per second.

The Recommendations

RAID 10 is for table scans. RAID 10 with provisioned i/ops is expensive. If you are running a database requiring table scans, RAID 10 will be your expensive savior. If you have very deep pockets, or a small dataset, covering your data size with RAM is a faster alternative.

Deep pockets are a terrible answer. Get a better return on investment by running an optimized database.

Moral of the Story

Optimize your MongoDB, save money, and increase performance. Relying on RAID 10 to save you from unoptimized queries is an expensive path. It will also provide less than stellar results.
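Finding the unoptimized queries is the first step. As a rough sketch, the helper below inspects a query's explain() output, passed in as a plain dictionary, and reports whether it points to a full table scan. The function name and the sample documents are hypothetical. It checks the "BasicCursor" marker used by older MongoDB releases and the "COLLSCAN" stage used by newer ones, and it only follows a single chain of `inputStage` entries, so plans that branch are not fully covered.

```python
def uses_table_scan(explain: dict) -> bool:
    """Return True if an explain() document indicates a full collection scan."""
    # Older MongoDB releases report the cursor type at the top level.
    if str(explain.get("cursor", "")).startswith("BasicCursor"):
        return True
    # Newer releases nest plan stages under queryPlanner.winningPlan.
    stage = explain.get("queryPlanner", {}).get("winningPlan")
    while stage:
        if stage.get("stage") == "COLLSCAN":
            return True
        stage = stage.get("inputStage")
    return False


# Hand-written sample documents, not real benchmark output.
unindexed = {"cursor": "BasicCursor", "nscanned": 1000000}
indexed = {"cursor": "BtreeCursor name_1", "nscanned": 1}
print(uses_table_scan(unindexed), uses_table_scan(indexed))
```

Any query flagged this way is a candidate for an index or a schema change before more money goes into faster disks.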

MongoHQ has the Slow Query Tracker and Profiler, which logs and records all slow queries. The profiler comes with a 1-click index create to help you optimize early and often.

Chris Winslett
Chris Winslett is a complex arrangement of Oxygen, Carbon and Hydrogen. Includes some salt. And beer.
