RAID, or “Redundant Array of Inexpensive Disks,” is the process of combining multiple hard drives or SSDs in parallel as one logical volume, making the array more resistant against drive failures. There are many kinds of RAID, and we’ll discuss which one you should choose.
What Is RAID?
Let’s say you have two 1 TB hard drives. In a normal PC, you would probably just plug them both in and have 2 TB of usable space to work with. However, in a server environment, it’s often better to use the second disk as a live copy of the first, forming a RAID array. This can be done in real time with a RAID controller, which connects to the drives and manages the RAID array for you. There’s also software RAID, but most servers will come with a dedicated RAID controller.
Without RAID, there’s no redundancy—however, this isn’t actually the main issue. Data should never really be lost with a good backup strategy, but if you lose a drive, that server can experience some serious downtime while being restored. This is not acceptable in a server environment, and is much, much worse than a temporary data loss.
RAID arrays can be rebuilt while still being usable, and when one drive fails, you won’t have to restore from backups. This is the primary advantage of RAID arrays. Servers are designed to never go down, even for maintenance in many cases. With a redundant RAID level, you can literally unplug a drive from a production web server, and it will keep on chugging, albeit with lower performance.
In many ways, RAID is much better than one big drive. One large 8 TB drive is not as resilient as five 2 TB drives configured in RAID 5. You’ll be hard pressed to find a server that comes with only one drive installed.
RAID works best with identical drives. It can work with various drives, but you’ll typically be limited to the speed and space of the slowest and smallest drive, making it suboptimal.
This whole discussion really only applies if you’re managing a server yourself, like a home NAS containing many hard drives; in that case, the type of RAID you choose is very important. If you’re renting virtual servers from AWS or basically any other provider, RAID will usually be configured for you by the hosting company, as that level of control is abstracted away from you.
A note before we begin: The numbers used to designate different RAID levels don’t really mean anything. RAID 5 isn’t five times better than RAID 1. There are other odd RAID levels, like RAID 2, 3, and 4, but they’re not used in practice, and aren’t worth explaining.
JBOD isn’t technically a RAID configuration, but it’s worth mentioning here. JBOD stands for “just a bunch of disks,” which is basically what it is. JBOD simply concatenates disks together into one big disk. This doesn’t offer any performance improvement, and doesn’t have any redundancy, but it doesn’t care at all about what disks go in it.
Many RAID controllers will offer a JBOD mode. You probably shouldn’t use it, unless you’ve gotten a bunch of different sized disks and want to link them together.
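To make the concatenation concrete, here is a minimal sketch of how a logical block number could be mapped onto differently sized disks. The function name `jbod_locate` and the block-based sizes are assumptions for illustration, not how any particular controller is implemented.

```python
from bisect import bisect_right
from itertools import accumulate


def jbod_locate(disk_sizes, logical_block):
    """Map a logical block to (disk index, block on that disk).

    disk_sizes is a hypothetical list of disk sizes in blocks; the disks
    are simply laid end to end, so they can all be different sizes.
    """
    ends = list(accumulate(disk_sizes))  # cumulative end of each disk
    if not 0 <= logical_block < ends[-1]:
        raise IndexError("block is outside the concatenated volume")
    disk = bisect_right(ends, logical_block)  # first disk ending after the block
    start = ends[disk] - disk_sizes[disk]     # where that disk begins
    return disk, logical_block - start


# Three mismatched disks: 100, 50, and 200 blocks.
print(jbod_locate([100, 50, 200], 120))
```

Each block lives on exactly one disk, which is why there is neither a speed boost nor any redundancy.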
Data in RAID 0 is striped across multiple drives; for example, if you want to read a file from the RAID array, you’ll be reading from multiple drives in parallel, which makes RAID 0 much faster than any single drive.
However, there’s no mirroring, parity, or other redundancy mechanism, so if a single drive fails, you lose all the data on the entire array. Because of this, RAID 0 is used when speed matters and redundancy isn’t necessary.
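As a rough illustration of striping, the sketch below spreads consecutive logical blocks across drives in round-robin order. It assumes a stripe unit of a single block, and `raid0_locate` is a hypothetical name; real arrays use larger, configurable chunk sizes.

```python
def raid0_locate(num_drives, logical_block):
    """Return (drive index, block on that drive) for round-robin striping.

    Assumes a stripe unit of one block, which is a simplification.
    """
    if num_drives < 2:
        raise ValueError("striping needs at least two drives")
    return logical_block % num_drives, logical_block // num_drives


# Six consecutive blocks on a two-drive array alternate between the drives,
# so a sequential read can pull from both at once.
for block in range(6):
    print(block, raid0_locate(2, block))
```

Because every drive holds a slice of every large file, losing one drive leaves holes throughout the data rather than removing a tidy subset of files.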
In a way, RAID 0 is very similar to having no RAID at all. It gives you the benefit of having all the drives in a single, large volume, as well as much higher access speeds. However, a single drive failure can be catastrophic to the data on the array, so you should never run RAID 0 without a backup solution unless the data is meant to be 100% ephemeral.
RAID 0 also maximizes capacity, since no space is used for redundancy. If you have two 1 TB disks, your array size will be 2 TB. However, RAID 0 is limited to the lowest disk size out of the array—if you try to RAID 0 a 2 TB drive with a 1 TB drive, you’ll only have 2 TB of space, with 1 TB going entirely to waste.
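The capacity rule above can be checked with a few lines of arithmetic. This is only a sketch, and `raid0_capacity` is a name made up for the example.

```python
def raid0_capacity(sizes_tb):
    """Return (usable TB, wasted TB) for a RAID 0 array.

    Every drive contributes only as much as the smallest drive holds.
    """
    usable = min(sizes_tb) * len(sizes_tb)
    wasted = sum(sizes_tb) - usable
    return usable, wasted


print(raid0_capacity([1, 1]))  # two matched 1 TB drives
print(raid0_capacity([2, 1]))  # a 2 TB drive paired with a 1 TB drive
```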
RAID 0 with SSDs is common, and more reasonable considering SSDs have lower failure rates. This is a common setup in high end desktop systems, since speed matters more than redundancy.
RAID 1 is another basic type of RAID. Similarly to RAID 0, it uses two or more disks, but rather than striping data across them, the data is mirrored from the first drive to the second (and any additional drives in the array). If you have two drives, one of them will be used entirely as a sort of real-time backup, halving your total storage capacity in the process. If either drive kicks the bucket, you can continue reading from the other drive, and rebuild the array by replacing the faulty drive.
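A toy model of mirroring might look like the following. The `Mirror` class and its in-memory dictionaries standing in for drives are purely illustrative assumptions; a real controller works on raw blocks and handles far more failure cases.

```python
class Mirror:
    """Toy RAID 1: every write goes to all healthy drives."""

    def __init__(self, num_drives=2):
        # Each "drive" is a dict of block number -> data; None marks a failed drive.
        self.drives = [dict() for _ in range(num_drives)]

    def write(self, block, data):
        for drive in self.drives:
            if drive is not None:
                drive[block] = data

    def read(self, block):
        # Any surviving drive can serve the read.
        for drive in self.drives:
            if drive is not None:
                return drive[block]
        raise IOError("all mirrors have failed")

    def fail(self, index):
        self.drives[index] = None

    def rebuild(self, index):
        # Replace the failed drive by copying from a surviving one.
        source = next(d for d in self.drives if d is not None)
        self.drives[index] = dict(source)


array = Mirror()
array.write(0, b"important")
array.fail(0)          # first drive dies
print(array.read(0))   # still readable from the second drive
array.rebuild(0)       # swap in a replacement and copy the data back
```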
This does have some read performance benefits, since two drives can be used, but since it’s reading the same data from each drive, it often isn’t as good as RAID 0. Write performance will be limited to the speed of the slowest drive.
RAID 1 is your only practical choice if you have two drives and can’t afford a drive failure taking out your data. It’s not the most efficient though, as you’re cutting your storage capacity in half, and thus it will cost twice as much as a comparable single drive.
However, the redundancy in a server setting is worth much more than the price of a single drive. If you just need a basic drive setup, go with a simple RAID 1 array. Most RAID controllers will default to RAID 1 when plugging in two drives.
RAID 5 is where things start to get interesting. Rather than duplicating data like RAID 1, RAID 5 uses a much more efficient method—parity.
Parity is a form of error checking, like a hash, but much simpler. It’s commonly used to make sure network traffic doesn’t get garbled up in the wires. Basically, say you have 7 bits of data that you want to send to someone, and you want to ensure that it gets there entirely intact. If a bit got flipped in transmission, they’d have no way of knowing. The solution is to count up all of the 1 bits. If there is an even number of ones, the parity will be 0. If there is an odd number of ones, the parity will be 1. You add this to the data you’re sending, and when the person on the other end receives it, they compute the parity themselves. If there’s been an error, and a bit has been flipped (even the parity bit itself), the other person will know, and request that the data be resent. Of course, if there are two errors in a single transmission, this system breaks, but that’s not as common.
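The parity scheme described above fits in a few lines. This is a minimal sketch with made-up helper names (`parity_bit`, `add_parity`, `looks_intact`), and it also shows the two-error blind spot.

```python
def parity_bit(bits):
    """Even parity: 0 if the number of ones is even, 1 if it is odd."""
    return sum(bits) % 2


def add_parity(bits):
    """Append the parity bit to the data before sending."""
    return bits + [parity_bit(bits)]


def looks_intact(frame):
    """The receiver recomputes parity; a valid frame has an even count of ones."""
    return sum(frame) % 2 == 0


data = [1, 0, 1, 1, 0, 0, 1]   # 7 bits with four ones, so the parity bit is 0
frame = add_parity(data)

one_flip = frame.copy()
one_flip[2] ^= 1               # a single bit garbled in transit

two_flips = one_flip.copy()
two_flips[5] ^= 1              # a second error cancels out the first

print(looks_intact(frame), looks_intact(one_flip), looks_intact(two_flips))
```

The last check passes even though the data is wrong, which is the double-error weakness mentioned above.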
Instead of storing copies of the data (which would be like sending a message twice), RAID 5 simply stores parity for each stripe of data. You can imagine it as RAID 0 with redundancy, and it requires a minimum of three drives. All but one of the drives are used like a regular RAID 0 array, but the last drive is used for parity. If one of the drives goes, you can perform the parity calculation in reverse to recover all the data that was on it (though this is a lengthy and intensive operation).
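In RAID 5 the parity is a bitwise XOR across the data blocks in a stripe, which is the same even-parity idea applied to every bit position at once. The sketch below uses two data blocks and one parity block on a hypothetical three-drive array; `xor_blocks` is a name chosen for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a three-drive array: two data blocks plus their parity.
drive0 = b"RAID"
drive1 = b"five"
parity = xor_blocks([drive0, drive1])

# Drive 0 dies. XOR the survivors together to get its contents back.
recovered = xor_blocks([drive1, parity])
print(recovered == drive0)
```

Running "the parity calculation in reverse" is the same XOR operation, just applied to whichever blocks survived. A rebuild has to do this for every stripe on the array, which is why it takes so long.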
In practice, RAID 5 doesn’t use a dedicated drive for parity, as it’s faster to stripe the parity blocks across all the drives, but you can think of it this way when calculating how much space a RAID 5 array will give you. Essentially, add up all but one of your drives, and that’s how much space you’ll have. RAID 5 gets more space efficient with more drives: three drives give about 67% efficiency, but 10 drives give 90%. This lowers the costs dramatically over RAID 1.
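The space calculation is simple enough to sketch. The function name `raid5_usable` is an assumption for illustration, and it assumes identical drives.

```python
def raid5_usable(drive_tb, num_drives):
    """Return (usable TB, efficiency) for RAID 5 built from identical drives."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    usable = drive_tb * (num_drives - 1)     # one drive's worth goes to parity
    efficiency = (num_drives - 1) / num_drives
    return usable, efficiency


for n in (3, 5, 10):
    usable, efficiency = raid5_usable(2, n)
    print(f"{n} x 2 TB drives: {usable} TB usable, {efficiency:.0%} efficient")
```

Five 2 TB drives, for example, come out to 8 TB of usable space.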
However, RAID 5 isn’t without a drawback. Since parity must be calculated whenever the array is written to, write performance is reduced. The problem is amplified by the fact that changing even a single bit on one drive requires extra reads (the old data and old parity, or the rest of the stripe) before the parity for that block can be recalculated and written back. As a rough rule, if RAID 0 gives performance scaling with n drives, RAID 5 gives n - 1 performance for write operations. With a large enough array though, the problem isn’t that bad.
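One common way to handle a small write, which the text above does not spell out, is to read the old data and old parity and fold the change into the parity with XOR. The sketch below assumes that read-modify-write approach; the helper names are hypothetical.

```python
def xor(a, b):
    """XOR two equal-length byte blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity, old_data, new_data):
    """Remove the old block's contribution from parity, then add the new one."""
    return xor(xor(old_parity, old_data), new_data)


# A stripe with three data blocks and one parity block.
d0, d1, d2 = b"aaaa", b"bbbb", b"cccc"
parity = xor(xor(d0, d1), d2)

# Overwrite d1: two reads (old d1, old parity) and two writes (new d1, new parity).
new_d1 = b"BBBB"
new_parity = updated_parity(parity, d1, new_d1)

# Same result as recomputing parity from every data block in the stripe.
print(new_parity == xor(xor(d0, new_d1), d2))
```

Either way, one logical write turns into several physical reads and writes, which is where the write penalty comes from.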
Also, no matter how many drives you have, you can only survive one drive failure. This wouldn’t seem like a major issue, since failures are uncommon and you’re unlikely to experience two of them at the same time, but array rebuilds can be very intensive on your drives—you’re basically reading every single bit of data off each one, at the time when they are most vulnerable. So if one of them goes, there’s a higher chance that another drive could fail as well.
RAID 5 should be your go-to option if you have three drives, as RAID 1 would be a waste of space. If you have 4 drives, it’s still probably the best option, but the other two options on this list are also available to you.
RAID 6 is like RAID 5, except each stripe carries two independent parity blocks instead of one, so you can think of it as having two “parity disks.” This allows your array to survive two drive failures. However, write performance is worse at roughly n - 2, and you will of course have less space.
There’s really not much more to say about it. If you’ve got a large array of drives (six, eight, or more), you might want to consider RAID 6 for its extra redundancy. The 3-2-1 backup strategy says to store at least three copies of your data, with two backups on different media and at least one of those located offsite. RAID 6 is not a backup, but its redundancy is in the spirit of the first part: it can survive two drive failures, giving it the same fault tolerance as RAID 1 with three disks (rebuild times aside).
In practice, RAID 6 will almost never experience a total array failure, especially if you add more parity disks into the equation. This, combined with backups and copies in other datacenters, is how archive services like AWS Glacier and Backblaze achieve 99.999999999% durability.
RAID 10 (1+0)
RAID 10 is technically a form of Nested RAID, which is another complicated thing all its own. Basically, if you have four drives, and don’t want to use RAID 5 or 6, your only other options are RAID 0 and 1, which both have their issues. Instead, you split those drives in half, make two RAID 1 arrays, and then take those arrays and use them to make one big RAID 0 array. RAID 10 requires at least four drives, and also requires an even number of total drives.
This gives you all the benefits of RAID 1 and RAID 0 without many downsides: fast read speeds, fast write speeds, high redundancy, and easy rebuilds, while still being able to use half of the total space of all of your drives. RAID 10 can also survive more failures than a two-drive RAID 1. In a four-drive array where Disk 0 and Disk 1 form one mirrored pair and Disk 2 and Disk 3 form the other, Disk 1 and Disk 3 could both fail and the array could still be fully rebuilt (though if both Disk 0 and Disk 1 fail, the array is unrecoverable).
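Which combinations of failures a RAID 10 array survives can be checked with a short sketch. It assumes the mirrored pairs are numbered consecutively (Disk 0 with Disk 1, Disk 2 with Disk 3, and so on), and `raid10_survives` is a name invented for the example.

```python
def raid10_survives(num_drives, failed):
    """Return True if every mirrored pair still has at least one working disk.

    Assumes pairs are (0, 1), (2, 3), ... which is a layout chosen for
    illustration; real controllers may pair drives differently.
    """
    if num_drives < 4 or num_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least four")
    failed = set(failed)
    pairs = [(i, i + 1) for i in range(0, num_drives, 2)]
    return all(not (a in failed and b in failed) for a, b in pairs)


print(raid10_survives(4, {1, 3}))  # one disk lost from each pair
print(raid10_survives(4, {0, 1}))  # both halves of the same pair lost
```

So the array always survives one failure, and may survive more depending on which disks go.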
RAID 10 is a very common RAID level for servers. It’s very fast, and can survive one drive failure at the minimum. The only real issue is the price, as you’re still paying double to keep copies of all your data, but for general workloads, RAID 10 beats out nearly every other RAID configuration for speed, losing only to RAID 0 for throughput.
RAID 50/60 is basically two RAID 5 or 6 arrays in RAID 0. This improves performance just like RAID 10, most importantly improving write performance, since each parity calculation only involves the drives in its own smaller sub-array.
It requires at least six drives (eight in the case of RAID 60), and since there are separate RAID 5 arrays, you’ll need additional parity drives, making it less space efficient, but a bit more resilient. Overall, RAID 50 is basically a more performant version of RAID 5.
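The drive counts and space cost can be worked out with a small sketch. It assumes identical drives split evenly into two sub-arrays, and `nested_parity_capacity` is a hypothetical helper, not part of any RAID tool.

```python
def nested_parity_capacity(drive_tb, total_drives, groups=2, parity_per_group=1):
    """Usable TB for RAID 50 (parity_per_group=1) or RAID 60 (parity_per_group=2)."""
    if total_drives % groups:
        raise ValueError("drives must divide evenly into the sub-arrays")
    per_group = total_drives // groups
    # RAID 5 needs 3 drives per group and RAID 6 needs 4: parity plus two data drives.
    if per_group < parity_per_group + 2:
        raise ValueError("not enough drives in each sub-array")
    return groups * (per_group - parity_per_group) * drive_tb


print(nested_parity_capacity(1, 6))                      # RAID 50, six 1 TB drives
print(nested_parity_capacity(1, 8, parity_per_group=2))  # RAID 60, eight 1 TB drives
```

Six 1 TB drives in RAID 50 give 4 TB, compared with 5 TB for the same drives in a single RAID 5 array, which is the space traded for the extra parity.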