RAID: Redundant Array of Inexpensive Disks
RAID is a common, mature technology for joining multiple drives (HDDs or SSDs) into a single logical pool of storage that is redundant, larger, or both. It operates at a lower level than the filesystem, so it works with any filesystem.
A RAID array can be built with a hardware controller or via software such as mdadm on Linux. Seedboxes commonly use RAID0 (if any RAID at all), but we have opted for RAID5 and RAID10.
Different RAID Levels
RAID0 stripes data across multiple drives in chunks, in a manner that emphasizes performance. Data is split into chunks and spread over the drives: with a 512KiB chunk size, for example, the first 512KiB goes to Drive 1, the second chunk to Drive 2, the third to Drive 3, the fourth to Drive 4, the fifth back to Drive 1, and so on.
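The round-robin placement above can be sketched as a few lines of Python. This is only an illustration of the chunk-to-drive mapping, not real RAID driver code; the chunk size and drive count match the example in the text.

```python
# Illustrative sketch of RAID0 chunk placement (not real driver code):
# with a 512 KiB chunk size, chunk i of the data lands on drive i % n.

CHUNK_SIZE = 512 * 1024  # 512 KiB, the example chunk size from the text
NUM_DRIVES = 4

def drive_for_offset(byte_offset: int, num_drives: int = NUM_DRIVES) -> int:
    """Return the 1-based drive index holding the given byte offset."""
    chunk_index = byte_offset // CHUNK_SIZE
    return (chunk_index % num_drives) + 1

# The first 512 KiB goes to Drive 1, the next to Drive 2, and so on,
# wrapping back around to Drive 1 after Drive 4.
assert drive_for_offset(0) == 1
assert drive_for_offset(512 * 1024) == 2
assert drive_for_offset(4 * 512 * 1024) == 1  # fifth chunk wraps around
```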
What this achieves is near linear performance scaling: with 2 drives you get roughly double the performance, with 6 drives roughly 6 times, in simple terms (in practice it is never quite that simple). The big drawback is that the failure rate compounds with every drive added, and with no redundancy built in, any single drive failure means complete loss of all data. The math is 1−(1−r)^n, r being the annualized failure rate of one drive and n the number of drives. So with a single drive annualized failure rate of maybe 2.5%, a 4-drive array has roughly a 9.6% chance of failing within the year, and that is with high quality, low failure rate drives. If the drives are not so great, say 10% annual failure rate, the 4-drive array has about a 34% chance of failure within the year!
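Plugging the formula into a few lines of Python makes the compounding easy to check; the failure rates below are the illustrative figures from the text.

```python
# Array failure probability for a stripe with no redundancy:
# 1 - (1 - r)**n, where r is the annualized failure rate of one
# drive and n is the number of drives in the array.

def raid0_failure_probability(r: float, n: int) -> float:
    """Chance that at least one of n independent drives fails in a year."""
    return 1 - (1 - r) ** n

# 2.5% AFR drives in a 4-drive stripe: ~9.6% yearly chance of total loss.
print(round(raid0_failure_probability(0.025, 4), 3))  # 0.096
# 10% AFR drives in the same stripe: ~34% yearly chance of total loss.
print(round(raid0_failure_probability(0.10, 4), 3))   # 0.344
```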
SSDs these days should have much lower failure rates, and since SSD storage capacity is extremely limited, we have chosen RAID0 with daily remote backups for our SSD Seedboxes. The backups provide the required data resiliency at much lower cost than any of the other RAID levels.
RAID5 is a stripes-plus-parity scheme where one drive's worth of space in the array is used for redundancy, via distributed parity. Any single drive may fail and the array remains recoverable. Data is striped in the same manner as RAID0, so read performance scales nearly linearly just like in RAID0, but writes suffer a penalty from calculating and saving the parity. The size of this penalty depends on many factors: how the RAID5 array has been set up, what is being written, and so on. It can be quite significant or barely noticeable. Fortunately, Seedboxes tend to be read heavy, not write heavy. A rule of thumb is 95% read performance and 80% write performance.
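The recovery property comes from parity being an XOR of the data chunks, so any single missing chunk can be rebuilt from the survivors. A toy sketch, assuming fixed-size chunks (real mdadm parity is computed per stripe and rotated across the drives):

```python
# Toy RAID5-style parity: the parity chunk is the XOR of the data
# chunks, so any single missing chunk can be rebuilt from the rest.
# (Actual RAID5 rotates which drive holds parity, stripe by stripe.)

def xor_parity(chunks: list[bytes]) -> bytes:
    """XOR equal-length byte chunks together."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return bytes(parity)

data = [b"AAAA", b"BBBB", b"CCCC"]   # three data drives
parity = xor_parity(data)            # the parity ("fourth drive")

# Drive 2 fails: rebuild its chunk from the survivors plus parity.
rebuilt = xor_parity([data[0], data[2], parity])
assert rebuilt == b"BBBB"
```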
With a RAID5 array of 4 drives, you do "lose" 25% of the storage capacity, since one drive's worth is used for redundancy. RAID5 will hence survive one drive failing, but not a second failure before the array is resynced with a fresh drive. RAID5 arrays are generally kept to low drive counts, such as 4 or 6, to reduce that risk. A second drive failure is most likely during the resync, since ALL data is read during that process to produce the new parity chunks. Calculating the chances of data loss due to a second drive failure is quite hard, and data is scarce. If we assume a drive rebuild takes 24 hours and the chance of a drive failing is 50x greater during that period, the chance is maybe a few percent, and that is with poor quality drives which entered service at the same time, from the same manufacturing batch, on the same shipment; yes, all of these factor in. What we like to do is try our best to mix and match drive manufacturing batches, ages and even vendors, which gives the drives different wear characteristics and brings the chance of a second failure back down to around the normal failure rate. Even if that happened, there would still be a chance of at least partial data recovery.
You still have the same chance of a drive failing during that time, but the chance of actual data loss is mitigated to a tiny fraction. If each drive failure carries a 1% chance of actual data loss (a second drive failing during rebuild), then on average that can happen 100 times before data is lost, which with the 4-drive poor quality array works out to well over 100 years, or with 6 drives to 50+ years.
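The back-of-the-envelope arithmetic behind those year figures can be written out explicitly. The 1% loss-per-failure figure and the 10% AFR are the illustrative assumptions from the text, not measured values.

```python
# Rough mean time to data loss, assuming each single-drive failure
# carries a fixed chance of a second failure during rebuild.
# All inputs are the illustrative figures from the text.

def mean_years_to_data_loss(afr: float, drives: int,
                            p_loss_per_failure: float) -> float:
    failures_per_year = afr * drives              # expected failures per year
    failures_until_loss = 1 / p_loss_per_failure  # mean failures before loss
    return failures_until_loss / failures_per_year

# 4 poor-quality drives (10% AFR), 1% loss per failure: ~250 years.
print(round(mean_years_to_data_loss(0.10, 4, 0.01)))  # 250
# 6 drives: still comfortably over 50 years.
print(round(mean_years_to_data_loss(0.10, 6, 0.01)))  # 167
```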
These are the reasons we have chosen RAID5 for the bulk of our Seedbox servers: it strikes a good balance between maximal storage and maximal performance.
RAID10 is a nested RAID level, a mirror of stripes or a stripe of mirrors depending on the variety. Here we discuss how the Linux software RAID (mdadm) RAID10 functions, rather than the usual hardware variety, which does not always yield the performance one expects. RAID10 is best used for maximal performance with a high degree of redundancy. With 4 drives, you can have 2 drives fail (as long as they are not both copies of the same data) and the array is still recoverable. RAID10 also achieves linear read performance scaling and near linear write performance scaling at just shy of half the raw performance (every write goes to 2 drives). This yields about 98% read performance and about 45% write performance. Half of the storage is "lost".
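The capacity trade-offs of the three levels discussed so far can be summarized in one small function. This is a simplification, assuming n identical drives and ignoring the small amount of space real arrays reserve for metadata.

```python
# Rough usable capacity for the RAID levels discussed, given n
# identical drives of drive_tb terabytes each. A simplification:
# real arrays also reserve a little space for metadata.

def usable_capacity(level: str, n: int, drive_tb: float) -> float:
    if level == "raid0":
        return n * drive_tb        # no redundancy: full capacity
    if level == "raid5":
        return (n - 1) * drive_tb  # one drive's worth goes to parity
    if level == "raid10":
        return n / 2 * drive_tb    # every block is mirrored once
    raise ValueError(f"unknown level: {level}")

# 4 x 10 TB drives:
print(usable_capacity("raid0", 4, 10.0))   # 40.0
print(usable_capacity("raid5", 4, 10.0))   # 30.0 (25% "lost")
print(usable_capacity("raid10", 4, 10.0))  # 20.0 (half "lost")
```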
However, RAID10 has the nice property that read performance per unit of storage is doubled, and as such it is often the choice where performance is the main concern, sometimes even when redundancy itself is not meaningful but you need increased performance for a single set of (usually small) data. In these cases RAID10 can yield better performance gains for the user than RAID0 would! It is all about the particular use case, and this holds true even for SSDs.
As such the use cases for RAID10 are varied: some want redundancy without sacrificing performance, some want the utmost highest performance possible. It is the top choice for web hosting, databases, virtual machines and other workloads where performance is the primary consideration.
That's why our Dragon series largely utilizes RAID10: performance with decent storage.