ZFS RAID - which to choose?
Parity protection is a common technique for reliable data storage on media that may fail (HDDs, SSDs, storage servers, etc.). Calculation of parity uses tools from algebra in very interesting ways, in particular in the dual parity case. - How RAID-6 dual parity calculation works
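To make the dual parity idea concrete, here is a minimal sketch for one byte per data disk. It assumes the common RAID-6 construction (P is a plain XOR, Q is a weighted sum over GF(2^8) with field polynomial 0x11D and generator 2); ZFS RAID-Z2 may differ in its details, and all function names are made up for illustration.

```python
from functools import reduce

POLY = 0x11D  # assumed field polynomial: x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8): carry-less multiply, reduced by POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    """Inverse of a nonzero byte: a^254, because a^255 == 1 in GF(2^8)."""
    return gf_pow(a, 254)


def parity(data):
    """Return (P, Q) for a stripe holding one byte per data disk."""
    p = reduce(lambda x, y: x ^ y, data, 0)
    q = 0
    for i, d in enumerate(data):
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data, x, y, p, q):
    """Rebuild the bytes of lost data disks x and y from the survivors, P and Q."""
    pxy, qxy = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            pxy ^= d                        # leaves D_x ^ D_y
            qxy ^= gf_mul(gf_pow(2, i), d)  # leaves g^x*D_x ^ g^y*D_y
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(gf_mul(gy, pxy) ^ qxy, gf_inv(gx ^ gy))
    dy = pxy ^ dx
    return dx, dy


stripe = [0x12, 0xA7, 0x00, 0xFF]
p, q = parity(stripe)
damaged = [0x12, None, 0x00, None]  # disks 1 and 3 are gone
assert recover_two(damaged, 1, 3, p, q) == (0xA7, 0xFF)
```

With a single lost disk, the XOR parity P alone is enough; Q is only needed for the second failure, which is where the field arithmetic comes in.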
Even today, though, ZFS has one major disadvantage compared to something like Linux’s software RAID system. ZFS doesn’t allow you to grow an existing RAID-Z vdev by adding drives to it.
Alternatives
- Mirroring / HN
- you can create 3-way mirror - requires 3 disks and 2 may fail - still no data lost
- you can create 4-way mirror - requires 4 disks and 3 may fail - still no data lost
- … (reads will also be spread across the disks).
RAIDZ2 / RAID6
Fault-tolerant: can withstand the failure of any 2 disks in the array.
The minimum number of drives where RAID-Z2 makes sense is four, which gives you the effective storage capacity of two of the drives and allows losing any two of the drives. This is advantageous over using the same four drives in a 2x2 mirror setup, because if you have two mirror vdevs of two drives each and lose the wrong combination of two drives (both halves of the same mirror), that vdev is dead and takes the pool with it in its fiery demise.
If you want three drives in ZFS and want redundancy, set them up as a three-way mirror vdev.
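The four-drive comparison can be checked by brute force. The sketch below enumerates every possible pair of failed drives; the drive labels and helper names are hypothetical, and the model only asks whether each vdev still has enough members left.

```python
from itertools import combinations


def mirror_pool_survives(failed, vdevs):
    """A pool of mirrors survives while every mirror vdev keeps at least one disk."""
    return all(any(d not in failed for d in vdev) for vdev in vdevs)


def raidz2_survives(failed):
    """A single RAID-Z2 vdev survives any loss of up to two disks."""
    return len(failed) <= 2


disks = ["a", "b", "c", "d"]
mirrors = [("a", "b"), ("c", "d")]  # the 2x2 mirror layout

pairs = list(combinations(disks, 2))
mirror_ok = sum(mirror_pool_survives(set(f), mirrors) for f in pairs)
raidz2_ok = sum(raidz2_survives(set(f)) for f in pairs)

print(f"2x2 mirrors survive {mirror_ok} of {len(pairs)} double failures")
print(f"RAID-Z2 survives {raidz2_ok} of {len(pairs)} double failures")
```

Both layouts give two drives of usable space, but the mirrors lose the pool in 2 of the 6 possible double failures, while RAID-Z2 survives all 6.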
Growing ZFS can be inefficient and costly
You can’t yet add disks to an existing RAID-Z vdev. If you create a RAID-Z2 with 6 disks, it will always have 6 disks.
You have two upgrade paths when you need more space.
- You can replace all 6 disks with larger disks. You swap out one disk at a time, then let the RAID rebuild (resilver) before swapping the next. After the last disk has been swapped and resilvered, your zpool will grow (automatically if the pool's autoexpand property is on).
- You can also create a second zpool. You might do this by creating another RAID-Z2 with 6 more disks. When you do this, you will have two disks’ worth of redundancy in both RAID-Z2 arrays. This is a little wasteful.
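A rough capacity model shows why both upgrade paths feel inefficient. It assumes every member of a RAID-Z2 counts as the smallest disk in the array and that two disks' worth of space goes to parity; it ignores metadata and padding overhead, and the 4 TB and 8 TB sizes are only example values.

```python
def raidz2_usable_tb(disk_sizes_tb):
    """Approximate usable space: (disks - 2 parity) * smallest member."""
    return (len(disk_sizes_tb) - 2) * min(disk_sizes_tb)


# Path 1: replace six 4 TB disks with 8 TB disks, one at a time.
disks = [4] * 6
history = [raidz2_usable_tb(disks)]
for i in range(len(disks)):
    disks[i] = 8  # swap one disk, then wait for the resilver
    history.append(raidz2_usable_tb(disks))
print(history)  # [16, 16, 16, 16, 16, 16, 32]

# Path 2: a second 6-disk RAID-Z2 next to the first one.
two_arrays = 2 * raidz2_usable_tb([4] * 6)
one_wide_array = raidz2_usable_tb([4] * 12)  # hypothetical single 12-disk array
print(two_arrays, one_wide_array)  # 32 40
```

The extra space only shows up after the sixth swap, and two separate arrays spend four disks on parity where one wider array would spend two.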
RAIDZ / RAID5
Fault-tolerant: can withstand the failure of any 1 drive in the array.
Disks are ten times larger now, so you really want to have two disks' worth of redundancy in each array. This really goofs up the math on my RAID 5 examples above, doesn’t it? This means you should be using at least RAID 6, RAID-Z2, or RAID 10.
RAID 10
Adding a disk to a Linux RAID 10 array: to be able to do this later, you have to create the array with the near-copies layout.
If you have two disks, Linux’s RAID 10 operates just like a RAID 1 mirror. You can lose either disk, and you won’t lose any data.
If you have an odd number of disks, like the three disks in my current array, Linux’s RAID 10 implementation will stagger your data. The first block will be on disks one and two, the second block will be stored on disks three and one, and the third block will be stored on disks two and three. With three disks, I can lose any single disk without losing my array.
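The staggering described above can be sketched as a round-robin placement of copies. This is a simplified model of a two-copy "near" layout, not the actual md driver logic, and the function name is hypothetical.

```python
def near_layout(num_disks, num_blocks, copies=2):
    """Map each block to the disks holding its copies, filling slots round-robin."""
    layout = {}
    for block in range(num_blocks):
        first_slot = block * copies
        layout[block] = [(first_slot + c) % num_disks for c in range(copies)]
    return layout


# Three disks, three blocks; printed with 1-based numbering to match the text.
for block, disks in near_layout(3, 3).items():
    print(f"block {block + 1}: disks {[d + 1 for d in disks]}")
```

For three disks this prints disks [1, 2], [3, 1] and [2, 3] for the first three blocks. Every block lives on two different disks, so any single disk can fail without data loss.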
Things will be safer once I reach four disks. At that point, I will be able to lose up to two disks without data loss. Unlike RAID 6, which two disks I lose is extremely important! Your odds of surviving the loss of two disks in a RAID 10 increase with drive count. With RAID 6, you can lose any two disks without losing data.
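The claim that the odds improve with drive count follows from a short argument: after the first failure, exactly one of the remaining disks is the dead disk's mirror partner. A minimal sketch, assuming an even number of disks arranged in mirrored pairs and a second failure that is equally likely to hit any surviving disk:

```python
from fractions import Fraction


def raid10_double_failure_survival(num_disks):
    """Chance that a second random failure does not hit the first disk's partner."""
    return 1 - Fraction(1, num_disks - 1)


for n in (4, 6, 8, 12):
    odds = raid10_double_failure_survival(n)
    print(f"{n} disks: {odds} ({float(odds):.0%})")
```

Under these assumptions four disks survive two failures 2 times out of 3, and the figure approaches but never reaches the 100% that RAID 6 gives for any two disks.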
Isn’t your cost per terabyte higher with RAID 10? - Yes, and the comparison will continue to get worse as my array grows.
- On day one, it cost me $280 to have 4 TB of usable space. Assuming I used 4 TB drives, and I would have, it would have cost $540 to have at least that much space available with RAID 6.
- Next year, I expect to have 8 TB of usable RAID 10 storage, and it will have cost me $460 or less.
- As far as storage is concerned, at four drives, RAID 10 and RAID 6 would be a tie.
- Starting with the fifth drive, RAID 6 would have netted me a full drive of usable space per additional drive, versus half a drive with RAID 10.
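The capacity side of this comparison is simple arithmetic. The sketch below assumes equal-sized drives (4 TB here, matching the example above) and two-copy RAID 10; prices are left out because they are not known beyond the figures quoted.

```python
def raid10_usable_tb(num_disks, size_tb):
    """Two copies of everything: half of the raw space is usable."""
    return num_disks * size_tb / 2


def raid6_usable_tb(num_disks, size_tb):
    """Two disks' worth of parity: the rest is usable."""
    return (num_disks - 2) * size_tb


print("disks  RAID10  RAID6")
for n in range(4, 9):
    print(f"{n:5d}  {raid10_usable_tb(n, 4):6.1f}  {raid6_usable_tb(n, 4):5d}")
```

The two layouts tie at four drives; from there each extra drive adds 4 TB under RAID 6 and 2 TB under RAID 10.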
RAIDZ3
4 and 5 disk RAID-Z3 pools seem small. 7 seems borderline (not much different than just going with RAID 1).
The Solaris Internals Wiki recommends that RAIDZ3 configurations should start with at least 8 drives (5+3).
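The intuition about small RAIDZ3 pools can be put in numbers by comparing the fraction of raw space that is left for data. This is a rough sketch that ignores padding and metadata overhead; the helper name is made up.

```python
from fractions import Fraction


def raidz_efficiency(width, parity):
    """Fraction of raw capacity available for data in one RAID-Z vdev."""
    return Fraction(width - parity, width)


mirror = Fraction(1, 2)  # a two-way mirror keeps half of the raw space
for width in (4, 5, 7, 8):
    eff = raidz_efficiency(width, 3)
    print(f"RAIDZ3 with {width} disks: {eff} usable ({float(eff):.1%}), "
          f"two-way mirror: {float(mirror):.0%}")
```

At 4 and 5 disks RAIDZ3 keeps less usable space than a two-way mirror, at 7 disks it is only slightly ahead (4/7), and at the recommended 8 disks it reaches 5/8.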
Note
- Your RAID 6 or RAID-Z2 array can lose any two disks without compromising your data. Your RAID 10 array is effectively built out of mirrored pairs. - Choosing a RAID Configuration For Your Home Server
- A resilver isn’t more likely to kill an existing drive than a scrub. And the general advice is to do those semi-regularly. - HN
- We (rsync.net) have several PB of raidz3 deployed all over the world. - rsync HN