Double your RAID1!
Over the years I’ve tended to prefer RAID5, as it has always seemed to be the recommended way to combine resilience and speed. Recently, however, with larger and less reliable drives, I’ve noticed a few problems with RAID5, and while looking into those problems I’ve also noticed that RAID5 is no longer the recommended solution!
RAID6 now seems to be preferred over RAID5, but what really caught my eye was RAID10, also written as RAID1+0.
The problem with RAID1 (i.e. a pure two-disk mirror) is that, in addition to the resilience of having two identical disks, there is a tendency to think ‘great, two disks in parallel, twice the performance!’. Unfortunately, not so! Whereas RAID5 stripes data across disks to obtain a performance boost, effectively reading in parallel from multiple drives, RAID1 does not, and simply reads from one of its two drives.
Although there is (apparently) some logic to this, it does seem rather wasteful. After all, if the data you want is mirrored on two drives, why can’t you read some from each drive in half the time it would take from one?
Apparently the answer is RAID10, which (apparently) is the same as RAID1+0: a RAID built from four disks, where each pair runs in a RAID1 configuration and a RAID0 is applied on top (the RAID1 layer provides the resilience and the RAID0 layer provides the striped reads). Wouldn’t you know it, Linux does support RAID10 natively, on two- and four-disk combinations (and quite possibly more). Unfortunately it doesn’t work as expected! Certainly in all the combinations I’ve tried I can’t get anything close to double the speed of a single disk, which is really what I was looking for!
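For reference, this is the sort of native md RAID10 I was experimenting with; the ‘far’ (f2) layout is the variant documented as giving striped read performance from just two disks. The device and partition names here are assumptions, so substitute your own:

```shell
# A native two-disk md RAID10 in the 'far' (f2) layout -- the variant
# intended to stripe reads across both mirrored disks.
# /dev/sda2 and /dev/sdb2 are placeholder partitions.
mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=2 \
      --chunk=256 /dev/sda2 /dev/sdb2

# Confirm the level and layout that were actually created.
mdadm --detail /dev/md0 | grep -E 'Raid Level|Layout'
```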
So, it must be possible .. yes?
The Solution (or one of many possible)
First, I’m going to illustrate how to do this via the Ubuntu server installer, given that a RAID1+0 root partition ought to be a desirable solution for any Linux user with two or more disks.
- Run through the installation until you hit the disk partitioner
- Create 4 partitions, two per disk, each half the size of the required root partition
- Use Alt-F2 to acquire a new Console and hit return to get a prompt
- (I’m going to assume sda2, sda3, sdb2 and sdb3 as the four devices)
- Now create the two RAID1 arrays; note the device order, this is critical:
mdadm --create /dev/md1 --level=1 -n2 --chunk=256 /dev/sda2 /dev/sdb3
mdadm --create /dev/md2 --level=1 -n2 --chunk=256 /dev/sdb2 /dev/sda3
- Now create a RAID0 on top:
mdadm --create /dev/md0 --level=0 -n2 --chunk=256 /dev/md1 /dev/md2
- Watch the RAID rebuild until you are all sync’d up (cat /proc/mdstat)
- Switch back to the installer, go back one stage, continue using /dev/md0 as your root partition
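Before letting the installer carry on, it’s worth sanity-checking the nesting with something along these lines (exact output will vary with your hardware):

```shell
# All three arrays should show as active, with md0 stacked on md1/md2.
cat /proc/mdstat

# Check each layer individually: md1 and md2 should report raid1,
# and md0 should report raid0 with the two md devices as members.
mdadm --detail /dev/md1
mdadm --detail /dev/md2
mdadm --detail /dev/md0
```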
So, how did we do? Here’s the performance of a raw drive:
# dd if=/dev/sda of=/dev/null bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 75.6593 s, 139 MB/s
And now on the RAID1+0:
# dd if=/dev/md0 of=/dev/null bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 54.3901 s, 193 MB/s
So where now? We’re obviously reading from two drives, but it looks like after reading a stripe from one drive there isn’t always quite enough available from the second drive to allow the process to ‘stream’. I’m guessing this will depend on the hardware in question, with regard to buffer sizes, block sizes etc., but for my disks I’ve found that increasing the read-ahead on the underlying devices is the key .. and the read-ahead needs to be at just the right value for optimal performance.
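The experimenting below is roughly how I’d sweep the candidate read-ahead values; the device names and the read size are assumptions, and you’ll need root (dropping the page cache between runs matters, otherwise later runs just replay cached data):

```shell
#!/bin/sh
# Sweep a few read-ahead values (in 512-byte sectors) and time a large
# sequential read from the striped array after each change.
for ra in 128 256 512 1024 2048; do
    blockdev --setra "$ra" /dev/sda     # apply to both member disks
    blockdev --setra "$ra" /dev/sdb
    sync
    echo 3 > /proc/sys/vm/drop_caches   # avoid re-reading cached data
    echo "read-ahead = $ra sectors:"
    dd if=/dev/md0 of=/dev/null bs=1M count=2000 2>&1 | tail -n 1
done
```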
After a little experimenting I came up with an optimal figure of 512 and inserted the following lines into my /etc/rc.local:
blockdev --setra 512 /dev/sda
blockdev --setra 512 /dev/sdb
Now, following a cold boot, I get the following:
# dd if=/dev/md0 of=/dev/null bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 36.9388 s, 284 MB/s
And I can still pull a disk and the system carries on running quite happily, albeit with reduced disk IO speed.
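Pulling a disk can also be rehearsed in software first. Marking one disk’s members faulty, removing them, and re-adding them looks roughly like this (partition names assumed from the layout above):

```shell
# Simulate losing sda: fail its partitions out of both mirrors.
mdadm /dev/md1 --fail /dev/sda2
mdadm /dev/md2 --fail /dev/sda3

# The arrays keep running degraded; reads now come from sdb alone.
cat /proc/mdstat

# Remove the failed members, then re-add them to trigger a resync.
mdadm /dev/md1 --remove /dev/sda2
mdadm /dev/md2 --remove /dev/sda3
mdadm /dev/md1 --add /dev/sda2
mdadm /dev/md2 --add /dev/sda3
```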