[nycbug-talk] FreeBSD software RAID?

Peter Wright pete at nomadlogic.org
Thu Mar 8 13:40:21 EST 2007


>>>>>> "sk" == Steven Kreuzer <skreuzer at f2o.org> writes:
>>>>>> "gr" == George Rosamond <george at ceetonetechnology.com> writes:

>
> yeah maybe.  again, I'm heavy on ranting and short on experience, but
> at least going from my _friends'_ experience with hardware RAID, I
> intend to stay the hell away from any RAID-on-a-card, period.
>

That's crazy; hardware RAID is by far the preferred method for implementing
disk redundancy.  Aside from the fact that it allows you to offload the
RAID management and logic to hardware dedicated to the task, most decent
controllers offer BBUs (Battery Backup Units), which not only allow better
I/O rates but also help prevent loss of data during a catastrophic event.
Without a battery-backed cache, the only way one can guarantee that data
makes it to disk is to use synchronous writes, and that will hinder disk
I/O substantially.  That kind of protection is not possible with software
RAID implementations, as you are dependent upon the health of the OS to
ensure data makes it to disk.  You think vendors like NetApp/EMC/IBM/etc.
use software to implement low level RAID functionality?
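
To make the synchronous-write cost concrete, here is a minimal Python
sketch of what "guaranteeing data makes it to disk" looks like from
userland.  The function names and the temp file are made up for
illustration.  os.fsync() only asks the kernel to flush; whether a
drive's own write cache honors that depends on the hardware, so treat
this as a sketch and not a durability proof.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data, then block until the kernel reports it flushed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than asked, so loop
            written += os.write(fd, data[written:])
        # This is the expensive part: the caller waits for the flush.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


def demo():
    """Hypothetical usage: write one record durably and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "journal.dat")
        durable_write(path, b"journal record\n")
        with open(path, "rb") as f:
            return f.read()
```

Every call waits on the flush before returning, which is where the I/O
hit comes from when there is no battery-backed cache to acknowledge the
write early.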

> First, many of them don't have an NVRAM.  Some have something they
> call an NVRAM, but they use it to store metadata, not for a write
> cache to plug the RAID5 write hole.  This is the whole reason for
> doing hardware RAID: to get that NVRAM to fix the RAID5 write hole.
>

OK, let's be careful here.  There is the configuration data that the
hardware RAID controller knows about, which will often be stored in the
card's NVRAM.  Then there is the filesystem metadata, which is stored on
disk.  It's completely possible, and reasonable, to swap hardware RAID
cards, configure the new card with the same RAID configuration, and have
your data on disk be intact.
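
For anyone following along, the RAID5 write hole mentioned in the quoted
text can be shown with a toy XOR-parity model in Python.  This is only
an illustration under simple assumptions (three data blocks, one parity
block, made-up contents); it is not how any particular controller or
software RAID driver is implemented.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length blocks byte by byte to get the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Reconstruct one missing data block from the survivors plus parity."""
    return parity(surviving_blocks + [parity_block])


# A consistent stripe: three data blocks and their parity.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
p = parity(stripe)

# Normal case: the disk holding block 1 dies and is rebuilt correctly.
recovered_ok = rebuild([stripe[0], stripe[2]], p)

# Write hole: block 0 is rewritten, but power is lost before the parity
# block is updated, so p is now stale relative to the data.
stripe[0] = b"ZZZZ"

# If the disk holding block 1 dies now, the rebuild silently returns
# garbage instead of b"BBBB".
recovered_bad = rebuild([stripe[0], stripe[2]], p)
```

A write cache that survives power loss lets the controller finish the
interrupted data-plus-parity update after restart, which is the point of
wanting NVRAM or a BBU there.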


> Second, there are too many horror stories of RAID cards losing entire
> arrays.  The card goes bad or gets confused.  It's part of Dell's
> card-of-the-month club, and a replacement card is unobtainable, and
> new cards won't work with the array.  Or the array's metadata was
> stored on the old card's so-called-but-not-really NVRAM, so the new
> card understands the old array but won't recognize it.  or the
> configurator tool is clunky and buggy and won't give back your array,
> or there's more than one configurator like one in BIOS and one in DOS
> and one in Windows, and only one tool works and the others are decoys,
> or whatever.

I do not know of any production-class RAID controllers that store all
metadata for files on the controller itself.  Maybe I'm not reading this
correctly, though...


>
> With software RAID, you can back up your metadata on _paper_ if you
> want to, and type it in by hand---the array will still work.  If
> you're concerned about your method of paper backup, you can test it on
> a non-live filesystem.  Deliberately delete/confuse your metadata, and
> force-recreate the array, see if it passes fsck and 'pax -r . >
> /dev/null'.  Keep trying until you have a written procedure that
> works.  Label the physical disks with their names on the sheet of
> paper (so you've recorded their stripe ordering).  so there is less
> possibility software RAID will refuse to see your array because some
> little pointer block got mangled, than with the card-RAID.  And you
> don't have to worry about multiple opaque configurator tools---there's
> just one, and it's native to the OS, and it's available on the
> LiveCD/installCD/whatever.
>

Yeah, right, let's try typing in the metadata for a 2TB volume :)  A better
solution may be to back up your metadata to some sort of digital archive
medium (tape/DVD etc.) - but I don't even know of any software RAID
implementations that allow you to store your filesystem metadata outside of
the RAID array.  This is something that is often done with hardware RAID
controllers - in fact it's a recommended configuration for SANs that do
high I/O from multiple clients.  Reading and modifying metadata is a
pretty expensive operation.

I think you may be thinking of your software RAID configuration data
here, not metadata...

<snip>
>
> I'm sure a bunch of people can chime in and say ``I've used
> RAID-on-a-card, and I can't stress enough how close to zero is the
> number of problems I've had with it.  It is really close to zero.
> It's so unbelieveably close to zero, it IS zero, so I think it must be
> very trustworthy.''  Well, that's great, I'm just saying I've heard
> more than one story from someone who HAS had some stupid problem with
> some expensive RAID-on-a-card that they really shouldn't be having.
>

I'd be willing to bet any problems people have had with hardware RAID may
have been due to misconfiguration of the array itself, or a
misunderstanding about the fundamentals of configuring RAID.


<snip>

> A mirror is also very nice for snapshots.  You can break the mirror,
> do something dangerous, and then resync it only if you succeed.
> Sometimes either side of the mirror is bootable, so that's extremely
> nice.


Snapshotting and RAID/mirroring/etc. are two completely independent
concepts.  Granted, most people will need some sort of RAID
implementation when snapshotting, due to the amount of data it
generates.  Snapshotting allows an admin to take an image (a snapshot)
of the current state of a file system and store it in a read-only
location on the volume/disk.  Many people use this in addition to
traditional backup policies, as a "nearline" backup for example - or even
take a snapshot of a volume and then back that up rather than the live
data.
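
To show why a snapshot is a separate idea from a mirror, here is a toy
Python sketch of one common way snapshots are done, copy-on-write.  The
Volume class and its method names are hypothetical and only model the
idea in memory; real filesystems and volume managers differ in the
details.

```python
class Volume:
    """Toy block device with copy-on-write snapshots (illustration only)."""

    def __init__(self, nblocks):
        self.blocks = [b""] * nblocks
        # One dict per snapshot: block number -> contents at snapshot time.
        self.snapshots = []

    def snapshot(self):
        """Take a snapshot; nothing is copied until a block changes."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, n, data):
        """Write a block, saving the old contents for each snapshot once."""
        for snap in self.snapshots:
            snap.setdefault(n, self.blocks[n])
        self.blocks[n] = data

    def read_snapshot(self, snap_id, n):
        """Read a block as it looked when the snapshot was taken."""
        return self.snapshots[snap_id].get(n, self.blocks[n])
```

A backup job can read through read_snapshot() and get a stable view
while the live volume keeps changing, and none of that involves a
second disk or any redundancy.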


-pete

-- 
~~oO00Oo~~
Peter Wright
pete at nomadlogic.org
www.nomadlogic.org/~pete
310.869.9459


