[Semibug] RAID 0 or 1 for OpenBSD

Nick Holland nick at holland-consulting.net
Sun Jul 25 13:28:46 EDT 2021


On 7/8/21 3:37 PM, Jonathan Drews wrote:
> Hi:
> 
>   I looked for a RAID 0 or 1 storage solution at CDW
> https://www.cdw.com/. The online tech person said that they don't have
> any RAID arrays for OpenBSD. I then did $ apropos raid, on my OpenBSD
> and looked at the resulting man pages. Some of those controllers seem
> really antiquated.
>   I want to use RAID to speed up my backups. Should I just use softraid
> (man -s 4 softraid)? The raid would be connected to my laptop(s) by
> USB. I want to achieve faster backups. My princ

(and yes, again late to the party, but I got opinions here. :) )

As Anthony suggested, what you gain from RAID0 is probably not worth
it.  Often your backup performance is limited by something OTHER than
just disk speed.

I've used disk-to-disk backup for many years, and told myself once that
"RAID wasn't needed, because it's [get ready for it] Just A Backup!"

Well, of course, when something breaks, your "Just A Backup" isn't the
first thing you fix.  But that, of course, means you have NO backups.
And then something else breaks...and you've got nothing.

Plus, there are different kinds of backups.  There's the backup you go to
when your system dies today -- the only reason you care about yesterday's
backup is when it turns out today's didn't work.  Then there are archive
backups, where you want to retain data for YEARS, not days.  Those had
better be on SOME kind of redundant storage -- not RAID 0.

If performance is a problem, I'd suggest looking at some kind of
incremental, synthetic-full backup system, like my favorite
"rsync --link-dest" setup, which I have offered to reprise.  I do have
a really cool addition to it now -- a File Alteration Reporting Tool
that looks for unexpected changes in files.  Not only are your backups
fast, they are also really, really useful -- much more so than most
other kinds of backups.
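
For the curious, the core of that scheme is a single rsync invocation.
This is just a rough sketch -- the paths and dates here are made up,
not my actual layout:

  # yesterday's run already lives in /backup/2021-07-24
  rsync -a --delete \
      --link-dest=/backup/2021-07-24 \
      laptop:/home/ /backup/2021-07-25/

Files that haven't changed since yesterday get hard-linked against the
previous run, so every dated directory looks (and restores) like a full
backup, but only the changed files actually cross the wire or take up
space.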


I'm firmly on the side of "take no absolute side" when it comes to HW
RAID vs. SW RAID.  A few thoughts on the differences:

HW RAID:
* Often easier to set up.
* You MUST have spare hw on site -- if it is your RAID controller that
dies, your data will not be retrievable without a compatible RAID card.
If the industry has moved on and your PCI RAID card can't be plugged
into a modern, PCIe-only machine, you may have trouble recovering your
data.
    * By spare hw, I mean IDENTICAL to the production HW -- same ROM
      revisions and all.  Yes, I've seen that matter.
* Almost always restricted to two drives when running RAID1.


SW RAID:
* Better OS compatibility, more hw flexibility.
* "just works" on almost any HW.
* More complicated and OS-dependent to set up (a rough OpenBSD softraid
sketch follows this list).
* Often supports a three-drive RAID1.
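
Since softraid(4) came up in the original question, the "OS dependent"
part looks roughly like this on OpenBSD.  Device names are examples and
the syntax is from memory -- check bioctl(8) and the FAQ before
trusting it:

  # give each disk an OpenBSD fdisk partition
  fdisk -iy sd0
  fdisk -iy sd1
  # in each disklabel, create an 'a' partition with FS type "RAID"
  disklabel -E sd0
  disklabel -E sd1
  # assemble the two RAID partitions into one RAID 1 volume
  bioctl -c 1 -l sd0a,sd1a softraid0

The resulting volume shows up as another sd(4) device (sd2, say), which
you newfs and mount like any other disk.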

DOESN'T MATTER EITHER WAY:
* You must recognize that setting up a RAID array is not the issue;
you don't get to pat yourself on the back until you have done realistic
failure and recovery tests.  You need a spare drive to rebuild to
(i.e., one that has never seen your RAID card, or has been zeroed since
being seen by it).
    * That "easier to set up" benefit of HW RAID can vanish really
      quickly here.  Or not -- some HW RAID is really easy-going about
      replacing drives.  Others are a nightmare.
* You must have some kind of monitoring in place to detect and report
a failed drive.  It COULD be regularly walking past the machine and
verifying all drives are blinking, listening for a beep, whatever.
But SOMETHING has to be able to say "THIS machine has lost a drive
and you need to fix it", and be noticed in a timely manner.  A minimal
cron-driven sketch follows this list.
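
Here's the sketch I promised above, softraid flavored.  The volume name
(sd2), the script path, and the exact status strings are assumptions
from memory -- verify against bioctl output on your own machine:

  #!/bin/sh
  # /usr/local/sbin/raidcheck (hypothetical path) -- run daily from
  # root's crontab; mails root if bioctl reports an unhealthy state.
  status=$(bioctl sd2)
  echo "$status" | grep -Eq 'Degraded|Offline|Failed' &&
      echo "$status" | mail -s "RAID trouble on $(hostname)" root

The point isn't this particular script; it's that SOMETHING notices the
failure without a human staring at blinking lights.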

Many software RAID systems support something people often don't realize
is "a thing" -- three (or more!) drive RAID1.  No, RAID1 (in concept) is
not limited to two copies of the same data.  Three drive RAID1 is a
wonderful thing -- lose a drive?  You still have two chances to pull data
back from the remaining drives.  I suspect most people who have worked
a few years in IT will have a horror story about a server that lost a
drive, and when trying to rebuild the array, they found a bad spot on the
remaining drive, too.  It's rare that it happens in an undetected way,
but it does happen (and save me the ZFS lovefest -- ZFS creates its own
issues, which have caused me more problems than surprise bad data on
thought-good disks ever has).  For unknown reasons, I've found this
feature on only one (no longer available) HW RAID solution, but it is
seemingly common on SW RAID.
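
On OpenBSD, for instance, I believe softraid will take a third chunk --
the creation step just grows by one RAID partition (untested, from
memory, device names made up; see bioctl(8)):

  # three RAID partitions, one mirror -- two spare copies of your data
  bioctl -c 1 -l sd0a,sd1a,sd2a softraid0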

I'm going to boringly reiterate the importance of real-life recovery
testing with RAID before going into production:
* Replace the drive
* Replace the controller
* Replace the rest of the box
* What if the new drive is different than the old drive?  Generally not
TOO big a problem, but you might end up having to replace a 6T disk with
an 8T disk when you discover the new 6T disks are a few sectors smaller
than your old one.  How does your RAID system deal with that?  Can you
pad the old disks when you create the array so that maybe 5% is
unallocated, in case the new drives are a hair smaller when you need one?
(And yes, at least one HD manufacturer has shipped multiple drives with
the EXACT same model number but different sector counts!)  A rough
softraid-flavored sketch of the replace-and-rebuild step, padding
included, follows.
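
Here's that promised sketch of the drive-replacement test, softraid
flavored.  sd3 is the replacement disk and sd2 is the degraded volume --
both assumptions, and the syntax is from memory, so check bioctl(8):

  # partition the replacement just like the originals
  fdisk -iy sd3
  # in disklabel, recreate the 'a' RAID partition; when asked for a
  # size, consider giving it ~5% less than the whole disk instead of
  # '*', so a slightly smaller future replacement still fits
  disklabel -E sd3
  # kick off the rebuild onto the new chunk
  bioctl -R /dev/sd3a sd2

Then keep an eye on bioctl sd2 until the volume reports Online again.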

Nick.


