[nycbug-talk] ZFS and firewire - conditions for a perfect storm

Miles Nordin carton at Ivy.NET
Mon Jun 30 03:23:53 EDT 2008

>>>>> "il" == Isaac Levy <ike at lesmuug.org> writes:

    il> 1) The firewire bus could possibly be losing track of which
    il> device is which- and confusing ZFS.  In my daisy-chain setup,
    il> when one drive in the chain dies, (say, da2), and it's removed
    il> from the chain, it seems to become the previous drive
    il> (e.g. da1).

zpool export <pool> ; zpool import <pool>

I think that will ``just work.''
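
A minimal sketch of that sequence, assuming a pool named `tank` (a
hypothetical name).  The `recycle_pool` wrapper and the `DRY_RUN`
variable are illustrative only and are not part of ZFS.  The idea is
that exporting releases the devices, and importing makes ZFS scan the
device nodes again and find the pool members by their on-disk labels
rather than by whatever daN name they had before.

```shell
# Sketch: export and re-import a pool so ZFS re-reads the on-disk labels.
# "tank" below is a hypothetical pool name.
# With DRY_RUN=1 the function only prints the commands it would run.

recycle_pool() {
    pool=$1
    for sub in export import; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "zpool $sub $pool"
        else
            # Stop at the first failure so a failed export is not
            # followed by an import attempt.
            zpool "$sub" "$pool" || return 1
        fi
    done
}

# Example (prints the two commands without touching any pool):
#   DRY_RUN=1; recycle_pool tank
```

The export will likely refuse to proceed if filesystems in the pool are
busy, so this is something to try with the pool idle.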

    il> (Anyone know about OpenSolaris/Firewire/ ZFS?  How's that for
    il> esoteric :)

yeah, I used this.  I've used mirrors only, no raidz2.

 * I haven't fooled around with any of that OpenSolaris or Nexenta
   stuff.  I've used only Solaris 10 U<n> and various SXCE builds.

 * The non-Oxford-911 case that I had would crash, and the case
   itself had to be rebooted.  This was confusing because for a while
   I thought the driver/OS was messed up.

 * ZFS could handle a case crashing during use, but ZFS had problems
   if a case crashed during a scrub.

 * error reporting through the firewire bridge is not always
   fantastic, and smartctl would not pass through, so diagnosing
   failing disks is significantly harder when they're inside a
   firewire case.

 * for mirrors, ZFS wasn't great about remembering that the mirror was
   dirty and needed resyncing.  If I rebooted during a resync, it
   wouldn't continue where it left off, and wouldn't start over---it
   would just quit trying to resync and accumulate checksum errors.
   The resync, when it did complete, often wasn't adequate to stop a
   stream of ``checksum errors'' over the next few weeks---I had to
   manually request a zpool scrub if half the mirror ever bounced.
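
For the last point, a rough sketch of how one might check whether a
manual scrub is warranted.  It assumes the usual `zpool status` config
table with a `NAME STATE READ WRITE CKSUM` header; the `cksum_offenders`
helper and the pool name `tank` are hypothetical, and the column layout
may differ between releases.

```shell
# Sketch: list the rows of the "zpool status" config table whose CKSUM
# counter is non-zero.  Reads "zpool status" output on stdin.
# Assumes the header line reads: NAME STATE READ WRITE CKSUM

cksum_offenders() {
    awk '
        # The header row marks the start of the device table.
        $1 == "NAME" && $5 == "CKSUM" { in_cfg = 1; next }
        # A blank line ends the table.
        in_cfg && NF == 0             { in_cfg = 0 }
        # Print the name and counter for any row with a non-zero CKSUM.
        in_cfg && NF >= 5 && $5 != 0  { print $1, $5 }
    '
}

# Example, with "tank" as a hypothetical pool name:
#   zpool status tank | cksum_offenders
#   zpool scrub tank        # request the scrub by hand if anything shows up
```

This only reads the counters; it does not decide anything by itself,
and the pool and mirror rows are reported alongside the disks when
their counters are non-zero as well.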

Because of some of these problems and cost, I've moved to
ZFS-over-iSCSI.  It's very slow and has problems still, but works
better than the firewire did for me.

I think ZFS is the Future, but the more I use it the less confidence I
have in it.
