[nycbug-talk] ZFS and firewire - conditions for a perfect storm

Miles Nordin carton at Ivy.NET
Mon Jun 30 16:25:54 EDT 2008


>>>>> "il" == Isaac Levy <ike at lesmuug.org> writes:

    il> [root at blackowl /usr/home/ike]# zpool export Z
    il> cannot unmount '/Z/shared': Device busy

maybe this is the FreeBSD version of 'no valid replicas', the generic
banging-head-against-wall message Solaris gives you when it's trying
to ``protect'' you from doing something ``dumb'' like actually fixing
your fucked-up array.

you can try erasing zpool.cache and then 'import -f'.
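
A rough sketch of that sequence, assuming the FreeBSD cache location
/boot/zfs/zpool.cache (Solaris keeps it in /etc/zfs/zpool.cache) and
the pool name Z from the quoted text.  The helper names and the DRY_RUN
switch are made up for illustration, and it moves the cache aside
rather than erasing it so the step can be undone.  By default it only
prints what it would do.

```shell
#!/bin/sh
# Sketch only: CACHE, DRY_RUN, run() and reimport_pool() are illustrative
# assumptions, not something from the thread.
CACHE="${CACHE:-/boot/zfs/zpool.cache}"   # FreeBSD default location
DRY_RUN="${DRY_RUN:-1}"                   # 1 = print only, 0 = really run

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reimport_pool() {
    pool="$1"
    # Move the cache aside instead of deleting it, so it can be put back.
    run mv "$CACHE" "$CACHE.bak"
    # A reboot may be needed at this point so the kernel forgets the pool
    # (assumption); afterwards, force the import:
    run zpool import -f "$pool"
}
```

Calling `reimport_pool Z` prints the two commands; set DRY_RUN=0 only
once the printed plan looks right for your system.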

    il> - Or, sometimes it just hangs like I described previously.

I find 'zpool status' hangs a lot.  A status command should never
never never cause disk I/O or touch anything that could go into
uninterruptible sleep.  Especially, a system-wide status command needs
to not hang because one pool is messed up, any more than it's
acceptable for failures in one pool to impact availability of the
whole ZFS subsystem.  (AFAIK failures correctly don't spill over in
terms of stable/fast filesystem access to pools other than the one
with problems.  But for 'zpool status' they do, so if you consider
the zpool command part of the ZFS subsystem, then they do spill over.)
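
Until the command itself behaves, a script that calls 'zpool status'
can at least avoid hanging along with it.  This is a minimal sketch,
not a fix: status_with_deadline is a made-up helper that runs a command
in the background, polls once a second, and stops waiting after a
deadline.  It deliberately never calls wait on a timed-out child,
because a process stuck in uninterruptible sleep ignores even SIGKILL
and waiting for it would hang the caller too.

```shell
#!/bin/sh
# Sketch only: status_with_deadline is a hypothetical helper name.
# usage: status_with_deadline SECONDS command [args...]
status_with_deadline() {
    secs="$1"; shift
    tmpf="$(mktemp)" || return 1
    # Send output to a file so nobody blocks reading a pipe from a hung child.
    "$@" >"$tmpf" 2>&1 &
    pid=$!
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$secs" ]; then
            # Best effort; has no effect while the child is in disk wait.
            kill -9 "$pid" 2>/dev/null
            echo "timed out after ${secs}s (pid $pid may be stuck)"
            rm -f "$tmpf"
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # The child has exited, so collecting its status cannot block.
    wait "$pid"
    status=$?
    cat "$tmpf"
    rm -f "$tmpf"
    return "$status"
}
```

For example, `status_with_deadline 10 zpool status` either prints the
normal output or gives up after about ten seconds with exit code 124.
The stuck process itself stays around until the I/O it is waiting on
completes.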

    il> + Again, and after digging around lists online, this one leads
    il> me to believe that the only people who've done a great job
    il> implementing firewire is Apple, (it's theirs to begin with).

I just tried it, and smartctl doesn't work for me over firewire on
Apple either.  I'm using the smartctl in NetBSD pkgsrc and Mac OS X
10.5.3.  I think it's a limitation of the firewire bridge chip, not
the OS's driver stack.  well...it is a limitation of the OS stack in
that there's no defined way to pass the commands through the bridge,
so the OS doesn't implement them, but the real limitation is in the
bridge chip and the standards that define how they should work.

I think.  It's odd that DVD burners ``just work'' I guess.  but...I
bet, for example, those special commands one can send to Lite-On
drives to make them rpc1 so dvdbackup works better, would not pass
through a firewire bridge.  untested though.

of course the error reporting stuff may be a different story, may
actually be firewire stack problems, but again I would expect the
case's bridge chip to interfere with error reporting, and some cases
to handle disks going bad better than others.

    il> -- I believe for any future growth at home, I'll simply start
    il> thinking towards using SATA and known good controllers,
    il> (Areca, 3ware, Adaptec, etc...).

from what I've heard/understood, be sure to get a battery for the
controller's write cache, because it's necessary for correctness, not
just for speed.  Otherwise you need to do RAID3, which means you need
a filesystem that supports large sector sizes, which you don't have.

Another thing to worry about with this RAID-on-a-card crap is
controllers going bad.  If I were using such a controller rather than
ZFS, I'd buy a spare controller and put it on the shelf (in case the
model which understands my RAID metadata goes out of production), and
I'd test the procedure for moving disks from one controller to another
BEFORE the controller breaks, and BEFORE putting any data on the
raidset.

    il> Yeah, I think ZFS is the future too- and is simply a matter of
    il> time and maturing.

yeah, but it's really not maturing very quickly at all compared to
SVM, LVM2, ext3, HFS+, netapp/emc/vendorware storage stuff, or
basically anything at all that's not dead-in-the-water abandonware
like FFS/LFS/RAIDframe.  It seems to be maturing at about the same
speed as Lustre, which is too fucking slow.  I don't know what the
hell they _are_ working on, besides this stability stuff.  If I had a
Sun support contract I'd have opened at least five big fat bugs and
would be pestering them monthly for patches.  There are known
annoying/unacceptable problems they are not fixing after over two
years.  When Solaris 11 ships it is still going to be rickety flakey
bullshit.  It's not exactly a disappointment, but it IS flakey
bullshit.