[nycbug-talk] ZFS and firewire - conditions for a perfect storm
Isaac Levy
ike at lesmuug.org
Mon Jun 30 19:42:54 EDT 2008
So, I think I'm coming around to a modified marketing slogan for ZFS:
"ZFS likes cheap disks, especially SATA/PATA; not so hot for firewire,
and who knows about USB."
On Jun 30, 2008, at 4:25 PM, Miles Nordin wrote:
>>>>>> "il" == Isaac Levy <ike at lesmuug.org> writes:
>
> il> [root at blackowl /usr/home/ike]# zpool export Z cannot unmount
> il> '/Z/shared': Device busy
>
> maybe this is the freebsd version of 'no valid replicas', the generic
> banging-head-against-wall message Solaris gives you when it's trying
> to ``protect'' you from doing something ``dumb'' like actually fixing
> your fucked-up array.
>
> you can try erasing zpool.cache and then 'import -f'.
>
> il> - Or, sometimes it just hangs like I described previously.
Cool- thanks for the heads-up on this approach; I'm learning a lot more
about ZFS... (stuff I didn't necessarily want to know :)
However, for the record here, I just tried unplugging a drive as
before (to bring on a disk I/O hang), deleted the zpool.cache, and
tried 'import -f' - and it's all just hung.
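For the record, the attempt looked roughly like this - as far as I
recall FreeBSD keeps the cache at /boot/zfs/zpool.cache, and my pool
is named Z, so treat the exact paths as a sketch:
--
[root at blackowl /usr/home/ike]# rm /boot/zfs/zpool.cache
[root at blackowl /usr/home/ike]# zpool import -f Z
(hangs once the pulled disk enters the picture)
--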
The OS keeps chugging along nicely, though (UFS2 on an internal disk).
/me sighs, reboots, and starts fresh again...
>
>
> I find 'zpool status' hangs a lot. A status command should never
> never never cause disk I/O or touch anything that could
> uninterruptible-sleep. Especially, a system-wide status command needs
> to not hang because one pool is messed up, any more than it's
> acceptable for failures in one pool to impact availability of the
> whole ZFS subsystem (which AFAIK they correctly don't spill over, in
> terms of stable/fast filesystem access to pools other than the one
> with problems. but for 'zpool status', they do, so if you consider
> the zpool command part of the ZFS subsystem then they do spill over.)
>
> il> + Again, and after digging around lists online, this one leads
> il> me to believe that the only people who've done a great job
> il> implementing firewire is Apple, (it's theirs to begin with).
Oy- you are correct here, Miles!
On an Apple machine, using a firewire disk, after installing
smartmontools, I can't get even a lick of info out of the firewire
drive:
plumb:~ ike$ smartctl -a disk8
smartctl version 5.38 [i386-apple-darwin9.3.0] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Smartctl open device: disk8 failed: Operation not supported by device
plumb:~ ike$
--
And using Apple's diskutil, good stuff like SMART isn't supported:
plumb:~ ike$ diskutil info disk8
Device Identifier: disk8
Device Node: /dev/disk8
Part Of Whole: disk8
Device / Media Name: WiebeTech
Volume Name:
Mount Point:
Partition Type: GUID_partition_scheme
Bootable: Not bootable
Media Type: Generic
Protocol: FireWire
SMART Status: Not Supported
Total Size: 931.5 Gi (1000204886016 B) (1953525168 512-byte blocks)
Free Space: 0.0 B (0 B) (0 512-byte blocks)
Read Only: No
Ejectable: Yes
Whole: Yes
Internal: No
OS 9 Drivers: No
Low Level Format: Not Supported
plumb:~ ike$
--
Wow. Firewire is kind of making me sad.
>
>
> I just tried it, and smartctl doesn't work for me over firewire on
> Apple either. I'm using the smartctl in NetBSD pkgsrc and Mac OS
> 10.5.3. I think it's a limitation of the firewire bridge chip, not
> the OS's driver stack. well...it is a limitation of the OS stack in
> that there's no defined way to pass the commands through the bridge,
> so the OS doesn't implement them, but the real limitation is in the
> bridge chip and the standards that define how they should work.
>
> i think. It's odd that DVD burners ``just work'' i guess. but...i
> bet, for example, those special commands one can send to Lite-On
> drives to make them rpc1 so dvdbackup works better, would not pass
> through a firewire bridge. untested though.
>
> of course the error reporting stuff may be a different story, may
> actually be firewire stack problems, but again I would expect the case
> to interfere with error reporting, and some cases to handle disks going
> bad better than others.
>
> il> -- I believe for any future growth at home, I'll simply start
> il> thinking towards using SATA and known good controllers,
> il> (Areca, 3ware, Adaptec, etc...).
>
> from what I've heard/understood, be sure to get a battery because it's
> necessary for correctness, not just for speed. Otherwise you need to
> do RAID3, which means you need a filesystem that supports large
> sector sizes, which you don't have.
Ah- well, it depends on the controller- that's a whole other thing.
I meant that I'd snag some fairly inexpensive and well-supported SATA
cards with lots of ports, and use them for ZFS volumes- and ditch
firewire. ZFS doesn't seem to have these gross problems at all with
the SATA stuff I've used (Areca, Adaptec, 3Ware).
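Something like this is all I mean - handing ZFS whole disks off a
plain SATA controller (the pool name and device names here are made
up, adjust for whatever the controller actually exposes):
--
# zpool create tank raidz da0 da1 da2 da3
# zpool status tank
--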
And yeah I agree- don't skimp on the batteries for a given controller
if you use it for hardware RAID :)
>
>
> Another thing to worry about with this RAID-on-a-card crap is
> controllers going bad. If I were using such a controller rather than
> ZFS, I'd buy a spare controller and put it on the shelf (in case the
> model which understands my RAID metadata goes out of production), and
> I'd test the procedure for moving disks from one controller to another
> BEFORE the controller breaks, and BEFORE putting any data on the
> raidset.
Buying cards to put on the shelf is actually a plan I've put into
action several times in recent years (after getting stuck with ancient
and irreplaceable Compaq cards going bad...)
A trend I like seeing recently, which changes this game, is that
Supermicro and Tyan server motherboards are coming with 8 SATA ports
onboard, with something like an LSI controller built in. The 1U
high-density boxes I tend to deploy for jobs go out in pairs or
triples- and usually some component failure happens either immediately
(warranty replacement) or well after the working life of the machines
is past (3-4 years). I've rarely seen the machines/cards/etc. fail in
the middle space, but that's just my experience...
>
>
> il> Yeah, I think ZFS is the future too- and is simply a matter of
> il> time and maturing.
>
> yeah, but it's really not maturing very quickly at all compared to
> SVM, LVM2, ext3, HFS+, netapp/emc/vendorware storage stuff, or
> basically anything at all that's not dead-in-the-water abandonware
> like FFS/LFS/RAIDframe. It seems to be maturing at about the same
> speed as Lustre, which is too fucking slow. I don't know what the
> hell they _are_ working on, besides this stability stuff. If I had a
> Sun support contract I'd have opened at least five big fat bugs and
> would be pestering them monthly for patches. There are known
> annoying/unacceptable problems they are not fixing after over two
> years. When Solaris 11 ships it is still going to be rickety flakey
> bullshit. It's not exactly a disappointment, but it IS flakey
> bullshit.
Hrmph. Yeah, I do worry about things maturing fast enough to stay
alive long term. With disks, buggy crap like this has to go away
really FAST or else users will...
Rocket-
.ike