From skreuzer at exit2shell.com Sun Jun 1 19:28:58 2008
From: skreuzer at exit2shell.com (Steven Kreuzer)
Date: Sun, 1 Jun 2008 19:28:58 -0400
Subject: [nycbug-talk] Breakout sessions at NYCBSDCon
Message-ID: <20080601232858.GA96344@scruffy.exit2shell.com>

Greetings-

In case you are not aware, we have been working with the BSD Certification team to give an exam at the conference in October.

We will have a classroom at the conference that will be used to hold breakout sessions to help let people brush up on some of the topics that will be covered on the exam.

We are currently looking for individuals who can give 30 minute refresher courses on the following topics:

1. Subnetting
2. tar/pax/cpio
3. Regular Expressions
4. pf (however, the exam is more on recognition than hands on)
5. Permissions (octal, symbolic, chmod, umask, etc.)
6. Basic Unix tasks (cron jobs, symbolic links, simple sh scripts, etc)
7. rc and sysctl

If you would like to volunteer, please attend the next NYCBUG meeting on June 4th and come and find me so we can work out the details.

Thanks

--
Steven Kreuzer
http://www.exit2shell.com/~skreuzer

From george at ceetonetechnology.com Mon Jun 2 00:13:08 2008
From: george at ceetonetechnology.com (George Rosamond)
Date: Mon, 02 Jun 2008 00:13:08 -0400
Subject: [nycbug-talk] Breakout sessions at NYCBSDCon
In-Reply-To: <20080601232858.GA96344@scruffy.exit2shell.com>
References: <20080601232858.GA96344@scruffy.exit2shell.com>
Message-ID: <48437354.30401@ceetonetechnology.com>

Steven Kreuzer wrote:
> Greetings-
>
> In case you are not aware, we have been working with the BSD Certification team
> to give an exam at the conference in October.
>
> We will have a classroom at the conference that will be used to hold breakout
> sessions to help let people brush up on some of the
> topics that will be covered on the exam.
>
> We are currently looking for individuals who can give 30 minute refresher
> courses on the following topics:
>
> 1. Subnetting
> 2. tar/pax/cpio
> 3. Regular Expressions
> 4. pf (however, the exam is more on recognition than hands on)
> 5. Permissions (octal, symbolic, chmod, umask, etc.)
> 6. Basic Unix tasks (cron jobs, symbolic links, simple sh scripts, etc)
> 7. rc and sysctl
>
> If you would like to volunteer, please attend the next NYCBUG meeting on June
> 4th and come and find me so we can work out the details.
>
> Thanks
>

NYCBUG has been a great platform for people who have not previously spoken publicly to start out.

The topics are general enough for those looking to take the exam, yet do not get bogged down in details. These talks, as Steve states, are short and to the point.

Looking forward to Wednesday and getting things moving for the con.

George

From nycbug-list at 2xlp.com Mon Jun 2 18:57:51 2008
From: nycbug-list at 2xlp.com (Jonathan Vanasco)
Date: Mon, 2 Jun 2008 18:57:51 -0400
Subject: [nycbug-talk] know OpenID VERY well?
In-Reply-To: <5b5090780805191713sdcfd55cw63066f3a56b08954@mail.gmail.com>
References: <5b5090780805191713sdcfd55cw63066f3a56b08954@mail.gmail.com>
Message-ID:

A few years ago I thought I spotted a security vulnerability in the design of the protocol. I've never had time to properly inspect. This is definitely an 'edge case' and caused by the implementation of OpenID, not a flaw in the protocol.

If you know the protocol very well and have an open mind, please be in touch. (Most people who know OpenID are evangelists and outright dismiss any criticism.)

If I'm right about this, we can author the paper + test case together. If I'm wrong about this, at least my nerves can be put at rest.

You need to know OpenID REALLY well to confirm/laugh at my suspicions -- it has to do with the order of events and protocol requirements, and I could be 100% off about this.

From spork at bway.net Mon Jun 2 23:04:50 2008
From: spork at bway.net (Charles Sprickman)
Date: Mon, 2 Jun 2008 23:04:50 -0400 (EDT)
Subject: [nycbug-talk] Nagios or...?
In-Reply-To: References: Message-ID: I shall top-post my reply to myself... Short story: -ZenOSS looks interesting, I may keep it around and slowly populate it with more devices and services and see if it grows on me. -Nagios 3 seems like it will be much quicker to setup and I don't have to fiddle with any custom plugins I've created (temperature sensors for the $35 sensor kits, UPS status, various snmp-y router things). I can re-use more of the existing config than I thought. Long story: I'm always a little bit leery about any open source projects that have both a free/"community" edition as well as a paid/"enterprise" edition. I understand people have to eat and all, but I'm just not comfortable with it unless the project is really well-established. ZenOSS does look good and it's more advanced than Nagios. The web-driven config might drive me batty, but I'm not sure that's the only way to configure it - the manual is huge and I've only been hunting through there briefly. ZenOSS is also terribly Linux-biased; just installing it on FreeBSD is more of a pain in the ass than is necessary. They bundle in their own dependencies (Python, mysql libs, graphics libs, rrdtool, the whole mess) and build those. To build on FreeBSD (and likely other things non-linux) you have to let the lengthy build process bomb and then google for errors and find really old posts from other *BSD users pointing out bad linker flags, including the wrong headers, etc. - these answers are generally in their own forums, but some of the answers sit there for release after release and only get integrated into FAQs but not fixed in the actual software. The monitoring also omits all *BSDs but does of course include Windows and even Solaris hosts (ie: presets for various snmp items, power management stuff, etc.). As I said, I will continue to fiddle with it and see what I come up with - it will take a long time to replicate what I should be able to do with nagios with an hour here or there. 
Nagios 3 still looks very much like Nagios 1. Same web interface, but it seems like the config model has gotten better and even easier to script. There still seems to be talk of replacing the cgi's with php at some point in the future. I like that not because I'm a huge fan of php, but because I know enough of it to be able to hack it up - something I can't do with cgi's written in C. No graphing built-in, but I'm just starting to figure out which of the add-on packages is most apropos. The only thing I'm currently after with graphing is to have something to refer to after some event - being able to look at trends is always very helpful when troubleshooting. Of course ZenOSS does have very nice graphing out of the box.

That's about it... One other thing I'd like to share is how I setup a test environment for this stuff. Everything I monitor is pretty well locked-down with host-based firewalls. I did not want to put ZenOSS nor a newer Nagios on the same host that's currently monitoring everything - cleaning up dependencies that were installed for testing, trying to get two different versions of Nagios working side-by-side, and the general confusion that could ensue was not something I wanted to deal with. I also didn't want to start changing cisco access-lists and firewall rules on a bunch of hosts to allow another host in for monitoring. Solution (yo, Ike!): Jails!

I put a jail on my monitoring host and installed both new packages there. That solved all the above problems. The jail is NAT'd, so requests from the jail appear to come from the utility box. The jail environment is clean so I can keep track of what exactly has been installed and there's no conflicts with existing software. Here's just a few snippets of the pf rules to make the NAT magic happen:

# 192.168.2.1 is the jail IP
# test jail nat
nat on $ext_if inet proto { tcp, udp, icmp } from 192.168.2.1/32 to any -> x.x.x.x # ext. IP
# two redirects to get to ZenOSS and Nagios web interfaces
rdr on $ext_if proto tcp from any to x.x.x.x port 8080 -> 192.168.2.1 port 8080
rdr on $ext_if proto tcp from any to x.x.x.x port 8090 -> 192.168.2.1 port 80
# rules to allow the redirected traffic
pass in quick on $ext_if proto tcp from to any port 8080 flags S/SA keep state
pass in quick on $ext_if proto tcp from to any port 8090 flags S/SA keep state

Quick and dirty, but it works.

Thanks again for all the input!

Charles

On Wed, 28 May 2008, Charles Sprickman wrote:

> Hi all,
>
> I've still got some old (1.x) Nagios installs that basically work, but
> have become a bit quirky. I started looking for info on upgrading and it
> seems like the easiest path they've got is to start from scratch on the
> new version. Since even that is a fair bit of work, I'm wondering what
> else is out there that's comparable.
>
> Quite some time ago I installed Zabbix and it was a good example of what I
> do not want. It was pretty much web-only config which was an extremely
> inefficient way to enter more than a handful of devices.
>
> Something that would integrate graphing of some monitored items, ability
> to export usage stats on some monitored services to billing, and some
> pre-made/clonable templates for common devices/services would be my
> pie in the sky solution. :)
>
> Thanks,
>
> Charles
>
> ___
> Charles Sprickman
> NetEng/SysAdmin
> Bway.net - New York's Best Internet - www.bway.net
> spork at bway.net - 212.655.9344
>
> _______________________________________________
> talk mailing list
> talk at lists.nycbug.org
> http://lists.nycbug.org/mailman/listinfo/talk
>

From jonathan at kc8onw.net Tue Jun 3 22:40:11 2008
From: jonathan at kc8onw.net (jonathan at kc8onw.net)
Date: Tue, 3 Jun 2008 22:40:11 -0400 (EDT)
Subject: [nycbug-talk] P2P traffic detection.
Message-ID: <63308.80.91.220.50.1212547211.squirrel@www.kc8onw.net>

Hello all,

Does anyone know of any tools to detect bittorrent and other P2P traffic other than nessus or snort? I'm looking for something fairly simple but a bit more accurate and more automated than looking at iftop output and saying "gee that's suspicious". I've tried searching but I must be using the wrong terms as I have not been able to find anything useful so far.

Jonathan Stewart

From okan at demirmen.com Wed Jun 4 18:56:45 2008
From: okan at demirmen.com (Okan Demirmen)
Date: Wed, 4 Jun 2008 18:56:45 -0400
Subject: [nycbug-talk] Breakout sessions at NYCBSDCon
In-Reply-To: <20080601232858.GA96344@scruffy.exit2shell.com>
References: <20080601232858.GA96344@scruffy.exit2shell.com>
Message-ID: <20080604225645.GA1181@clam.khaoz.org>

On Sun 2008.06.01 at 19:28 -0400, Steven Kreuzer wrote:
> Greetings-
>
> In case you are not aware, we have been working with the BSD Certification team
> to give an exam at the conference in October.
>
> We will have a classroom at the conference that will be used to hold breakout
> sessions to help let people brush up on some of the
> topics that will be covered on the exam.
>
> We are currently looking for individuals who can give 30 minute refresher
> courses on the following topics:
>
> 1. Subnetting
> 2. tar/pax/cpio
> 3. Regular Expressions
> 4. pf (however, the exam is more on recognition than hands on)
> 5. Permissions (octal, symbolic, chmod, umask, etc.)
> 6. Basic Unix tasks (cron jobs, symbolic links, simple sh scripts, etc)
> 7. rc and sysctl
>
> If you would like to volunteer, please attend the next NYCBUG meeting on June
> 4th and come and find me so we can work out the details.

after (during actually) this evening's meeting, we have:

bsd cert tutorials
------------------
1. subnetting - mark
2. tar/pax/cpio - ?
3. regular expressions - ?
4. pf (however, the exam is more on recognition than hands on) - nikolai
5. permissions (octal, symbolic, chmod, umask, etc.) - ?
6. basic unix tasks (cron jobs, symbolic links, simple sh scripts, etc) - mike
7. rc and sysctl - ?
8. user administration - ivan

feel free to volunteer ;)

cheers,
okan

From nycbug at cyth.net Wed Jun 4 19:05:42 2008
From: nycbug at cyth.net (Ray Lai)
Date: Wed, 4 Jun 2008 19:05:42 -0400
Subject: [nycbug-talk] Breakout sessions at NYCBSDCon
In-Reply-To: <20080604225645.GA1181@clam.khaoz.org>
References: <20080601232858.GA96344@scruffy.exit2shell.com> <20080604225645.GA1181@clam.khaoz.org>
Message-ID: <7765c0380806041605x3dfc83b4mbb5d12477fb58da7@mail.gmail.com>

On Wed, Jun 4, 2008 at 6:56 PM, Okan Demirmen wrote:
> On Sun 2008.06.01 at 19:28 -0400, Steven Kreuzer wrote:
>> Greetings-
>>
>> In case you are not aware, we have been working with the BSD Certification team
>> to give an exam at the conference in October.
>>
>> We will have a classroom at the conference that will be used to hold breakout
>> sessions to help let people brush up on some of the
>> topics that will be covered on the exam.
>>
>> We are currently looking for individuals who can give 30 minute refresher
>> courses on the following topics:
>>
>> 1. Subnetting
>> 2. tar/pax/cpio
>> 3. Regular Expressions
>> 4. pf (however, the exam is more on recognition than hands on)
>> 5. Permissions (octal, symbolic, chmod, umask, etc.)
>> 6. Basic Unix tasks (cron jobs, symbolic links, simple sh scripts, etc)
>> 7. rc and sysctl
>>
>> If you would like to volunteer, please attend the next NYCBUG meeting on June
>> 4th and come and find me so we can work out the details.
>
> after (during actually) this evening's meeting, we have:
>
> bsd cert tutorials
> ------------------
> 1. subnetting - mark
> 2. tar/pax/cpio - ?
> 3. regular expressions - ?
> 4. pf (however, the exam is more on recognition than hands on) - nikolai
> 5. permissions (octal, symbolic, chmod, umask, etc.) - ?
> 6.
basic unix tasks (cron jobs, symbolic links, simple sh scripts, etc) - mike > 7. rc and sysctl - ? > 8. user administration - ivan Damn, forgot about the meeting... um, I can do regexp and permissions, but how deep do you need to go for regexp? Nothing too fancy, I hope. -Ray- From okan at demirmen.com Wed Jun 4 19:23:01 2008 From: okan at demirmen.com (Okan Demirmen) Date: Wed, 4 Jun 2008 19:23:01 -0400 Subject: [nycbug-talk] Breakout sessions at NYCBSDCon In-Reply-To: <7765c0380806041605x3dfc83b4mbb5d12477fb58da7@mail.gmail.com> References: <20080601232858.GA96344@scruffy.exit2shell.com> <20080604225645.GA1181@clam.khaoz.org> <7765c0380806041605x3dfc83b4mbb5d12477fb58da7@mail.gmail.com> Message-ID: <20080604232300.GU21345@clam.khaoz.org> On Wed 2008.06.04 at 19:05 -0400, Ray Lai wrote: > > bsd cert tutorials > > ------------------ > > 1. subnetting - mark > > 2. tar/pax/cpio - ? > > 3. regular expressions - ? > > 4. pf (however, the exam is more on recognition than hands on) - nikolai > > 5. permissions (octal, symbolic, chmod, umask, etc.) - ? > > 6. basic unix tasks (cron jobs, symbolic links, simple sh scripts, etc) - mike > > 7. rc and sysctl - ? > > 8. user administration - ivan > > Damn, forgot about the meeting... um, I can do regexp and permissions, > but how deep do you need to go for regexp? Nothing too fancy, I hope. ok, cool - regex; nothing fancy ;) From bcully at gmail.com Wed Jun 4 19:24:02 2008 From: bcully at gmail.com (Brian Cully) Date: Wed, 4 Jun 2008 19:24:02 -0400 Subject: [nycbug-talk] Breakout sessions at NYCBSDCon In-Reply-To: <7765c0380806041605x3dfc83b4mbb5d12477fb58da7@mail.gmail.com> References: <20080601232858.GA96344@scruffy.exit2shell.com> <20080604225645.GA1181@clam.khaoz.org> <7765c0380806041605x3dfc83b4mbb5d12477fb58da7@mail.gmail.com> Message-ID: <64DCD35B-44DC-42AE-B7F8-B822462D261D@gmail.com> Honestly, for most people and purposes PCRE should be fine. 
Cover backtracking, greed, and references and you've hit why regexp is useful.

-bjc

On Jun 4, 2008, at 19:05, "Ray Lai" wrote:

> On Wed, Jun 4, 2008 at 6:56 PM, Okan Demirmen wrote:
>> On Sun 2008.06.01 at 19:28 -0400, Steven Kreuzer wrote:
>>> Greetings-
>>>
>>> In case you are not aware, we have been working with the BSD Certification team
>>> to give an exam at the conference in October.
>>>
>>> We will have a classroom at the conference that will be used to hold breakout
>>> sessions to help let people brush up on some of the
>>> topics that will be covered on the exam.
>>>
>>> We are currently looking for individuals who can give 30 minute refresher
>>> courses on the following topics:
>>>
>>> 1. Subnetting
>>> 2. tar/pax/cpio
>>> 3. Regular Expressions
>>> 4. pf (however, the exam is more on recognition than hands on)
>>> 5. Permissions (octal, symbolic, chmod, umask, etc.)
>>> 6. Basic Unix tasks (cron jobs, symbolic links, simple sh scripts, etc)
>>> 7. rc and sysctl
>>>
>>> If you would like to volunteer, please attend the next NYCBUG meeting on June
>>> 4th and come and find me so we can work out the details.
>>
>> after (during actually) this evening's meeting, we have:
>>
>> bsd cert tutorials
>> ------------------
>> 1. subnetting - mark
>> 2. tar/pax/cpio - ?
>> 3. regular expressions - ?
>> 4. pf (however, the exam is more on recognition than hands on) - nikolai
>> 5. permissions (octal, symbolic, chmod, umask, etc.) - ?
>> 6. basic unix tasks (cron jobs, symbolic links, simple sh scripts, etc) - mike
>> 7. rc and sysctl - ?
>> 8. user administration - ivan
>
> Damn, forgot about the meeting... um, I can do regexp and permissions,
> but how deep do you need to go for regexp? Nothing too fancy, I hope.
>
> -Ray-
> _______________________________________________
> talk mailing list
> talk at lists.nycbug.org
> http://lists.nycbug.org/mailman/listinfo/talk

From contact at doostang.com Thu Jun 19 02:01:21 2008
From: contact at doostang.com (Huy Ton-That)
Date: Thu, 19 Jun 2008 06:01:21 +0000
Subject: [nycbug-talk] I've added you as a friend on Doostang
Message-ID: <4859f631220fe_1356155555588f0816728926@70955-30.70955.com.tmail>

Hi,

I've requested to add you as a friend on Doostang, an invite-only career community started at Harvard, Stanford, and MIT. You can use Doostang to find a job or internship, network, and access valuable career information from peers and industry professionals.

Regards,
Huy

To accept this invitation, please visit:
http://www.doostang.com/s/u?s=bzLsmLY8KW

-----------------------------------------------------------
If you don't want to receive future invitations or emails from Doostang, click here:
http://www.doostang.com/logins/noemail?arg=fb81ec55a27f8ac12f85b91a401d88e2f514effd

From bonsaime at gmail.com Thu Jun 19 19:08:10 2008
From: bonsaime at gmail.com (Jesse Callaway)
Date: Thu, 19 Jun 2008 19:08:10 -0400
Subject: [nycbug-talk] I've added you as a friend on Doostang
In-Reply-To: <4859f631220fe_1356155555588f0816728926@70955-30.70955.com.tmail>
References: <4859f631220fe_1356155555588f0816728926@70955-30.70955.com.tmail>
Message-ID:

Hmmm... seems like we need some Dooletpaper

On Thu, Jun 19, 2008 at 2:01 AM, Huy Ton-That wrote:
> Hi,
>
> I've requested to add you as a friend on Doostang, an invite-only career community started at Harvard, Stanford, and MIT. You can use Doostang to find a job or internship, network, and access valuable career information from peers and industry professionals.
>
> Regards,
> Huy
>
> To accept this invitation, please visit:
> http://www.doostang.com/s/u?s=bzLsmLY8KW
>
> -----------------------------------------------------------
> If you don't want to receive future invitations or emails from Doostang, click here:
> http://www.doostang.com/logins/noemail?arg=fb81ec55a27f8ac12f85b91a401d88e2f514effd
> _______________________________________________
> talk mailing list
> talk at lists.nycbug.org
> http://lists.nycbug.org/mailman/listinfo/talk
>
>

From marco at metm.org Thu Jun 19 20:53:35 2008
From: marco at metm.org (Marco Scoffier)
Date: Thu, 19 Jun 2008 20:53:35 -0400
Subject: [nycbug-talk] need help postfix, pf, jails, freebsd
Message-ID: <485AFF8F.9040605@metm.org>

Hi everyone,

I need help. I have been swamped lately and I feel that I don't have the time to read up and do this properly, so I would like to hire someone for a few hours to help me move a mail server and dns for several domains to a new server.

The mailserver is postfix and I want to move it to a jail (FreeBSD) which is mounted on a local 127.0.0.255 address. The idea being that all routing to the jails is handled by pf on the jailhost. I would like to have imaps routed to the jail also (dovecot) and smtp-auth routed to this jail also.

I have the jails set up and the pf routing working for other services. I have mail working using the IP address. I've been meaning to move the mailserver for a while, but I keep hesitating before switching the MX records for the domains involved. I'm mostly worried about losing mail if something goes wrong.

If someone has been through this, knows about the technologies involved, and has a sane plan of attack, I would love to hire you for a few hours to sit down together and look over my shoulder while we get this done.

I usually don't get cold feet when trying new things and changing stuff around, but I've been a bit too swamped to deal lately so I thought I would ask for help.
Thanks, -- Marco From chsnyder at gmail.com Fri Jun 20 07:08:15 2008 From: chsnyder at gmail.com (csnyder) Date: Fri, 20 Jun 2008 07:08:15 -0400 Subject: [nycbug-talk] I've added you as a friend on Doostang In-Reply-To: References: <4859f631220fe_1356155555588f0816728926@70955-30.70955.com.tmail> Message-ID: On Thu, Jun 19, 2008 at 7:08 PM, Jesse Callaway wrote: > Hmmm... seems like we need some Dooletpaper > ROTFL, yo. Doostang is right up there with expertsexchange for brilliant naming. From george at ceetonetechnology.com Fri Jun 20 21:25:43 2008 From: george at ceetonetechnology.com (George Rosamond) Date: Fri, 20 Jun 2008 21:25:43 -0400 Subject: [nycbug-talk] I've added you as a friend on Doostang In-Reply-To: References: <4859f631220fe_1356155555588f0816728926@70955-30.70955.com.tmail> Message-ID: <485C5897.7090503@ceetonetechnology.com> csnyder wrote: > On Thu, Jun 19, 2008 at 7:08 PM, Jesse Callaway wrote: >> Hmmm... seems like we need some Dooletpaper >> > > ROTFL, yo. Doostang is right up there with expertsexchange for brilliant naming. I also complimented JC offlist on that. . . I think it may be post of the year. George From spork at bway.net Tue Jun 24 02:08:48 2008 From: spork at bway.net (Charles Sprickman) Date: Tue, 24 Jun 2008 02:08:48 -0400 Subject: [nycbug-talk] Nagios or...? In-Reply-To: References: Message-ID: <1BD35CD1-31B0-4B36-9277-64B0FE820D32@bway.net> Talking to myself yet again (I told all this to the dog, but he showed a distinct lack of interest in the subject) in a top-post, I'll share a bit more info since I just cut over to Nagios 3.0.2 tonight. Moving the configs was actually quite simple. Looking at what was new and cleaning up the old configs to make everything clearer and to split config items into more appropriately-named object files was time- consuming, but well worth the work since the setup is much easier for me or anyone else looking at it to follow. 
The embedded perl interpreter and some of my plugins still don't get along, but I can live with that. ZenOSS is still in the jail, but I have not touched it in weeks. One unexpected bonus that ZenOSS motivated me to find was a graphing add-on. I like the idea of finding some service failing and being able to jump right to a number of graphs for that host - load, service response times, etc. When troubleshooting it's nice to have as much data as possible. After digging around a bit and even trying nagiosgraph for a bit, I installed "PNP4Nagios". It is amazing, works as promised, and getting all the basics going is very easy compared to the other graphing add-ons I found. I highly recommend anyone running Nagios 2.x or 3.x have a look at it. The one big hint is to replace "check_ping" with "check_icmp" in your commands.cfg. The latter gives performance data output which is necessary for pnp4nagios. Here's the PNP4Nagios site: http://www.pnp4nagios.org/pnp/start Now it's time to evaluate some of my other plugins and see what's new and exciting at nagiosexchange... Charles On Jun 2, 2008, at 11:04 PM, Charles Sprickman wrote: > I shall top-post my reply to myself... > > Short story: > > -ZenOSS looks interesting, I may keep it around and slowly populate > it with more devices and services and see if it grows on me. > -Nagios 3 seems like it will be much quicker to setup and I don't > have to fiddle with any custom plugins I've created (temperature > sensors for the $35 sensor kits, UPS status, various snmp-y router > things). I can re-use more of the existing config than I thought. > > Long story: > > I'm always a little bit leery about any open source projects that > have both a free/"community" edition as well as a paid/"enterprise" > edition. I understand people have to eat and all, but I'm just not > comfortable with it unless the project is really well-established. > ZenOSS does look good and it's more advanced than Nagios. 
The web- > driven config might drive me batty, but I'm not sure that's the only > way to configure it - the manual is huge and I've only been hunting > through there briefly. > > ZenOSS is also terribly Linux-biased; just installing it on FreeBSD > is more of a pain in the ass than is necessary. They bundle in > their own dependencies (Python, mysql libs, graphics libs, rrdtool, > the whole mess) and build those. To build on FreeBSD (and likely > other things non-linux) you have to let the lengthy build process > bomb and then google for errors and find really old posts from other > *BSD users pointing out bad linker flags, including the wrong > headers, etc. - these answers are generally in their own forums, but > some of the answers sit there for release after release and only get > integrated into FAQs but not fixed in the actual software. The > monitoring also omits all *BSDs but does of course include Windows > and even Solaris hosts (ie: presets for various snmp items, power > management stuff, etc.). > > As I said, I will continue to fiddle with it and see what I come up > with - it will take a long time to replicate what I should be able > to do with nagios with an hour here or there. > > Nagios 3 still looks very much like Nagios 1. Same web interface, > but it seems like the config model has gotten better and even easier > to script. There still seems to be talk of replacing the cgi's with > php at some point in the future. I like that not because I'm a huge > fan of php, but because I know enough of it to be able to hack it up > - something I can't do with cgi's written in C. No graphing built- > in, but I'm just starting to figure out which of the add-on packages > is most apropos. The only thing I'm currently after with graphing > is to have something to refer to after some event - being able to > look at trends is always very helpful when troubleshooting. Of > course ZenOSS does have very nice graphing out of the box. > > That's about it... 
One other thing I'd like to share is how I setup > a test environment for this stuff. Everything I monitor is pretty > well locked-down with host-based firewalls. I did not want to put > ZenOSS nor a newer Nagios on the same host that's currently > monitoring everything - cleaning up dependencies that were installed > for testing, trying to get two different versions of Nagios working > side-by-side, and the general confusion that could ensue was not > something I wanted to deal with. I also didn't want to start > changing cisco access-lists and firewall rules on a bunch of hosts > to allow another host in for monitoring. Solution (yo, Ike!): Jails! > > I put a jail on my monitoring host and installed both new packages > there. That solved all the above problems. The jail is NAT'd, so > requests from the jail appear to come from the utility box. The > jail environment is clean so I can keep track of what exactly has > been installed and there's no conflicts with existing software. > Here's just a few snippets of the pf rules to make the NAT magic > happen: > > # 192.168.2.1 is the jail IP > # test jail nat > nat on $ext_if inet proto { tcp, udp, icmp } from 192.168.2.1/32 to > any -> x.x.x.x # ext. IP > # two redirects to get to ZenOSS and Nagios web interfaces > rdr on $ext_if proto tcp from any to x.x.x.x port 8080 -> > 192.168.2.1 port 8080 > rdr on $ext_if proto tcp from any to x.x.x.x port 8090 -> > 192.168.2.1 port 80 > # rules to allow the redirected traffic > pass in quick on $ext_if proto tcp from to any port 8080 > flags S/SA keep state > pass in quick on $ext_if proto tcp from to any port 8090 > flags S/SA keep state > > Quick and dirty, but it works. > > Thanks again for all the input! > > Charles > > On Wed, 28 May 2008, Charles Sprickman wrote: > >> Hi all, >> >> I've still got some old (1.x) Nagios installs that basically work, >> but >> have become a bit quirky. 
I started looking for info on upgrading >> and it >> seems like the easiest path they've got is to start from scratch on >> the >> new version. Since even that is a fair bit of work, I'm wondering >> what >> else is out there that's comparable. >> >> Quite some time ago I installed Zabbix and it was a good example of >> what I >> do not want. It was pretty much web-only config which was an >> extremely >> inefficient way to enter more than a handful of devices. >> >> Something that would integrate graphing of some monitored items, >> ability >> to export usage stats on some monitored services to billing, and some >> pre-made/clonable templates for common devices/services would be my >> pie in the sky solution. :) >> >> Thanks, >> >> Charles >> >> ___ >> Charles Sprickman >> NetEng/SysAdmin >> Bway.net - New York's Best Internet - www.bway.net >> spork at bway.net - 212.655.9344 >> >> _______________________________________________ >> talk mailing list >> talk at lists.nycbug.org >> http://lists.nycbug.org/mailman/listinfo/talk >> From scottro at nyc.rr.com Tue Jun 24 02:25:26 2008 From: scottro at nyc.rr.com (Scott Robbins) Date: Tue, 24 Jun 2008 02:25:26 -0400 Subject: [nycbug-talk] Nagios Message-ID: <20080624062526.GA14937@mail.scottro.net> Charles, please keep posting about this. You're not really talking to yourself. We recently set up Nagios and Cacti (though not in conjunction--somewhat different purposes for different audiences.) I had known nothing about them, my boss said learn them, and gave me a couple of months to do it. However, he might decide that we want to start graphing Nagios results, and if so, these things you mention will be very handy. So, don't think you're talking to yourself. (Tell the dog we all said, Hi.) We're running version 2.x on CentOS, but all this is still quite relevant to me, and I'm grateful for it. (Sorry to have broken header threading--while trying to clean up my saved folder I accidentally deleted Charles' posts. 
I shouldn't do these things so late at night.) -- Scott Robbins PGP keyID EB3467D6 ( 1B48 077D 66F6 9DB0 FDC2 A409 FA54 EB34 67D6 ) gpg --keyserver pgp.mit.edu --recv-keys EB3467D6 Giles: What ever happened to Latin? At least when that made no sense, the church approved. From cba at groundworkopensource.com Tue Jun 24 16:28:14 2008 From: cba at groundworkopensource.com (Chris B. Anderson) Date: Tue, 24 Jun 2008 13:28:14 -0700 Subject: [nycbug-talk] Nagios or...? In-Reply-To: Message-ID: "...Something that would integrate graphing of some monitored items, ability to export usage stats on some monitored services to billing, and some pre-made/clonable templates for common devices/services would be my pie in the sky solution. :)" Have you looked at GroundWork Monitor? It's got Nagios, RRDtool, Nmap, etc. built right in, and has a framework layer for easy integration of other tools, open source or otherwise. http://www.groundworkopensource.com/community/ -----Original Message----- From: talk-bounces at lists.nycbug.org [mailto:talk-bounces at lists.nycbug.org] On Behalf Of Charles Sprickman Sent: Tuesday, May 27, 2008 11:19 PM To: talk at lists.nycbug.org Subject: [nycbug-talk] Nagios or...? Hi all, I've still got some old (1.x) Nagios installs that basically work, but have become a bit quirky. I started looking for info on upgrading and it seems like the easiest path they've got is to start from scratch on the new version. Since even that is a fair bit of work, I'm wondering what else is out there that's comparable. Quite some time ago I installed Zabbix and it was a good example of what I do not want. It was pretty much web-only config which was an extremely inefficient way to enter more than a handful of devices. Something that would integrate graphing of some monitored items, ability to export usage stats on some monitored services to billing, and some pre-made/clonable templates for common devices/services would be my pie in the sky solution. 
:) Thanks, Charles ___ Charles Sprickman NetEng/SysAdmin Bway.net - New York's Best Internet - www.bway.net spork at bway.net - 212.655.9344 _______________________________________________ talk mailing list talk at lists.nycbug.org http://lists.nycbug.org/mailman/listinfo/talk From kacanski_s at yahoo.com Tue Jun 24 19:49:15 2008 From: kacanski_s at yahoo.com (Aleksandar Kacanski) Date: Tue, 24 Jun 2008 16:49:15 -0700 (PDT) Subject: [nycbug-talk] Nagios or...? Message-ID: <942969.22434.qm@web53603.mail.re2.yahoo.com> ----- Original Message ---- From: Charles Sprickman To: talk at lists.nycbug.org Sent: Tuesday, June 24, 2008 2:08:48 AM Subject: Re: [nycbug-talk] Nagios or...? Talking to myself yet again (I told all this to the dog, but he showed a distinct lack of interest in the subject) in a top-post, I'll share a bit more info since I just cut over to Nagios 3.0.2 tonight. Moving the configs was actually quite simple. Looking at what was new and cleaning up the old configs to make everything clearer and to split config items into more appropriately-named object files was time- consuming, but well worth the work since the setup is much easier for me or anyone else looking at it to follow. The embedded perl interpreter and some of my plugins still don't get along, but I can live with that. ZenOSS is still in the jail, but I have not touched it in weeks. One unexpected bonus that ZenOSS motivated me to find was a graphing add-on. I like the idea of finding some service failing and being able to jump right to a number of graphs for that host - load, service response times, etc. When troubleshooting it's nice to have as much data as possible. After digging around a bit and even trying nagiosgraph for a bit, I installed "PNP4Nagios". It is amazing, works as promised, and getting all the basics going is very easy compared to the other graphing add-ons I found. I highly recommend anyone running Nagios 2.x or 3.x have a look at it. 
The one big hint is to replace "check_ping" with "check_icmp" in your
commands.cfg. The latter gives performance data output, which is necessary
for pnp4nagios.

Here's the PNP4Nagios site: http://www.pnp4nagios.org/pnp/start

Now it's time to evaluate some of my other plugins and see what's new and
exciting at nagiosexchange...

Charles

-----------

What is the possibility of some paid work building an integration panel in
Python to combine and correlate information from SNMP tools like Cacti
(excellent stuff), Nagios 3, Snort ...

--sasha
.

From skreuzer at exit2shell.com Wed Jun 25 11:36:59 2008
From: skreuzer at exit2shell.com (Steven Kreuzer)
Date: Wed, 25 Jun 2008 11:36:59 -0400
Subject: [nycbug-talk] Some BSD Videos
Message-ID: <20080625153659.GA67250@slurry.exit2shell.com>

Thought I would pass this along:

Howard Green, from The Business News, interviews Theo de Raadt.
http://youtube.com/watch?v=W9fQa00CB9U

Our hero Ike, at ShmooCon 2006, talking about jails
http://www.shmoocon.org/2006/videos/Ike-Jail.mp4

--
Steven Kreuzer
http://www.exit2shell.com/~skreuzer

From matt at thehour.com Wed Jun 25 11:21:14 2008
From: matt at thehour.com (Matt Terenzio)
Date: Wed, 25 Jun 2008 11:21:14 -0400
Subject: [nycbug-talk] Help mounting slave drive to restore
Message-ID:

Hi all,

I recently made a stupid typing mistake:

mv /usr/local/www/dir /* .

which, it looks like, put the /bin directory, the COPYRIGHT file and the
entropy file into the directory I was in.

Then, even stupider, I logged out.

Okay, so the hosting provider set the old hard drive up as a slave and now I
try to mount it, as root:

First I made a mount point:

mkdir old

Then try to mount:

mount -t ufs | /dev/ad0s** /dev/old

And I get:

/dev/ad0s1: Permission denied.

Why is that?

Thanks,

Matt
From carton at Ivy.NET Wed Jun 25 12:11:29 2008
From: carton at Ivy.NET (Miles Nordin)
Date: Wed, 25 Jun 2008 12:11:29 -0400
Subject: [nycbug-talk] Help mounting slave drive to restore
In-Reply-To: (Matt Terenzio's message of "Wed, 25 Jun 2008 11:21:14 -0400")
References:
Message-ID:

>>>>> "mt" == Matt Terenzio writes:

    mt> Then try to mount:
    mt> mount -t ufs | /dev/ad0s** /dev/old
    mt> /dev/ad0s1: Permission denied.
    mt> Why is that?

well, it's not executable.

# chmod u+x /dev/ad0s1

but, I don't understand how you could think a command like that is
appropriate, which makes me scared for your data. I think you should
stop, think about what you're trying to accomplish, and talk to some
people before you do anything like this ever again.

From matt at thehour.com Wed Jun 25 12:26:32 2008
From: matt at thehour.com (Matt MT. Terenzio)
Date: Wed, 25 Jun 2008 12:26:32 -0400
Subject: [nycbug-talk] Help mounting slave drive to restore
In-Reply-To:
Message-ID:

> mt> Then try to mount:
> mt> mount -t ufs | /dev/ad0s** /dev/old
> mt> /dev/ad0s1: Permission denied.
> mt> Why is that?
>
> well, it's not executable.
>
> # chmod u+x /dev/ad0s1
>
> but, I don't understand how you could think a command like that is
> appropriate, which makes me scared for your data. I think you should
> stop, think about what you're trying to accomplish, and talk to some
> people before you do anything like this ever again.

My initial idea was to get back onto the old hard drive, reverse the mv
command that moved the bin directory, then reboot with the old hard drive as
master again and hope it was working again.

However, at this point I'll even accept just getting a copy of a few of the
directories, like my database and web files. I gather this is not the way to
do that.
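For readers of the archive, the mount step discussed in this thread can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a command anyone on the list posted: the device node /dev/ad1s1a and the mount point /old are hypothetical placeholders (the real node has to be checked on the machine, for example by listing /dev/ad*), and the script only prints the commands unless RUN=1 is set. It mounts read-only, on a directory rather than on a path under /dev, and without the stray pipe.

```shell
#!/bin/sh
# Sketch only: DEV and MNT are hypothetical defaults, not values from the thread.
# Confirm the real device node first (for example: ls /dev/ad*).

# Print a command; execute it only when RUN=1 is set in the environment.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Mount the old drive read-only on a directory (not on a path under /dev).
recover_plan() {
    dev="${DEV:-/dev/ad1s1a}"   # assumed: root partition on the old (slave) drive
    mnt="${MNT:-/old}"          # assumed: mount point directory
    run mkdir -p "$mnt"
    run mount -t ufs -o ro "$dev" "$mnt"
}

recover_plan
```

With the old filesystem mounted read-only, the database and web directories could then be copied off before anything on the old disk is changed.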
From matt at thehour.com Wed Jun 25 12:42:58 2008 From: matt at thehour.com (Matt MT. Terenzio) Date: Wed, 25 Jun 2008 12:42:58 -0400 Subject: [nycbug-talk] Help mounting slave drive to restore In-Reply-To: Message-ID: > My initial idea was to get back on to the old hard-drive reverse the mv > command that moved the bin directory, then reboot with the old hard-drive as > master again and hope it was working again. > I think all I really needed was for the hosting company to boot up with a Live CD so that I could get in and fix the problem. . . From tekronis at gmail.com Tue Jun 24 18:47:21 2008 From: tekronis at gmail.com (H. G.) Date: Tue, 24 Jun 2008 18:47:21 -0400 Subject: [nycbug-talk] Prelude IDS Message-ID: <60131f920806241547h5046dfefsc397a5c01651585b@mail.gmail.com> Has anyone given this one a shot? Gotten it to play nice with Nagios? -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt at thehour.com Wed Jun 25 14:08:06 2008 From: matt at thehour.com (Matt MT. Terenzio) Date: Wed, 25 Jun 2008 14:08:06 -0400 Subject: [nycbug-talk] Xen Support for EC2 Message-ID: Anyone heard anything about progress with XEN support for the purpose of creating Amazon EC2 instances? I know I had heard that a key developer was a little too busy to do any heavy work on it, but that was months ago. From matt at thehour.com Wed Jun 25 20:26:36 2008 From: matt at thehour.com (Matt Terenzio) Date: Wed, 25 Jun 2008 20:26:36 -0400 Subject: [nycbug-talk] Help mounting slave drive to restore References: <60131f920806251702g692c8700x215e2f6ac6b34d39@mail.gmail.com> Message-ID: <33C9E6D4AA2E9741B021194BBF4AF40D70C587@thehourexchange.thehour.com> >H.G. said ...is that a pipe ( " | " ) I see in there? Is that intentional, or a typo...? I've been made aware that the command is insane. I didn't make it up. Just followed and possibly(probably?) misinterpreted some bad advice. I'm a fool. Not for not knowing, but for blindly executing without good advice. 
; )

One interesting side note: I've been running FreeBSD servers for years and
years for web-hosting projects and never had occasion to use the mount
command. My whole universe is a unix-like filesystem on servers I never
physically touch, so why would I, except in a catastrophe like this?

FreeBSD has made it that easy!

cd /usr/ports/. . .
make install

Either that, or I've had incredible luck.

(BTW, it hasn't always been that easy, just saying that I'm no SysAdmin but I
manage to get along)

And, if I move to a virtual environment, like economics is pushing me toward,
I'll have even fewer concerns about hardware.

But I'll still be a fool. . .

From tekronis at gmail.com Wed Jun 25 20:02:44 2008
From: tekronis at gmail.com (H. G.)
Date: Wed, 25 Jun 2008 20:02:44 -0400
Subject: [nycbug-talk] Help mounting slave drive to restore
In-Reply-To:
References:
Message-ID: <60131f920806251702g692c8700x215e2f6ac6b34d39@mail.gmail.com>

On Wed, Jun 25, 2008 at 11:21 AM, Matt Terenzio wrote:

> Hi all,
>
> I recently made a stupid typing mistake:
>
> mv /usr/local/www/dir /* .
>
> Which it looks like put the /bin directory, the COPYRIGHT file and the
> entropy file into the directory I was in.
>
> Then, even stupider, logged out.
>
> Okay so, the hosting provider set the old hard-drive up as a slave and now
> I try to mount it, as root:
>
> First I made a mount point:
>
> mkdir old
>
> Then try to mount:
>
> mount -t ufs | /dev/ad0s** /dev/old
>
> .....

...is that a pipe ( " | " ) I see in there? Is that intentional, or a
typo...?
URL: From matt at thehour.com Wed Jun 25 20:46:43 2008 From: matt at thehour.com (Matt Terenzio) Date: Wed, 25 Jun 2008 20:46:43 -0400 Subject: [nycbug-talk] Help mounting slave drive to restore References: <60131f920806251702g692c8700x215e2f6ac6b34d39@mail.gmail.com> Message-ID: <33C9E6D4AA2E9741B021194BBF4AF40D70C588@thehourexchange.thehour.com> > > mount -t ufs | /dev/ad0s** /dev/old > > ..... ...is that a pipe ( " | " ) I see in there? Is that intentional, or a typo...? Just for the record, can anyone affirm how to correctly mount a slave HDD? Would: mount /dev/ad0s1 /old work, assuming I've created a directory /old? -------------- next part -------------- An HTML attachment was scrubbed... URL: From pete at nomadlogic.org Wed Jun 25 21:09:22 2008 From: pete at nomadlogic.org (pete) Date: Wed, 25 Jun 2008 21:09:22 -0400 Subject: [nycbug-talk] Help mounting slave drive to restore In-Reply-To: <33C9E6D4AA2E9741B021194BBF4AF40D70C588@thehourexchange.thehour.com> References: <60131f920806251702g692c8700x215e2f6ac6b34d39@mail.gmail.com> <33C9E6D4AA2E9741B021194BBF4AF40D70C588@thehourexchange.thehour.com> Message-ID: On Wed, 25 Jun 2008 20:46:43 -0400, "Matt Terenzio" wrote: > >> >> mount -t ufs | /dev/ad0s** /dev/old >> >> ..... > > ...is that a pipe ( " | " ) I see in there? Is that intentional, or a > typo...? > > > Just for the record, can anyone affirm how to correctly mount a slave HDD? > > Would: > > mount /dev/ad0s1 /old > > work, assuming I've created a directory /old? matt - FreeBSD has excellent documentation, I'd really suggest reading the FreeBSD handbook as well as the "mount" man pages. 
here's something to get you started:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mount-unmount.html

-pete

--
Pete Wright
pete at nomadlogic.org
310.869.9459

From andy.kosela at gmail.com Thu Jun 26 02:37:03 2008
From: andy.kosela at gmail.com (Andy Kosela)
Date: Thu, 26 Jun 2008 08:37:03 +0200
Subject: [nycbug-talk] Some BSD Videos
In-Reply-To: <20080625153659.GA67250@slurry.exit2shell.com>
References: <20080625153659.GA67250@slurry.exit2shell.com>
Message-ID: <3cc535c80806252337k3e5e0b79l74e3f61edf1385cc@mail.gmail.com>

On Wed, Jun 25, 2008 at 5:36 PM, Steven Kreuzer wrote:
> Thought I would pass this along:
>
> Howard Green, from The Business News, interviews Theo de Raadt.
> http://youtube.com/watch?v=W9fQa00CB9U
>
> Our hero Ike, at ShmooCon 2006, talking about jails
> http://www.shmoocon.org/2006/videos/Ike-Jail.mp4

and here is also something interesting:

http://video.google.com/videoplay?docid=4820358584118058355&hl=en

A lil bit of history of UNIX.. notice the young Bill Joy there.

--
Andy Kosela
ora et labora

From mikel.king at olivent.com Thu Jun 26 11:01:12 2008
From: mikel.king at olivent.com (Mikel King)
Date: Thu, 26 Jun 2008 11:01:12 -0400
Subject: [nycbug-talk] mail server product info
Message-ID: <56B282DC-01BF-46F7-A50D-D42889893B4F@olivent.com>

Hi all. I have been tasked by a client to investigate mail server
products, and discover if it is possible to have a specific footer
appended automatically to all of their outbound email. I know that
mailman and other mailing list systems have something along the lines
of what they are requesting, but I am wondering if it is possible for
something like postfix to do this with some special configuration. Or
would this be a roll-your-own kind of solution?

To make matters worse, this client has several distinct domain names
and they would like the footer to be customized and unique for each
domain.

Any thoughts, comments and ideas would be greatly appreciated.
Regards, Mikel From george at ceetonetechnology.com Thu Jun 26 11:15:01 2008 From: george at ceetonetechnology.com (George Rosamond) Date: Thu, 26 Jun 2008 11:15:01 -0400 Subject: [nycbug-talk] mail server product info In-Reply-To: <56B282DC-01BF-46F7-A50D-D42889893B4F@olivent.com> References: <56B282DC-01BF-46F7-A50D-D42889893B4F@olivent.com> Message-ID: <4863B275.6060905@ceetonetechnology.com> Mikel King wrote: > Hi all. I have been tasked by a client to investigate mail server > products, and discover if it is possible to have a specific footer > appended automatically to all of their out bound email. I know that > mailman and other mailing list systems have something along the lines > that they are requesting but I am am wondering if it is possible for > something like postfix to do this with some special configuration. Or > would this be a roll your own kind of solution? > > To make matters worse this client has several distinct domain names > and they would like the footer to be customized and unique for each > domain. > > Any thoughts, comments and ideas would be greatly appreciated. > I don't have time to look this up. .. but I've done it. . . From what I remember, you have to set a different local port in the /usr/local/etc/postfix/main.cf for SMTP. Search for postfix and disclaimer . . . lots of stuff. George From ike at lesmuug.org Sun Jun 29 19:11:36 2008 From: ike at lesmuug.org (Isaac Levy) Date: Sun, 29 Jun 2008 19:11:36 -0400 Subject: [nycbug-talk] ZFS and firewire - conditions for a perfect storm Message-ID: <55CF45DB-B144-45B9-8AF2-AD12D479B1E7@lesmuug.org> Hi All, This story could have been avoided, but... I had a fairly 'perfect storm' of data loss at home, thought I'd share with the list. I figure perhaps someone here could suggest a plan of action to fix this long-term? If you don't care about firewire, feel free to skip this message- I've replicated *everything below* with a SATA controller and ZFS/RAID-Z works as expected, flawlessly. 
(ZFS is really astounding on FreeBSD!)

--
I've been a big fan (and little user) of ZFS for about a year now,
it's of course excellent. With that, I built a *very* cheap 2nd file
server from gear I had,

- A mini-PC (no PCI slots, almost laptop specs)
- 4 Firewire drives (long-time mac user, have the cases)
- daisy-chained setup, no Firewire hub
- ZFS RAID-Z (FreeBSD 7, was running HEAD now REL)

This rig worked so well, that it quickly became my primary SMB
workhorse. A mac with an apple software raid became the backup
system, (rsync over SSH rocks for me). The firewire drive cases are
metal, and act as a heat sink, so no noisy fans for home use :)

The firewire drives seem to have issues hanging up ZFS if one fails,
but I figured *what the heck, it's got a backup on an altogether
different filesystem*, and at least I don't ever have to fsck 4TB
volumes! Whee! This has been a very productive tag-team for my uses-
until now...

--
The perfect storm:

I needed to pull 2 drives from the mac, therefore I shut down SMB on
the ZFS server. I rebuilt the RAID on the mac, and brought it back
online.

- I went to immediately begin copying files back from the ZFS machine,

- A physical drive enclosure power board shorted out (perhaps the
  fatal moment?)

- The FreeBSD system stayed up, all I/O to the ZFS volumes would just
  hang-
  - df reported the volumes still mounted
  - saw the firewire drive disappear in /var/log/messages, but,
  - 'zpool status' reported that ALL WAS FINE

I freaked out, and did nothing to the system but watch it (and chain
smoke) for 2 hours- hoping perhaps that something would move. All
disk I/O to the ZFS volume stayed in a hung state.

Finally, I rebooted the system- and it hung during the shutdown
sequence. I sat with it for another 30 minutes, hoping something
would change, to no avail.
I finally power-cycled the machine, (perhaps the fatal moment?)
Before it came back up, I pulled the dead drive out of the pool,
expecting the ZFS volume to come back online in a RAID-Z degraded state-
*I found the entire ZFS volume was hosed (insert wailing and gnashing
of teeth here):

[root at blackowl /usr/home/ike]# zpool status
  pool: Z
 state: FAULTED
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        Z           FAULTED      0     0     0  corrupted data
          da0       FAULTED      0     0     0  corrupted data
          da1       FAULTED      0     0     0  corrupted data
          da2       FAULTED      0     0     0  corrupted data
          da3       ONLINE       0     0     0
[root at blackowl /usr/home/ike]#

Anyhow, I have a few theories about what happened, and some subsequent
tests.

1) The firewire bus could possibly be losing track of which device is
which- and confusing ZFS. In my daisy-chain setup, when one drive in
the chain dies, (say, da2), and it's removed from the chain, it seems
to become the previous drive (e.g. da1).

However, this may not exactly be the case- because when I remove the
last drive from the chain, I still destroy the RAID-Z pool, and it
still hangs as expected above.

ZFS definitely gets confused when I take an existing daisy-chain ZFS
pool, and reboot the machine with all drives plugged into a firewire
hub:

[root at blackowl /usr/home/ike]# zpool status
  pool: Z
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        Z           UNAVAIL      0     0     0  insufficient replicas
          raidz1    UNAVAIL      0     0     0  insufficient replicas
            da0     FAULTED      0     0     0  corrupted data
            da1     FAULTED      0     0     0  corrupted data
            da2     FAULTED      0     0     0  corrupted data
            da3     FAULTED      0     0     0  corrupted data
[root at blackowl /usr/home/ike]#

As an aside, for those not familiar with using Firewire and Apple/OSX
stuff: this is not the case on a Mac with HFS+ volumes, they always
retain their identity regardless of how/where they are plugged in...
Stupid simple.

2) The firewire driver may not send the right signal in/to the kernel
when a drive fails or is removed, or perhaps ZFS may not pick it up
correctly. This scenario would explain the hanging and subsequent
data corruption, when the last drive is simply unplugged from the chain.
(My expected results would be simply an 'UNAVAIL' ZFS device, and
proper RAID-Z degraded functionality).

--
Resolution/Conclusions:

1) Don't bank on Firewire/ZFS using FreeBSD. Apple hasn't even sorted
the issues out for OSX... :) (Anyone know about OpenSolaris/Firewire/ZFS?
How's that for esoteric :)

ZFS ROCKS HARD in my experience using SATA and well-worn (and even
cheap) SATA controllers, so for the future, I'll be sticking to the
MASSIVE amount of that hardware out there.

2) Should I email a FreeBSD dev list with this, and if so, which one?
Firewire? ZFS?

3) Should I email a Sun/ZFS list with this, and if so, does anyone
know that scene?
Rocket- .ike From ike at lesmuug.org Sun Jun 29 19:40:59 2008 From: ike at lesmuug.org (Isaac Levy) Date: Sun, 29 Jun 2008 19:40:59 -0400 Subject: [nycbug-talk] ZFS and firewire - conditions for a perfect storm In-Reply-To: <55CF45DB-B144-45B9-8AF2-AD12D479B1E7@lesmuug.org> References: <55CF45DB-B144-45B9-8AF2-AD12D479B1E7@lesmuug.org> Message-ID: <9EF067C3-8C92-49CC-BBE3-827FD89C148E@lesmuug.org> A quick adendum, for the record on list, On Jun 29, 2008, at 7:11 PM, Isaac Levy wrote: > Hi All, > > This story could have been avoided, but... I had a fairly 'perfect > storm' of data loss at home, thought I'd share with the list. > > I figure perhaps someone here could suggest a plan of action to fix > this long-term? > > If you don't care about firewire, feel free to skip this message- I've > replicated *everything below* with a SATA controller and ZFS/RAID-Z > works as expected, flawlessly. (ZFS is really astounding on FreeBSD!) > > > -- > I've been a big fan (and little user) of ZFS for about a year now, > it's of course excellent. With that, I built a *very* cheap 2nd file > server from gear I had, > > - A mini-PC (no PCI slots, almost laptop specs) > - 4 Firewire drives (long-time mac user, have the cases) > - daisy-chained setup, no Firewire hub > - ZFS RAID-Z (FreeBSD 7, was running HEAD now REL) > > This rig worked so well, that it quickly became my primary SMB > workhorse. A mac with an apple software raid became the backup > system, (rsync over SSH rocks for me). The firewire drives cases are > metal, and act as a heat-sinc, so no noisy fans for home use :) > > The firewire drives seem to have issues hanging up ZFS if one fails, > but I figured *what the heck, it's got a backup on an alltogether > different filesystem*, and at least I don't ever have to fsck 4tb > volumes! Whee! This has been a very productive tag-team for my uses- > until now... 
> > > -- > The perfect storm: > > I needed to pull 2 drives from the mac, therefore I shut down SMB on > the ZFS server. I rebuilt the RAID on the mac, and brought it back > online. > > - I went to immediately begin copying files back from the ZFS machine, > > - A physical drive enclosure power board shorted out (perhaps the > fatal moment?) > > - The FreeBSD system stayed up, all I/O to the ZFS volumes would just > hang- > - df reported the volumes still mounted > - saw the firewire drive dissappear in /var/log/messages, but, > - 'zpool status' reported that ALL WAS FINE > > I freaked out, and did nothing to the system but watch it (and chain > smoke) for 2 hours- hoping perhaps that something would move. All > disk I/O to the ZFS volume stayed in a hung state. > > Finally, I rebooted the system- and it hung during the shutdown > sequence. I sat with it for another 30 minutes, hoping something > would change, to no avail. > I finally power-cycled the machine, (perhaps the fatal moment?) > > Before it came back up, I pulled the dead drive out of the pool, > expecting the ZFS volume to come back online in a RAID-Z degraded > state- > *I found the entire ZFS volume was hosed (insert wailing and gnashing > of teeth here): > > [root at blackowl /usr/home/ike]# zpool status > pool: Z > state: FAULTED > status: One or more devices could not be used because the label is > missing > or invalid. There are insufficient replicas for the pool to continue > functioning. > action: Destroy and re-create the pool from a backup source. > see: http://www.sun.com/msg/ZFS-8000-5E > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > Z FAULTED 0 0 0 corrupted data > da0 FAULTED 0 0 0 corrupted data > da1 FAULTED 0 0 0 corrupted data > da2 FAULTED 0 0 0 corrupted data > da3 ONLINE 0 0 0 > [root at blackowl /usr/home/ike]# > > > > > Anyhow, I have a few theories about what happened, and some subsequent > tests. 
> > 1) The firewire bus could possibly be loosing track of which device is > which- and confusing ZFS. In my daisy-chain setup, when one drive in > the chain dies, (say, da2), and it's removed from the chain, it seems > to become the previous drive (e.g. da1). Further experiments: I tried creating and torturing a setup with the drives connected to a firewire hub, and had *slightly* different results- (still not good): - Taking 1 drive offline, simply hung *any* process trying to access files on the ZFS volumes. + HOWEVER, contrary to the daisy-chain setup, bringing the drive back online FREED THE HUNG SYSTEM and all went on normally! Samba transfers in progress however, failed- but a client still had it's SMB mount, and all continued to operate normally after the drive was back in. So it simply seems this part of the problem is a issue (er, bug?) relating to a given Firewire device physical ID, and how it is presented to the kernel. Yet, one more test: - I took 1 drive from the RAID-Z pool offline, then rebooted the system. My expected result was that it would come back like a degraded (yet operational) RAID-Z volme, but instead I got: [root at blackowl /usr/home/ike]# zpool status pool: Z state: UNAVAIL status: One or more devices could not be opened. There are insufficient replicas for the pool to continue functioning. action: Attach the missing device and online it using 'zpool online'. see: http://www.sun.com/msg/ZFS-8000-D3 scrub: none requested config: NAME STATE READ WRITE CKSUM Z UNAVAIL 0 0 0 insufficient replicas raidz1 UNAVAIL 0 0 0 insufficient replicas da0 FAULTED 0 0 0 corrupted data da1 ONLINE 0 0 0 da2 FAULTED 0 0 0 corrupted data da3 UNAVAIL 0 0 0 cannot open [root at blackowl /usr/home/ike]# So, hrmph. > > However, this may not exactly be the case- because when I remove the > last drive from the chain, I still destroy the RAID-Z pool, and it > still hangs as expected above. 
> > ZFS definately gets confused when I take an existing daisy-chain ZFS > pool, and reboot the machine with all drives plugged into a firewire > hub: > > [root at blackowl /usr/home/ike]# zpool status > pool: Z > state: UNAVAIL > status: One or more devices could not be used because the label is > missing > or invalid. There are insufficient replicas for the pool to continue > functioning. > action: Destroy and re-create the pool from a backup source. > see: http://www.sun.com/msg/ZFS-8000-5E > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > Z UNAVAIL 0 0 0 insufficient replicas > raidz1 UNAVAIL 0 0 0 insufficient replicas > da0 FAULTED 0 0 0 corrupted data > da1 FAULTED 0 0 0 corrupted data > da2 FAULTED 0 0 0 corrupted data > da3 FAULTED 0 0 0 corrupted data > [root at blackowl /usr/home/ike]# > > As an aside, for those not familiar with using Firewire and Apple/OSX > stuff: this is not the case on a Mac with HFS+ volumes, they always > retain their identity regardless of how/where they are plugged in... > Stupid simple. > > > 2) The firewire driver may not send the right signal in/to the kernel > when a drive fails or is removed, or perhaps ZFS may not pick it up > correctly. This scenario would explain the hanging and subsequent > data corruption, when the last drive is simply unplugged from the > chain. > (My expected results would be simply an 'UNAVAIL' ZFS device, and > proper RAID-Z degraded functionality). > > > -- > Resolution/Conclusions: > > 1) Don't bank on Firewire/ZFS using FreeBSD. Apple hasn't even sorted > the issues out for OSX... :) (Anyone know about OpenSolaris/Firewire/ > ZFS? How's that for esoteric :) > > ZFS ROCKS HARD in my experience using SATA and well-worn (and even > cheap) SATA controllers, so for the future, I'll be sticking to the > MASSIVE amount of that hardware out there. > > 2) Should I email a FreeBSD dev list with this, and if so, which one? > Firewire? ZFS? 
> > 3) Should I email a Sun/ZFS list with this, and if so, does anyone > know that scene? > > > Rocket- > .ike What follows here are tailings of syslogs, for anyone who's as interested in this issue as I am: ## zpool status from just after the catastrophic moment described above: [root at blackowl /usr/home/ike]# zpool status pool: Z state: FAULTED status: One or more devices could not be used because the label is missing or invalid. There are insufficient replicas for the pool to continue functioning. action: Destroy and re-create the pool from a backup source. see: http://www.sun.com/msg/ZFS-8000-5E scrub: none requested config: NAME STATE READ WRITE CKSUM Z FAULTED 0 0 0 corrupted data da0 FAULTED 0 0 0 corrupted data da1 FAULTED 0 0 0 corrupted data da2 FAULTED 0 0 0 corrupted data da3 ONLINE 0 0 0 [root at blackowl /usr/home/ike]# ## tailings from /var/log/messages, from the moments before/after the event described above: (fwohci0, firewire0 are the firewire parts, da0-3 are the firewire disks) Jun 27 19:00:01 blackowl newsyslog[1385]: logfile turned over due to size>100K Jun 27 19:00:39 blackowl sudo: ike : TTY=ttyp0 ; PWD=/Z/shared/ Movies ; USER=root ; COMMAND=/sbin/zfs list -r -t snapshot -o name,creation Z/shared Jun 27 19:01:12 blackowl sudo: ike : TTY=ttyp0 ; PWD=/Z/shared/ Movies ; USER=root ; COMMAND=/sbin/zfs destroy Z/ shared at shuffle_ouroboros_disks Jun 27 19:08:01 blackowl sudo: ike : TTY=ttyp0 ; PWD=/Z/shared ; USER=root ; COMMAND=/bin/mkdir CaseSensitive.txt Jun 27 19:08:31 blackowl sudo: ike : TTY=ttyp0 ; PWD=/Z/shared ; USER=root ; COMMAND=/bin/rm -rf CaseSensitive.txt Jun 27 19:08:37 blackowl sudo: ike : TTY=ttyp0 ; PWD=/Z/shared ; USER=root ; COMMAND=/bin/mkdir CaseSensitive Jun 27 19:08:46 blackowl sudo: ike : TTY=ttyp0 ; PWD=/Z/shared ; USER=root ; COMMAND=/usr/sbin/chown ike:ike CaseSensitive Jun 27 19:09:05 blackowl kernel: fwohci0: too many cycle lost, no cycle master presents? 
Jun 27 19:12:15 blackowl sudo: ike : TTY=unknown ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/local/bin/rsync --server --sender - vlogDtprz . /Z/ Jun 27 19:24:29 blackowl sshd[1466]: error: ssh_msg_send: write Jun 27 19:24:41 blackowl sudo: ike : TTY=unknown ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/local/bin/rsync --server --sender - vlogDtprz . /Z/ Jun 27 19:28:12 blackowl sshd[1475]: error: PAM: authentication error for ike from 192.168.1.87 Jun 27 19:28:15 blackowl sudo: ike : TTY=unknown ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/local/bin/rsync --server --sender - vlogDtprz . /Z/ Jun 27 19:31:14 blackowl sudo: ike : TTY=ttyp0 ; PWD=/Z/jails ; USER=root ; COMMAND=/bin/rm -rf supercore.local/COPYRIGHT supercore.local/bin supercore.local/boot supercore.local/dev supercore.local/etc supercore.local/home supercore.local/lib supercore.local/libexec supercore.local/media supercore.local/mnt supercore.local/proc supercore.local/rescue supercore.local/root supercore.local/sbin supercore.local/sys supercore.local/tmp supercore.local/usr supercore.local/var Jun 27 19:31:48 blackowl sudo: ike : TTY=unknown ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/local/bin/rsync --server --sender - vlogDtprz . /Z/ Jun 27 21:04:23 blackowl ntpd[998]: kernel time sync enabled 6001 Jun 27 21:08:47 blackowl ntpd[998]: kernel time sync enabled 2001 Jun 27 22:03:39 blackowl sudo: ike : TTY=ttyp2 ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/bin/vi /etc/ntp.conf Jun 27 22:40:25 blackowl sshd[1886]: error: PAM: authentication error for ike from 192.168.1.75 Jun 27 22:40:25 blackowl last message repeated 2 times Jun 27 22:40:31 blackowl sudo: ike : TTY=unknown ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/local/bin/rsync --server --sender - vlogDtprz . 
/Z/shared/ Jun 27 22:54:45 blackowl sudo: ike : TTY=ttyp3 ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/local/bin/bash Jun 28 04:43:09 blackowl sudo: ike : TTY=unknown ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/local/bin/rsync --server --sender - vlogDtprz . /Z/ Jun 28 04:45:31 blackowl sudo: ike : TTY=ttyp5 ; PWD=/xsdisk ; USER=root ; COMMAND=/usr/local/bin/bash Jun 28 12:01:57 blackowl kernel: fwohci0: BUS reset Jun 28 12:01:57 blackowl kernel: fwohci0: node_id=0xc800ffc2, gen=2, CYCLEMASTER mode Jun 28 12:01:57 blackowl kernel: fwohci0: txd err=14 ack busy_X Jun 28 12:01:57 blackowl kernel: sbp_orb_pointer_callback: xfer->resp = 16 Jun 28 12:01:57 blackowl kernel: fwohci0: txd err= 0 No stat Jun 28 12:01:57 blackowl kernel: sbp_orb_pointer_callback: xfer->resp = 22 Jun 28 12:01:57 blackowl kernel: firewire0: 3 nodes, maxhop <= 2, cable IRM = 2 (me) Jun 28 12:01:57 blackowl kernel: firewire0: bus manager 2 (me) Jun 28 12:23:14 blackowl sudo: ike : TTY=unknown ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/local/bin/rsync --server --sender - vlogDtprz . /Z/shared/books/ Jun 28 14:57:51 blackowl sudo: ike : TTY=ttyp1 ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/usr/bin/killall -9 rsync Jun 28 14:57:54 blackowl last message repeated 3 times Jun 28 14:58:08 blackowl acpi: resumed at 20080628 14:58:08 Jun 28 14:59:31 blackowl syslogd: kernel boot file is /boot/kernel/ kernel Jun 28 14:59:31 blackowl kernel: Copyright (c) 1992-2008 The FreeBSD Project. Jun 28 14:59:31 blackowl kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 Jun 28 14:59:31 blackowl kernel: The Regents of the University of California. All rights reserved. Jun 28 14:59:31 blackowl kernel: FreeBSD is a registered trademark of The FreeBSD Foundation. 
Jun 28 14:59:31 blackowl kernel: FreeBSD 7.0-RELEASE-p2 #4: Wed Jun 25 16:38:06 EDT 2008 Jun 28 14:59:31 blackowl kernel: root at blackowl.local:/usr/obj/usr/src/ sys/IKEKERNEL-JAN-2008 Jun 28 14:59:31 blackowl kernel: Timecounter "i8254" frequency 1193182 Hz quality 0 Jun 28 14:59:31 blackowl kernel: CPU: Genuine Intel(R) CPU T2500 @ 2.00GHz (2000.00-MHz 686-class CPU) Jun 28 14:59:31 blackowl kernel: Origin = "GenuineIntel" Id = 0x6e8 Stepping = 8 Jun 28 14:59:31 blackowl kernel: Features = 0xbfe9fbff < FPU ,VME ,DE ,PSE ,TSC ,MSR ,PAE ,MCE ,CX8 ,APIC ,SEP ,MTRR,PGE,MCA,CMOV,PAT,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> Jun 28 14:59:31 blackowl kernel: Features2=0xc1a9 Jun 28 14:59:31 blackowl kernel: AMD Features=0x100000 Jun 28 14:59:31 blackowl kernel: Cores per package: 2 Jun 28 14:59:31 blackowl kernel: real memory = 2137915392 (2038 MB) Jun 28 14:59:31 blackowl kernel: avail memory = 2082533376 (1986 MB) Jun 28 14:59:31 blackowl kernel: ACPI APIC Table: Jun 28 14:59:31 blackowl kernel: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs Jun 28 14:59:31 blackowl kernel: cpu0 (BSP): APIC ID: 0 Jun 28 14:59:31 blackowl kernel: cpu1 (AP): APIC ID: 1 Jun 28 14:59:31 blackowl kernel: ioapic0: Changing APIC ID to 2 Jun 28 14:59:31 blackowl kernel: ioapic0 irqs 0-23 on motherboard Jun 28 14:59:31 blackowl kernel: kbd1 at kbdmux0 Jun 28 14:59:31 blackowl kernel: ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) Jun 28 14:59:31 blackowl kernel: hptrr: HPT RocketRAID controller driver v1.1 (Jun 25 2008 16:37:50) Jun 28 14:59:31 blackowl kernel: acpi0: on motherboard Jun 28 14:59:31 blackowl kernel: acpi0: [ITHREAD] Jun 28 14:59:31 blackowl kernel: acpi0: Power Button (fixed) Jun 28 14:59:31 blackowl kernel: acpi0: reservation of 0, a0000 (3) failed Jun 28 14:59:31 blackowl kernel: acpi0: reservation of 100000, 7f5e0000 (3) failed Jun 28 14:59:31 blackowl kernel: Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 Jun 28 14:59:31 
blackowl kernel: acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0 Jun 28 14:59:31 blackowl kernel: cpu0: on acpi0 Jun 28 14:59:31 blackowl kernel: ACPI Error (psparse-0626): Method parse/execution failed [\_PR_.CPU0._OSC] (Node 0xc52013a0), AE_ALREADY_EXISTS Jun 28 14:59:31 blackowl kernel: est0: on cpu0 Jun 28 14:59:31 blackowl kernel: p4tcc0: on cpu0 Jun 28 14:59:31 blackowl kernel: cpu1: on acpi0 Jun 28 14:59:31 blackowl kernel: ACPI Error (psparse-0626): Method parse/execution failed [\_PR_.CPU1._OSC] (Node 0xc5201300), AE_ALREADY_EXISTS Jun 28 14:59:31 blackowl kernel: est1: on cpu1 Jun 28 14:59:31 blackowl kernel: p4tcc1: on cpu1 Jun 28 14:59:31 blackowl kernel: acpi_button0: on acpi0 Jun 28 14:59:31 blackowl kernel: pcib0: port 0xcf8-0xcff on acpi0 Jun 28 14:59:31 blackowl kernel: pci0: on pcib0 Jun 28 14:59:31 blackowl kernel: vgapci0: port 0xff00-0xff07 mem 0xfde80000-0xfdefffff,0xd0000000-0xdfffffff, 0xfdf80000-0xfdfbffff irq 16 at device 2.0 on pci0 Jun 28 14:59:31 blackowl kernel: agp0: on vgapci0 Jun 28 14:59:31 blackowl kernel: agp0: detected 7932k stolen memory Jun 28 14:59:31 blackowl kernel: agp0: aperture size is 256M Jun 28 14:59:31 blackowl kernel: vgapci1: mem 0xfdf00000-0xfdf7ffff at device 2.1 on pci0 Jun 28 14:59:31 blackowl kernel: pci0: at device 27.0 (no driver attached) Jun 28 14:59:31 blackowl kernel: pcib1: irq 16 at device 28.0 on pci0 Jun 28 14:59:31 blackowl kernel: pci1: on pcib1 Jun 28 14:59:31 blackowl kernel: pcib2: irq 17 at device 28.1 on pci0 Jun 28 14:59:31 blackowl kernel: pci2: on pcib2 Jun 28 14:59:31 blackowl kernel: em0: port 0xef00-0xef1f mem 0xfdde0000-0xfddfffff irq 17 at device 0.0 on pci2 Jun 28 14:59:31 blackowl kernel: em0: Using MSI interrupt Jun 28 14:59:31 blackowl kernel: em0: Ethernet address: 00:01:80:66:90:7a Jun 28 14:59:31 blackowl kernel: em0: [FILTER] Jun 28 14:59:31 blackowl kernel: uhci0: port 0xfe00-0xfe1f irq 23 at device 29.0 on pci0 Jun 28 14:59:31 blackowl kernel: uhci0: 
[GIANT-LOCKED] Jun 28 14:59:31 blackowl kernel: uhci0: [ITHREAD] Jun 28 14:59:31 blackowl kernel: usb0: on uhci0 Jun 28 14:59:31 blackowl kernel: usb0: USB revision 1.0 Jun 28 14:59:31 blackowl kernel: uhub0: on usb0 Jun 28 14:59:31 blackowl kernel: uhub0: 2 ports with 2 removable, self powered Jun 28 14:59:31 blackowl kernel: uhci1: port 0xfd00-0xfd1f irq 19 at device 29.1 on pci0 Jun 28 14:59:31 blackowl kernel: uhci1: [GIANT-LOCKED] Jun 28 14:59:31 blackowl kernel: uhci1: [ITHREAD] Jun 28 14:59:31 blackowl kernel: usb1: on uhci1 Jun 28 14:59:31 blackowl kernel: usb1: USB revision 1.0 Jun 28 14:59:31 blackowl kernel: uhub1: on usb1 Jun 28 14:59:31 blackowl kernel: uhub1: 2 ports with 2 removable, self powered Jun 28 14:59:31 blackowl kernel: uhci2: port 0xfc00-0xfc1f irq 18 at device 29.2 on pci0 Jun 28 14:59:31 blackowl kernel: uhci2: [GIANT-LOCKED] Jun 28 14:59:31 blackowl kernel: uhci2: [ITHREAD] Jun 28 14:59:31 blackowl kernel: usb2: on uhci2 Jun 28 14:59:31 blackowl kernel: usb2: USB revision 1.0 Jun 28 14:59:31 blackowl kernel: uhub2: on usb2 Jun 28 14:59:31 blackowl kernel: uhub2: 2 ports with 2 removable, self powered Jun 28 14:59:31 blackowl kernel: uhci3: port 0xfb00-0xfb1f irq 16 at device 29.3 on pci0 Jun 28 14:59:31 blackowl kernel: uhci3: [GIANT-LOCKED] Jun 28 14:59:31 blackowl kernel: uhci3: [ITHREAD] Jun 28 14:59:31 blackowl kernel: usb3: on uhci3 Jun 28 14:59:31 blackowl kernel: usb3: USB revision 1.0 Jun 28 14:59:31 blackowl kernel: uhub3: on usb3 Jun 28 14:59:31 blackowl kernel: uhub3: 2 ports with 2 removable, self powered Jun 28 14:59:31 blackowl kernel: ehci0: mem 0xfdfff000-0xfdfff3ff irq 23 at device 29.7 on pci0 Jun 28 14:59:31 blackowl kernel: ehci0: [GIANT-LOCKED] Jun 28 14:59:31 blackowl kernel: ehci0: [ITHREAD] Jun 28 14:59:31 blackowl kernel: usb4: EHCI version 1.0 Jun 28 14:59:31 blackowl kernel: usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3 Jun 28 14:59:31 blackowl kernel: usb4: on ehci0 Jun 28 14:59:31 blackowl 
kernel: usb4: USB revision 2.0 Jun 28 14:59:31 blackowl kernel: uhub4: on usb4 Jun 28 14:59:31 blackowl kernel: uhub4: 8 ports with 8 removable, self powered Jun 28 14:59:31 blackowl kernel: pcib3: at device 30.0 on pci0 Jun 28 14:59:31 blackowl kernel: pci3: on pcib3 Jun 28 14:59:31 blackowl kernel: fwohci0: mem 0xfdaff000-0xfdafffff irq 19 at device 3.0 on pci3 Jun 28 14:59:31 blackowl kernel: fwohci0: [FILTER] Jun 28 14:59:31 blackowl kernel: fwohci0: OHCI version 1.0 (ROM=1) Jun 28 14:59:31 blackowl kernel: fwohci0: No. of Isochronous channels is 8. Jun 28 14:59:31 blackowl kernel: fwohci0: EUI64 00:01:80:13:94:66:90:7a Jun 28 14:59:31 blackowl kernel: fwohci0: Phy 1394a available S400, 2 ports. Jun 28 14:59:31 blackowl kernel: fwohci0: Link S400, max_rec 2048 bytes. Jun 28 14:59:31 blackowl kernel: firewire0: on fwohci0 Jun 28 14:59:31 blackowl kernel: dcons_crom0: on firewire0 Jun 28 14:59:31 blackowl kernel: dcons_crom0: bus_addr 0x7d018000 Jun 28 14:59:31 blackowl kernel: fwe0: on firewire0 Jun 28 14:59:31 blackowl kernel: if_fwe0: Fake Ethernet address: 02:01:80:66:90:7a Jun 28 14:59:31 blackowl kernel: fwe0: Ethernet address: 02:01:80:66:90:7a Jun 28 14:59:31 blackowl kernel: fwip0: on firewire0 Jun 28 14:59:31 blackowl kernel: fwip0: Firewire address: 00:01:80:13:94:66:90:7a @ 0xfffe00000000, S400, maxrec 2048 Jun 28 14:59:31 blackowl kernel: sbp0: on firewire0 Jun 28 14:59:31 blackowl kernel: fwohci0: Initiate bus reset Jun 28 14:59:31 blackowl kernel: fwohci0: BUS reset Jun 28 14:59:31 blackowl kernel: fwohci0: node_id=0xc800ffc2, gen=1, CYCLEMASTER mode Jun 28 14:59:31 blackowl kernel: pci3: at device 4.0 (no driver attached) Jun 28 14:59:31 blackowl kernel: isab0: at device 31.0 on pci0 Jun 28 14:59:31 blackowl kernel: isa0: on isab0 Jun 28 14:59:31 blackowl kernel: atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfa00-0xfa0f at device 31.1 on pci0 Jun 28 14:59:31 blackowl kernel: ata0: on atapci0 Jun 28 14:59:31 blackowl kernel: ata0: [ITHREAD] 
Jun 28 14:59:31 blackowl kernel: ata1: on atapci0 Jun 28 14:59:31 blackowl kernel: ata1: [ITHREAD] Jun 28 14:59:31 blackowl kernel: atapci1: port 0xf900-0xf907,0xf800-0xf803,0xf700-0xf707,0xf600-0xf603,0xf500-0xf50f mem 0xfdffe000-0xfdffe3ff irq 19 at device 31.2 on pci0 Jun 28 14:59:31 blackowl kernel: atapci1: [ITHREAD] Jun 28 14:59:31 blackowl kernel: ata2: on atapci1 Jun 28 14:59:31 blackowl kernel: ata2: [ITHREAD] Jun 28 14:59:31 blackowl kernel: ata3: on atapci1 Jun 28 14:59:31 blackowl kernel: ata3: [ITHREAD] Jun 28 14:59:31 blackowl kernel: pci0: at device 31.3 (no driver attached) Jun 28 14:59:31 blackowl kernel: pmtimer0 on isa0 Jun 28 14:59:31 blackowl kernel: atkbdc0: at port 0x60,0x64 on isa0 Jun 28 14:59:31 blackowl kernel: atkbd0: irq 1 on atkbdc0 Jun 28 14:59:31 blackowl kernel: kbd0 at atkbd0 Jun 28 14:59:31 blackowl kernel: atkbd0: [GIANT-LOCKED] Jun 28 14:59:31 blackowl kernel: atkbd0: [ITHREAD] Jun 28 14:59:31 blackowl kernel: ppc0: parallel port not found. Jun 28 14:59:31 blackowl kernel: sc0: at flags 0x100 on isa0 Jun 28 14:59:31 blackowl kernel: sc0: VGA <16 virtual consoles, flags=0x300> Jun 28 14:59:31 blackowl kernel: sio0: configured irq 4 not in bitmap of probed irqs 0 Jun 28 14:59:31 blackowl kernel: sio0: port may not be enabled Jun 28 14:59:31 blackowl kernel: sio0: configured irq 4 not in bitmap of probed irqs 0 Jun 28 14:59:31 blackowl kernel: sio0: port may not be enabled Jun 28 14:59:31 blackowl kernel: sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 Jun 28 14:59:31 blackowl kernel: sio0: type 8250 or not responding Jun 28 14:59:31 blackowl kernel: sio0: [FILTER] Jun 28 14:59:31 blackowl kernel: sio1: configured irq 3 not in bitmap of probed irqs 0 Jun 28 14:59:31 blackowl kernel: sio1: port may not be enabled Jun 28 14:59:31 blackowl kernel: vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 Jun 28 14:59:31 blackowl kernel: Timecounters tick every 1.000 msec Jun 28 14:59:31 blackowl kernel: firewire0: 3 nodes, maxhop <= 
2, cable IRM = 2 (me) Jun 28 14:59:31 blackowl kernel: firewire0: bus manager 2 (me) Jun 28 14:59:31 blackowl kernel: hptrr: no controller detected. Jun 28 14:59:31 blackowl kernel: acd0: CDRW at ata0-slave UDMA33 Jun 28 14:59:31 blackowl kernel: ad4: 114473MB at ata2-master SATA150 Jun 28 14:59:31 blackowl kernel: firewire0: New S400 device ID: 00303c02e0126d83 Jun 28 14:59:31 blackowl kernel: firewire0: New S400 device ID: 00303c02e0126d6b Jun 28 14:59:31 blackowl kernel: SMP: AP CPU #1 Launched! Jun 28 14:59:31 blackowl kernel: da0 at sbp0 bus 0 target 0 lun 0 Jun 28 14:59:31 blackowl kernel: da0: Fixed Direct Access SCSI-4 device Jun 28 14:59:31 blackowl kernel: da0: 50.000MB/s transfers Jun 28 14:59:31 blackowl kernel: da0: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C) Jun 28 14:59:31 blackowl kernel: da1 at sbp0 bus 0 target 1 lun 0 Jun 28 14:59:31 blackowl kernel: da1: Fixed Direct Access SCSI-4 device Jun 28 14:59:31 blackowl kernel: da1: 50.000MB/s transfers Jun 28 14:59:31 blackowl kernel: da1: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C) Jun 28 14:59:31 blackowl kernel: Trying to mount root from ufs:/dev/ ad4s1a Jun 28 14:59:31 blackowl kernel: WARNING: / was not properly dismounted Jun 28 14:59:31 blackowl kernel: WARNING: /tmp was not properly dismounted Jun 28 14:59:31 blackowl kernel: WARNING: /usr was not properly dismounted Jun 28 14:59:31 blackowl kernel: WARNING: /var was not properly dismounted Jun 28 14:59:31 blackowl kernel: /var: mount pending error: blocks 16 files 6 Jun 28 14:59:31 blackowl kernel: WARNING: /xsdisk was not properly dismounted Jun 28 14:59:31 blackowl kernel: WARNING: ZFS is considered to be an experimental feature in FreeBSD. 
Jun 28 14:59:31 blackowl kernel: ZFS filesystem version 6 Jun 28 14:59:31 blackowl kernel: ZFS storage pool version 6 Jun 28 14:59:31 blackowl kernel: link_elf: symbol atm_event undefined Jun 28 14:59:31 blackowl kernel: KLD if_en.ko: depends on utopia - not available Jun 28 14:59:31 blackowl kernel: link_elf: symbol atm_event undefined Jun 28 14:59:31 blackowl kernel: KLD if_en.ko: depends on utopia - not available Jun 28 14:59:31 blackowl kernel: link_elf: symbol atm_event undefined Jun 28 14:59:31 blackowl kernel: KLD if_en.ko: depends on utopia - not available Jun 28 14:59:31 blackowl savecore: no dumps found Jun 28 14:59:33 blackowl kernel: en0: link state changed to UP Jun 28 14:59:36 blackowl acpi: resumed at 20080628 14:59:36 Jun 28 14:59:37 blackowl syslogd: exiting on signal 15 Jun 28 15:01:03 blackowl syslogd: kernel boot file is /boot/kernel/ kernel Jun 28 15:01:03 blackowl kernel: Copyright (c) 1992-2008 The FreeBSD Project. Jun 28 15:01:03 blackowl kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 Jun 28 15:01:03 blackowl kernel: The Regents of the University of California. All rights reserved. Jun 28 15:01:03 blackowl kernel: FreeBSD is a registered trademark of The FreeBSD Foundation. 
Jun 28 15:01:03 blackowl kernel: FreeBSD 7.0-RELEASE-p2 #4: Wed Jun 25 16:38:06 EDT 2008 Jun 28 15:01:03 blackowl kernel: root at blackowl.local:/usr/obj/usr/src/ sys/IKEKERNEL-JAN-2008 Jun 28 15:01:03 blackowl kernel: Timecounter "i8254" frequency 1193182 Hz quality 0 Jun 28 15:01:03 blackowl kernel: CPU: Genuine Intel(R) CPU T2500 @ 2.00GHz (2000.00-MHz 686-class CPU) Jun 28 15:01:03 blackowl kernel: Origin = "GenuineIntel" Id = 0x6e8 Stepping = 8 Jun 28 15:01:03 blackowl kernel: Features = 0xbfe9fbff < FPU ,VME ,DE ,PSE ,TSC ,MSR ,PAE ,MCE ,CX8 ,APIC ,SEP ,MTRR,PGE,MCA,CMOV,PAT,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> Jun 28 15:01:03 blackowl kernel: Features2=0xc1a9 Jun 28 15:01:03 blackowl kernel: AMD Features=0x100000 Jun 28 15:01:03 blackowl kernel: Cores per package: 2 Jun 28 15:01:03 blackowl kernel: real memory = 2137915392 (2038 MB) Jun 28 15:01:03 blackowl kernel: avail memory = 2082533376 (1986 MB) Jun 28 15:01:03 blackowl kernel: ACPI APIC Table: Jun 28 15:01:03 blackowl kernel: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs Jun 28 15:01:03 blackowl kernel: cpu0 (BSP): APIC ID: 0 Jun 28 15:01:03 blackowl kernel: cpu1 (AP): APIC ID: 1 Jun 28 15:01:03 blackowl kernel: ioapic0: Changing APIC ID to 2 Jun 28 15:01:03 blackowl kernel: ioapic0 irqs 0-23 on motherboard Jun 28 15:01:03 blackowl kernel: kbd1 at kbdmux0 Jun 28 15:01:03 blackowl kernel: ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) Jun 28 15:01:03 blackowl kernel: hptrr: HPT RocketRAID controller driver v1.1 (Jun 25 2008 16:37:50) Jun 28 15:01:03 blackowl kernel: acpi0: on motherboard Jun 28 15:01:03 blackowl kernel: acpi0: [ITHREAD] Jun 28 15:01:03 blackowl kernel: acpi0: Power Button (fixed) Jun 28 15:01:03 blackowl kernel: acpi0: reservation of 0, a0000 (3) failed Jun 28 15:01:03 blackowl kernel: acpi0: reservation of 100000, 7f5e0000 (3) failed Jun 28 15:01:03 blackowl kernel: Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 Jun 28 15:01:03 
blackowl kernel: acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0 Jun 28 15:01:03 blackowl kernel: cpu0: on acpi0 Jun 28 15:01:03 blackowl kernel: ACPI Error (psparse-0626): Method parse/execution failed [\_PR_.CPU0._OSC] (Node 0xc52013a0), AE_ALREADY_EXISTS Jun 28 15:01:03 blackowl kernel: est0: on cpu0 Jun 28 15:01:03 blackowl kernel: p4tcc0: on cpu0 Jun 28 15:01:03 blackowl kernel: cpu1: on acpi0 Jun 28 15:01:03 blackowl kernel: ACPI Error (psparse-0626): Method parse/execution failed [\_PR_.CPU1._OSC] (Node 0xc5201300), AE_ALREADY_EXISTS Jun 28 15:01:03 blackowl kernel: est1: on cpu1 Jun 28 15:01:03 blackowl kernel: p4tcc1: on cpu1 Jun 28 15:01:03 blackowl kernel: acpi_button0: on acpi0 Jun 28 15:01:03 blackowl kernel: pcib0: port 0xcf8-0xcff on acpi0 Jun 28 15:01:03 blackowl kernel: pci0: on pcib0 Jun 28 15:01:03 blackowl kernel: vgapci0: port 0xff00-0xff07 mem 0xfde80000-0xfdefffff,0xd0000000-0xdfffffff, 0xfdf80000-0xfdfbffff irq 16 at device 2.0 on pci0 Jun 28 15:01:03 blackowl kernel: agp0: on vgapci0 Jun 28 15:01:03 blackowl kernel: agp0: detected 7932k stolen memory Jun 28 15:01:03 blackowl kernel: agp0: aperture size is 256M Jun 28 15:01:03 blackowl kernel: vgapci1: mem 0xfdf00000-0xfdf7ffff at device 2.1 on pci0 Jun 28 15:01:03 blackowl kernel: pci0: at device 27.0 (no driver attached) Jun 28 15:01:03 blackowl kernel: pcib1: irq 16 at device 28.0 on pci0 Jun 28 15:01:03 blackowl kernel: pci1: on pcib1 Jun 28 15:01:03 blackowl kernel: pcib2: irq 17 at device 28.1 on pci0 Jun 28 15:01:03 blackowl kernel: pci2: on pcib2 Jun 28 15:01:03 blackowl kernel: em0: port 0xef00-0xef1f mem 0xfdde0000-0xfddfffff irq 17 at device 0.0 on pci2 Jun 28 15:01:03 blackowl kernel: em0: Using MSI interrupt Jun 28 15:01:03 blackowl kernel: em0: Ethernet address: 00:01:80:66:90:7a Jun 28 15:01:03 blackowl kernel: em0: [FILTER] Jun 28 15:01:03 blackowl kernel: uhci0: port 0xfe00-0xfe1f irq 23 at device 29.0 on pci0 Jun 28 15:01:03 blackowl kernel: uhci0: 
[GIANT-LOCKED] Jun 28 15:01:03 blackowl kernel: uhci0: [ITHREAD] Jun 28 15:01:03 blackowl kernel: usb0: on uhci0 Jun 28 15:01:03 blackowl kernel: usb0: USB revision 1.0 Jun 28 15:01:03 blackowl kernel: uhub0: on usb0 Jun 28 15:01:03 blackowl kernel: uhub0: 2 ports with 2 removable, self powered Jun 28 15:01:03 blackowl kernel: uhci1: port 0xfd00-0xfd1f irq 19 at device 29.1 on pci0 Jun 28 15:01:03 blackowl kernel: uhci1: [GIANT-LOCKED] Jun 28 15:01:03 blackowl kernel: uhci1: [ITHREAD] Jun 28 15:01:03 blackowl kernel: usb1: on uhci1 Jun 28 15:01:03 blackowl kernel: usb1: USB revision 1.0 Jun 28 15:01:03 blackowl kernel: uhub1: on usb1 Jun 28 15:01:03 blackowl kernel: uhub1: 2 ports with 2 removable, self powered Jun 28 15:01:03 blackowl kernel: uhci2: port 0xfc00-0xfc1f irq 18 at device 29.2 on pci0 Jun 28 15:01:03 blackowl kernel: uhci2: [GIANT-LOCKED] Jun 28 15:01:03 blackowl kernel: uhci2: [ITHREAD] Jun 28 15:01:03 blackowl kernel: usb2: on uhci2 Jun 28 15:01:03 blackowl kernel: usb2: USB revision 1.0 Jun 28 15:01:03 blackowl kernel: uhub2: on usb2 Jun 28 15:01:03 blackowl kernel: uhub2: 2 ports with 2 removable, self powered Jun 28 15:01:03 blackowl kernel: uhci3: port 0xfb00-0xfb1f irq 16 at device 29.3 on pci0 Jun 28 15:01:03 blackowl kernel: uhci3: [GIANT-LOCKED] Jun 28 15:01:03 blackowl kernel: uhci3: [ITHREAD] Jun 28 15:01:03 blackowl kernel: usb3: on uhci3 Jun 28 15:01:03 blackowl kernel: usb3: USB revision 1.0 Jun 28 15:01:03 blackowl kernel: uhub3: on usb3 Jun 28 15:01:03 blackowl kernel: uhub3: 2 ports with 2 removable, self powered Jun 28 15:01:03 blackowl kernel: ehci0: mem 0xfdfff000-0xfdfff3ff irq 23 at device 29.7 on pci0 Jun 28 15:01:03 blackowl kernel: ehci0: [GIANT-LOCKED] Jun 28 15:01:03 blackowl kernel: ehci0: [ITHREAD] Jun 28 15:01:03 blackowl kernel: usb4: EHCI version 1.0 Jun 28 15:01:03 blackowl kernel: usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3 Jun 28 15:01:03 blackowl kernel: usb4: on ehci0 Jun 28 15:01:03 blackowl 
kernel: usb4: USB revision 2.0 Jun 28 15:01:03 blackowl kernel: uhub4: on usb4 Jun 28 15:01:03 blackowl kernel: uhub4: 8 ports with 8 removable, self powered Jun 28 15:01:03 blackowl kernel: pcib3: at device 30.0 on pci0 Jun 28 15:01:03 blackowl kernel: pci3: on pcib3 Jun 28 15:01:03 blackowl kernel: fwohci0: mem 0xfdaff000-0xfdafffff irq 19 at device 3.0 on pci3 Jun 28 15:01:03 blackowl kernel: fwohci0: [FILTER] Jun 28 15:01:03 blackowl kernel: fwohci0: OHCI version 1.0 (ROM=1) Jun 28 15:01:03 blackowl kernel: fwohci0: No. of Isochronous channels is 8. Jun 28 15:01:03 blackowl kernel: fwohci0: EUI64 00:01:80:13:94:66:90:7a Jun 28 15:01:03 blackowl kernel: fwohci0: Phy 1394a available S400, 2 ports. Jun 28 15:01:03 blackowl kernel: fwohci0: Link S400, max_rec 2048 bytes. Jun 28 15:01:03 blackowl kernel: firewire0: on fwohci0 Jun 28 15:01:03 blackowl kernel: dcons_crom0: on firewire0 Jun 28 15:01:03 blackowl kernel: dcons_crom0: bus_addr 0x7d018000 Jun 28 15:01:03 blackowl kernel: fwe0: on firewire0 Jun 28 15:01:03 blackowl kernel: if_fwe0: Fake Ethernet address: 02:01:80:66:90:7a Jun 28 15:01:03 blackowl kernel: fwe0: Ethernet address: 02:01:80:66:90:7a Jun 28 15:01:03 blackowl kernel: fwip0: on firewire0 Jun 28 15:01:03 blackowl kernel: fwip0: Firewire address: 00:01:80:13:94:66:90:7a @ 0xfffe00000000, S400, maxrec 2048 Jun 28 15:01:03 blackowl kernel: sbp0: on firewire0 Jun 28 15:01:03 blackowl kernel: fwohci0: Initiate bus reset Jun 28 15:01:03 blackowl kernel: fwohci0: BUS reset Jun 28 15:01:03 blackowl kernel: fwohci0: node_id=0xc800ffc2, gen=1, CYCLEMASTER mode Jun 28 15:01:03 blackowl kernel: pci3: at device 4.0 (no driver attached) Jun 28 15:01:03 blackowl kernel: isab0: at device 31.0 on pci0 Jun 28 15:01:03 blackowl kernel: isa0: on isab0 Jun 28 15:01:03 blackowl kernel: atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfa00-0xfa0f at device 31.1 on pci0 Jun 28 15:01:03 blackowl kernel: ata0: on atapci0 Jun 28 15:01:03 blackowl kernel: ata0: [ITHREAD] 
Jun 28 15:01:03 blackowl kernel: ata1: on atapci0 Jun 28 15:01:03 blackowl kernel: ata1: [ITHREAD] Jun 28 15:01:03 blackowl kernel: atapci1: port 0xf900-0xf907,0xf800-0xf803,0xf700-0xf707,0xf600-0xf603,0xf500-0xf50f mem 0xfdffe000-0xfdffe3ff irq 19 at device 31.2 on pci0 Jun 28 15:01:03 blackowl kernel: atapci1: [ITHREAD] Jun 28 15:01:03 blackowl kernel: ata2: on atapci1 Jun 28 15:01:03 blackowl kernel: ata2: [ITHREAD] Jun 28 15:01:03 blackowl kernel: ata3: on atapci1 Jun 28 15:01:03 blackowl kernel: ata3: [ITHREAD] Jun 28 15:01:03 blackowl kernel: pci0: at device 31.3 (no driver attached) Jun 28 15:01:03 blackowl kernel: pmtimer0 on isa0 Jun 28 15:01:03 blackowl kernel: atkbdc0: at port 0x60,0x64 on isa0 Jun 28 15:01:03 blackowl kernel: atkbd0: irq 1 on atkbdc0 Jun 28 15:01:03 blackowl kernel: kbd0 at atkbd0 Jun 28 15:01:03 blackowl kernel: atkbd0: [GIANT-LOCKED] Jun 28 15:01:03 blackowl kernel: atkbd0: [ITHREAD] Jun 28 15:01:03 blackowl kernel: ppc0: parallel port not found. Jun 28 15:01:03 blackowl kernel: sc0: at flags 0x100 on isa0 Jun 28 15:01:03 blackowl kernel: sc0: VGA <16 virtual consoles, flags=0x300> Jun 28 15:01:03 blackowl kernel: sio0: configured irq 4 not in bitmap of probed irqs 0 Jun 28 15:01:03 blackowl kernel: sio0: port may not be enabled Jun 28 15:01:03 blackowl kernel: sio0: configured irq 4 not in bitmap of probed irqs 0 Jun 28 15:01:03 blackowl kernel: sio0: port may not be enabled Jun 28 15:01:03 blackowl kernel: sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 Jun 28 15:01:03 blackowl kernel: sio0: type 8250 or not responding Jun 28 15:01:03 blackowl kernel: sio0: [FILTER] Jun 28 15:01:03 blackowl kernel: sio1: configured irq 3 not in bitmap of probed irqs 0 Jun 28 15:01:03 blackowl kernel: sio1: port may not be enabled Jun 28 15:01:03 blackowl kernel: vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 Jun 28 15:01:03 blackowl kernel: Timecounters tick every 1.000 msec Jun 28 15:01:03 blackowl kernel: firewire0: 3 nodes, maxhop <= 
2, cable IRM = 2 (me) Jun 28 15:01:03 blackowl kernel: firewire0: bus manager 2 (me) Jun 28 15:01:03 blackowl kernel: hptrr: no controller detected. Jun 28 15:01:03 blackowl kernel: acd0: CDRW at ata0-slave UDMA33 Jun 28 15:01:03 blackowl kernel: ad4: 114473MB at ata2-master SATA150 Jun 28 15:01:03 blackowl kernel: firewire0: New S400 device ID: 00303c02e0126d83 Jun 28 15:01:03 blackowl kernel: firewire0: New S400 device ID: 00303c02e0126d6b Jun 28 15:01:03 blackowl kernel: SMP: AP CPU #1 Launched! Jun 28 15:01:03 blackowl kernel: da0 at sbp0 bus 0 target 0 lun 0 Jun 28 15:01:03 blackowl kernel: da0: Fixed Direct Access SCSI-4 device Jun 28 15:01:03 blackowl kernel: da0: 50.000MB/s transfers Jun 28 15:01:03 blackowl kernel: da0: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C) Jun 28 15:01:03 blackowl kernel: da1 at sbp0 bus 0 target 1 lun 0 Jun 28 15:01:03 blackowl kernel: da1: Fixed Direct Access SCSI-4 device Jun 28 15:01:03 blackowl kernel: da1: 50.000MB/s transfers Jun 28 15:01:03 blackowl kernel: da1: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C) Jun 28 15:01:03 blackowl kernel: Trying to mount root from ufs:/dev/ ad4s1a Jun 28 15:01:03 blackowl kernel: WARNING: / was not properly dismounted Jun 28 15:01:03 blackowl kernel: WARNING: /tmp was not properly dismounted Jun 28 15:01:03 blackowl kernel: WARNING: /usr was not properly dismounted Jun 28 15:01:03 blackowl kernel: WARNING: /var was not properly dismounted Jun 28 15:01:03 blackowl kernel: /var: mount pending error: blocks 48 files 12 Jun 28 15:01:03 blackowl kernel: WARNING: /xsdisk was not properly dismounted Jun 28 15:01:03 blackowl kernel: WARNING: ZFS is considered to be an experimental feature in FreeBSD. 
Jun 28 15:01:03 blackowl kernel: ZFS filesystem version 6 Jun 28 15:01:03 blackowl kernel: ZFS storage pool version 6 Jun 28 15:01:03 blackowl kernel: link_elf: symbol atm_event undefined Jun 28 15:01:03 blackowl kernel: KLD if_en.ko: depends on utopia - not available Jun 28 15:01:03 blackowl kernel: link_elf: symbol atm_event undefined Jun 28 15:01:03 blackowl kernel: KLD if_en.ko: depends on utopia - not available Jun 28 15:01:03 blackowl kernel: link_elf: symbol atm_event undefined Jun 28 15:01:03 blackowl kernel: KLD if_en.ko: depends on utopia - not available Jun 28 15:01:03 blackowl savecore: no dumps found Jun 28 15:01:05 blackowl kernel: en0: link state changed to UP Jun 28 15:01:15 blackowl ntpd[1064]: ntpd 4.2.0-a Wed Jun 25 15:59:39 EDT 2008 (1) Jun 28 15:01:32 blackowl smbd[1204]: [2008/06/28 15:01:32, 0, pid=1204] smbd/service.c:make_connection_snum(1003) Jun 28 15:01:32 blackowl smbd[1204]: '/Z/shared' does not exist or permission denied when connecting to [BIGSHARE] Error was No such file or directory Jun 28 15:01:42 blackowl sudo: ike : TTY=ttyp0 ; PWD=/usr/home/ ike ; USER=root ; COMMAND=/sbin/zpool status Jun 28 15:02:19 blackowl fsck: /dev/ad4s1e: 10 files, 7 used, 253808 free (40 frags, 31721 blocks, 0.0% fragmentation) Jun 28 15:02:36 blackowl kernel: fwohci0: BUS reset Jun 28 15:02:36 blackowl kernel: fwohci0: node_id=0xc800ffc1, gen=2, CYCLEMASTER mode Jun 28 15:02:36 blackowl kernel: firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me) Jun 28 15:02:36 blackowl kernel: firewire0: bus manager 1 (me) Jun 28 15:02:36 blackowl kernel: fwohci0: BUS reset Jun 28 15:02:36 blackowl kernel: fwohci0: node_id=0xc800ffc0, gen=3, CYCLEMASTER mode Jun 28 15:02:36 blackowl kernel: firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me) Jun 28 15:02:36 blackowl kernel: firewire0: bus manager 0 (me) Jun 28 15:05:18 blackowl fsck: /dev/ad4s1f: 245109 files, 14348367 used, 5963031 free (80527 frags, 735313 blocks, 0.4% fragmentation) Jun 28 15:05:32 blackowl fsck: 
/dev/ad4s1d: UNREF FILE I=70668 OWNER=smmsp MODE=100600
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: SIZE=50 MTIME=Jun 27 18:20 2008 (CLEARED)
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: UNREF FILE I=329999 OWNER=root MODE=100600
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: SIZE=3 MTIME=Jun 27 18:20 2008 (CLEARED)
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: UNREF FILE I=330000 OWNER=root MODE=140666
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: SIZE=0 MTIME=Jun 27 18:20 2008 (CLEARED)
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: UNREF FILE I=330001 OWNER=root MODE=140600
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: SIZE=0 MTIME=Jun 27 18:20 2008 (CLEARED)
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: UNREF FILE I=330004 OWNER=root MODE=100644
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: SIZE=5 MTIME=Jun 27 18:20 2008 (CLEARED)
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: UNREF FILE I=330005 OWNER=root MODE=100600
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: SIZE=79 MTIME=Jun 27 18:20 2008 (CLEARED)
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: Reclaimed: 0 directories, 18 files, 13 fragments
Jun 28 15:05:32 blackowl fsck: /dev/ad4s1d: 6514 files, 690270 used, 818908 free (948 frags, 102245 blocks, 0.1% fragmentation)
Jun 28 15:08:24 blackowl fsck: /dev/ad4s1g: 161522 files, 927711 used, 31501374 free (41102 frags, 3932534 blocks, 0.1% fragmentation)
Jun 28 15:09:53 blackowl ntpd[1064]: kernel time sync disabled 2041
Jun 28 15:14:09 blackowl ntpd[1064]: kernel time sync enabled 2001
Jun 28 19:41:39 blackowl ntpd[1064]: kernel time sync enabled 6001
Jun 28 20:32:54 blackowl ntpd[1064]: kernel time sync enabled 2001
Jun 28 21:51:12 blackowl sudo: ike : TTY=ttyp1 ; PWD=/usr/home/ike ; USER=root ; COMMAND=/sbin/zpool status

## Here's a dmesg from the system. The kernel is basically GENERIC with PF, the HTTP acceleration stuff, and SMP compiled in; it's otherwise stock: ##

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 7.0-RELEASE-p2 #4: Wed Jun 25 16:38:06 EDT 2008 root at blackowl.local:/usr/obj/usr/src/sys/IKEKERNEL-JAN-2008 Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Genuine Intel(R) CPU T2500 @ 2.00GHz (2000.00-MHz 686- class CPU) Origin = "GenuineIntel" Id = 0x6e8 Stepping = 8 Features = 0xbfe9fbff < FPU ,VME ,DE ,PSE ,TSC ,MSR ,PAE ,MCE ,CX8 ,APIC ,SEP ,MTRR,PGE,MCA,CMOV,PAT,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> Features2=0xc1a9 AMD Features=0x100000 Cores per package: 2 real memory = 2137915392 (2038 MB) avail memory = 2082533376 (1986 MB) ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ioapic0: Changing APIC ID to 2 ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) hptrr: HPT RocketRAID controller driver v1.1 (Jun 25 2008 16:37:50) acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) acpi0: reservation of 0, a0000 (3) failed acpi0: reservation of 100000, 7f5e0000 (3) failed Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0 cpu0: on acpi0 ACPI Error (psparse-0626): Method parse/execution failed [\ \_PR_.CPU0._OSC] (Node 0xc52013a0), AE_ALREADY_EXISTS est0: on cpu0 p4tcc0: on cpu0 cpu1: on acpi0 ACPI Error (psparse-0626): Method parse/execution failed [\ \_PR_.CPU1._OSC] (Node 0xc5201300), AE_ALREADY_EXISTS est1: on cpu1 p4tcc1: on cpu1 acpi_button0: on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 vgapci0: port 0xff00-0xff07 mem 0xfde80000-0xfdefffff,0xd0000000-0xdfffffff,0xfdf80000-0xfdfbffff irq 16 at device 2.0 on pci0 agp0: on vgapci0 agp0: detected 7932k stolen memory agp0: aperture size is 256M vgapci1: 
mem 0xfdf00000-0xfdf7ffff at device 2.1 on pci0 pci0: at device 27.0 (no driver attached) pcib1: irq 16 at device 28.0 on pci0 pci1: on pcib1 pcib2: irq 17 at device 28.1 on pci0 pci2: on pcib2 em0: port 0xef00-0xef1f mem 0xfdde0000-0xfddfffff irq 17 at device 0.0 on pci2 em0: Using MSI interrupt em0: Ethernet address: 00:01:80:66:90:7a em0: [FILTER] uhci0: port 0xfe00-0xfe1f irq 23 at device 29.0 on pci0 uhci0: [GIANT-LOCKED] uhci0: [ITHREAD] usb0: on uhci0 usb0: USB revision 1.0 uhub0: on usb0 uhub0: 2 ports with 2 removable, self powered uhci1: port 0xfd00-0xfd1f irq 19 at device 29.1 on pci0 uhci1: [GIANT-LOCKED] uhci1: [ITHREAD] usb1: on uhci1 usb1: USB revision 1.0 uhub1: on usb1 uhub1: 2 ports with 2 removable, self powered uhci2: port 0xfc00-0xfc1f irq 18 at device 29.2 on pci0 uhci2: [GIANT-LOCKED] uhci2: [ITHREAD] usb2: on uhci2 usb2: USB revision 1.0 uhub2: on usb2 uhub2: 2 ports with 2 removable, self powered uhci3: port 0xfb00-0xfb1f irq 16 at device 29.3 on pci0 uhci3: [GIANT-LOCKED] uhci3: [ITHREAD] usb3: on uhci3 usb3: USB revision 1.0 uhub3: on usb3 uhub3: 2 ports with 2 removable, self powered ehci0: mem 0xfdfff000-0xfdfff3ff irq 23 at device 29.7 on pci0 ehci0: [GIANT-LOCKED] ehci0: [ITHREAD] usb4: EHCI version 1.0 usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3 usb4: on ehci0 usb4: USB revision 2.0 uhub4: on usb4 uhub4: 8 ports with 8 removable, self powered pcib3: at device 30.0 on pci0 pci3: on pcib3 fwohci0: mem 0xfdaff000-0xfdafffff irq 19 at device 3.0 on pci3 fwohci0: [FILTER] fwohci0: OHCI version 1.0 (ROM=1) fwohci0: No. of Isochronous channels is 8. fwohci0: EUI64 00:01:80:13:94:66:90:7a fwohci0: Phy 1394a available S400, 2 ports. fwohci0: Link S400, max_rec 2048 bytes. 
firewire0: on fwohci0 dcons_crom0: on firewire0 dcons_crom0: bus_addr 0x7d018000 fwe0: on firewire0 if_fwe0: Fake Ethernet address: 02:01:80:66:90:7a fwe0: Ethernet address: 02:01:80:66:90:7a fwip0: on firewire0 fwip0: Firewire address: 00:01:80:13:94:66:90:7a @ 0xfffe00000000, S400, maxrec 2048 sbp0: on firewire0 fwohci0: Initiate bus reset fwohci0: BUS reset fwohci0: node_id=0xc800ffc4, gen=1, CYCLEMASTER mode pci3: at device 4.0 (no driver attached) isab0: at device 31.0 on pci0 isa0: on isab0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfa00-0xfa0f at device 31.1 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] atapci1: port 0xf900-0xf907,0xf800-0xf803,0xf700-0xf707,0xf600-0xf603,0xf500-0xf50f mem 0xfdffe000-0xfdffe3ff irq 19 at device 31.2 on pci0 atapci1: [ITHREAD] ata2: on atapci1 ata2: [ITHREAD] ata3: on atapci1 ata3: [ITHREAD] pci0: at device 31.3 (no driver attached) pmtimer0 on isa0 atkbdc0: at port 0x60,0x64 on isa0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] ppc0: parallel port not found. sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> sio0: configured irq 4 not in bitmap of probed irqs 0 sio0: port may not be enabled sio0: configured irq 4 not in bitmap of probed irqs 0 sio0: port may not be enabled sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 sio0: type 8250 or not responding sio0: [FILTER] sio1: configured irq 3 not in bitmap of probed irqs 0 sio1: port may not be enabled vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 Timecounters tick every 1.000 msec firewire0: 5 nodes, maxhop <= 4, cable IRM = 4 (me) firewire0: bus manager 4 (me) hptrr: no controller detected. 
acd0: CDRW at ata0-slave UDMA33 ad4: 114473MB at ata2-master SATA150 firewire0: New S400 device ID:00303c02e0126d83 firewire0: New S400 device ID:00303c02e0126d35 firewire0: New S400 device ID:00303c020012d1c4 da0 at sbp0 bus 0 target 0 lun 0 da0: Fixed Direct Access SCSI-4 device da0: 50.000MB/s transfers da0: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C) da1 at sbp0 bus 0 target 1 lun 0 da1: Fixed Direct Access SCSI-4 device da1: 50.000MB/s transfers da1: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C) da2 at sbp0 bus 0 target 2 lun 0 da2: Fixed Direct Access SCSI-4 device da2: 50.000MB/s transfers da2: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C) SMP: AP CPU #1 Launched! Trying to mount root from ufs:/dev/ad4s1a WARNING: / was not properly dismounted WARNING: /tmp was not properly dismounted WARNING: /usr was not properly dismounted WARNING: /var was not properly dismounted WARNING: /xsdisk was not properly dismounted WARNING: ZFS is considered to be an experimental feature in FreeBSD. ZFS filesystem version 6 ZFS storage pool version 6 link_elf: symbol atm_event undefined KLD if_en.ko: depends on utopia - not available link_elf: symbol atm_event undefined KLD if_en.ko: depends on utopia - not available link_elf: symbol atm_event undefined KLD if_en.ko: depends on utopia - not available en0: link state changed to UP -- ikenote -- Let me point out one sidenote, for sanity's sake, from the dmesg: WARNING: ZFS is considered to be an experimental feature in FreeBSD. But it's so good, I look forward to the day that warning is gone :) Rocket- .ike From matt at atopia.net Mon Jun 30 01:24:09 2008 From: matt at atopia.net (Matt Juszczak) Date: Mon, 30 Jun 2008 01:24:09 -0400 (EDT) Subject: [nycbug-talk] OT: Secondary DNS Message-ID: <20080630012312.T86002@mercury.atopia.net> I need a secondary DNS provider for my company.
Before I pay one of those expensive sites, or launch a VPS server strictly for DNS, thought I'd ping the list to see if anyone offered this service and/or could offer this service. I wouldn't mind paying something reasonable. -Matt From carton at Ivy.NET Mon Jun 30 03:23:53 2008 From: carton at Ivy.NET (Miles Nordin) Date: Mon, 30 Jun 2008 03:23:53 -0400 Subject: [nycbug-talk] ZFS and firewire - conditions for a perfect storm In-Reply-To: <55CF45DB-B144-45B9-8AF2-AD12D479B1E7@lesmuug.org> (Isaac Levy's message of "Sun, 29 Jun 2008 19:11:36 -0400") References: <55CF45DB-B144-45B9-8AF2-AD12D479B1E7@lesmuug.org> Message-ID: >>>>> "il" == Isaac Levy writes: il> 1) The firewire bus could possibly be loosing track of which il> device is which- and confusing ZFS. In my daisy-chain setup, il> when one drive in the chain dies, (say, da2), and it's removed il> from the chain, it seems to become the previous drive il> (e.g. da1). zpool export ; zpool import I think that will ``just work.'' il> (Anyone know about OpenSolaris/Firewire/ ZFS? How's that for il> esoteric :) yeah, I used this. I've used mirrors only, no raidz2. * I haven't fooled around with any of that OpenSolaris or Nexenta stuff. I've used only Solaris 10 U and various SXCE builds. * non-Oxford-911 case that I had, the case would crash. The case had to be rebooted. This was confusing because for a while I thought the driver/OS was messed up. * ZFS could handle a case crashing during use, but ZFS had problems if a case crashed during a scrub. * error reporting through the firewire bridge is not always fantastic, and smartctl would not pass through, so diagnosing failing disks is significantly harder when they're inside firewire cases. * for mirrors, ZFS wasn't great about remembering that the mirror was dirty and needed resyncing. If I rebooted during a resync, it wouldn't continue where it left off, and wouldn't start over---it would just quit trying to resync and accumulate checksum errors. 
The resync, when it did complete, often wasn't adequate to stop a stream of ``checksum errors'' over the next few weeks---I had to manually request a zpool scrub if half the mirror ever bounced. Because of some of these problems and cost, I've moved to ZFS-over-iSCSI. It's very slow and has problems still, but works better than the firewire did for me. I think ZFS is the Future, but the more I use it the less confidence I have in it. From quigongene at gmail.com Mon Jun 30 10:15:47 2008 From: quigongene at gmail.com (gene cronk) Date: Mon, 30 Jun 2008 10:15:47 -0400 Subject: [nycbug-talk] OT: Secondary DNS In-Reply-To: <20080630012312.T86002@mercury.atopia.net> References: <20080630012312.T86002@mercury.atopia.net> Message-ID: <7bb72ca70806300715v5b63682cja186ba7f741b0ced@mail.gmail.com> On Mon, Jun 30, 2008 at 1:24 AM, Matt Juszczak wrote: > I need a secondary DNS provider for my company. Before I pay one of those > expensive sites, or launch a VPS server strictly for DNS, thought I'd ping > the list to see if anyone offered this service and/or could offer this > service. > > I wouldn't mind paying something reasonable. www.afraid.org is what I use for all my domains. Donation based service, rock solid (has gone down only once on me in 6 years of use, and that was only b/c of a DoS attack), AND supports AAAA records :-). -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ike at lesmuug.org Mon Jun 30 11:24:44 2008 From: ike at lesmuug.org (Isaac Levy) Date: Mon, 30 Jun 2008 11:24:44 -0400 Subject: [nycbug-talk] ZFS and firewire - conditions for a perfect storm In-Reply-To: References: <55CF45DB-B144-45B9-8AF2-AD12D479B1E7@lesmuug.org> Message-ID: Hi Miles, Thanks for your input! On Jun 30, 2008, at 3:23 AM, Miles Nordin wrote: >>>>>> "il" == Isaac Levy writes: > > il> 1) The firewire bus could possibly be loosing track of which > il> device is which- and confusing ZFS. In my daisy-chain setup, > il> when one drive in the chain dies, (say, da2), and it's removed > il> from the chain, it seems to become the previous drive > il> (e.g. da1). > > zpool export ; zpool import > > I think that will ``just work.'' Ah-ha- good thinking, but I did indeed try that, again I believe they suffer from the firewire-induced effects: - When all firewire disks are online and obviously healthy, 'zpool export' and 'zpool import' work as expected. - When one (any) device is offline and obviously dead, 'zpool export' gives me: [root at blackowl /usr/home/ike]# zpool export Z cannot unmount '/Z/shared': Device busy [root at blackowl /usr/home/ike]# - Or, sometimes it just hangs like I described previously. I haven't been able to reliably isolate different causes for the failed behaviors; drive order or failure order doesn't seem to give me one or the other, it just truly seems random which failure I get. + Which leads me to believe further that this is an issue with firewire driver event notifications in the kernel. > > > il> (Anyone know about OpenSolaris/Firewire/ ZFS? How's that for > il> esoteric :) > > yeah, I used this. I've used mirrors only, no raidz2. No kidding! Cool- I'm not surprised with you Miles :) > > > * I haven't fooled around with any of that OpenSolaris or Nexenta > stuff. I've used only Solaris 10 U and various SXCE builds. > > * non-Oxford-911 case that I had, the case would crash. The case had > to be rebooted.
This was confusing because for a while I thought > the driver/OS was messed up. > > * ZFS could handle a case crashing during use, but ZFS had problems > if a case crashed during a scrub. > > * error reporting through the firewire bridge is not always > fantastic, and smartctl would not pass through, so diagnosing > failing disks is significantly harder when they're inside firewire > cases. Gah- I have the same frustrating problem with firewire, when using smartctl from FreeBSD on the firewire drives. + Again, and after digging around lists online, this one leads me to believe that the only people who've done a great job implementing firewire is Apple, (it's theirs to begin with). It makes me somewhat sad, firewire has been SO RELIABLE and flexible on OSX systems for years... and now it's cheaper gear than ever. > > > * for mirrors, ZFS wasn't great about remembering that the mirror was > dirty and needed resyncing. If I rebooted during a resync, it > wouldn't continue where it left off, and wouldn't start over---it > would just quit trying to resync and accumulate checksum errors. > The resync, when it did complete, often wasn't adequate to stop a > stream of ``checksum errors'' over the next few weeks---I had to > manually request a zpool scrub if half the mirror ever bounced. Yikes. That's kindof unacceptable behavior for a system one wishes to trust. I think most people would agree, filesystems simply *must* be the most refined, reliable, and unchanging part of any system. > > > Because of some of these problems and cost, I've moved to > ZFS-over-iSCSI. It's very slow and has problems still, but works > better than the firewire did for me. Digit. For now, since I'm starting from scratch, I've split my firewire disks up into 3 machines, so I have a 3rd backup for the future. Against my best wishes, I'll keep using the Apple machines for storage- the Journaling of HFS+ (Case-sensitive!) 
is a well-trusted and easy path for disks which *I'll never have to fsck* - my most necessary feature on multi-TB systems- (especially at home, where my time hacking other stuff and using my data is precious). For production/work systems, the Apple gear doesn't meet most needs on more critical levels- and I can gladly accept fsck and use UFS there in most applications. -- I believe for any future growth at home, I'll simply start thinking towards using SATA and known good controllers, (Areca, 3ware, Adaptec, etc...). Sad part here is that this means I won't be able to use old laptops or mini-PCs as (slow but silent) file servers, which have worked out very nicely in my tiny apartment. I wince at the thought of having to drop cash on silent pc gear- yuck. > > > I think ZFS is the Future, but the more I use it the less confidence I > have in it. Yeah, I think ZFS is the future too- and it's simply a matter of time and maturing. I think its biggest enemy right now is complexity- it's a very feature packed filesystem for users (why it's so cool!!!!!), but I don't see this as any different than the history of UFS or my history with HFS/+, all the filesystems I've trusted over the years have had their features boiled down to extremely simple and reliable defaults- from a user perspective. ZFS still seems to have a foot in the zone between developers and users. For UFS, ACLs, heck- softupdates (1999), and all the tunable features seem to have taken years to work out and become the trusted media we know and love now. /me sighs and goes back to other hacking...
Rocket- .ike From carton at Ivy.NET Mon Jun 30 16:25:54 2008 From: carton at Ivy.NET (Miles Nordin) Date: Mon, 30 Jun 2008 16:25:54 -0400 Subject: [nycbug-talk] ZFS and firewire - conditions for a perfect storm In-Reply-To: (Isaac Levy's message of "Mon, 30 Jun 2008 11:24:44 -0400") References: <55CF45DB-B144-45B9-8AF2-AD12D479B1E7@lesmuug.org> Message-ID: >>>>> "il" == Isaac Levy writes: il> [root at blackowl /usr/home/ike]# zpool export Z cannot unmount il> '/Z/shared': Device busy maybe this is the freebsd version of 'no valid replicas', the generic banging-head-against-wall message Solaris gives you when it's trying to ``protect'' you from doing something ``dumb'' like actually fixing your fucked-up array. you can try erasing zpool.cache and then 'import -f'. il> - Or, sometimes it just hangs like I described previously. I find 'zpool status' hangs a lot. A status command should never never never cause disk I/O or touch anything that could uninterruptable-sleep. Especially, a system-wide status command needs to not hang because one pool is messed up, any more than it's acceptable for failures in one pool to impact availability of the whole ZFS subsystem (which AFAIK they correctly don't spill over, in terms of stable/fast filesystem access to pools other than the one with problems. but for 'zpool status', they do, so if you consider the zpool command part of the ZFS subsystem then they do spillover.) il> + Again, and after digging around lists online, this one leads il> me to believe that the only people who've done a great job il> implementing firewire is Apple, (it's theirs to begin with). I just tried it, and smartctl doesn't work for me over firewire on Apple either. I'm using the smartctl in NetBSD pkgsrc and Mac OS 10.5.3. I think it's a limitation of the firewire bridge chip, not the OS's driver stack. 
well...it is a limitation of the OS stack in that there's no defined way to pass the commands through the bridge, so the OS doesn't implement them, but the real limitation is in the bridge chip and the standards that define how they should work. i think. It's odd that DVD burners ``just work'' i guess. but...i bet, for example, those special commands one can send to Lite-On drives to make them rpc1 so dvdbackup works better, would not pass through a firewire bridge. untested though. of course the error reporting stuff may be a different story, may actually be firewire stack problems, but again I would expect the case to interfere with error reporting and some cases to handle disks going bad better than others. il> -- I believe for any future growth at home, I'll simply start il> thinking towards using SATA and known good controllers, il> (Areca, 3ware, Adaptec, etc...). from what I've heard/understood, be sure to get a battery because it's necessary for correctness, not just for speed. Otherwise you need to do RAID3 which means you need a filesystem that supports large sector sizes which you don't have. Another thing to worry about with this RAID-on-a-card crap is controllers going bad. If I were using such a controller rather than ZFS, I'd buy a spare controller and put it on the shelf (in case the model which understands my RAID metadata goes out of production), and I'd test the procedure for moving disks from one controller to another BEFORE the controller breaks, and BEFORE putting any data on the raidset. il> Yeah, I think ZFS is the future too- and is simply a matter of il> time and maturing. yeah, but it's really not maturing very quickly at all compared to SVM, LVM2, ext3, HFS+, netapp/emc/vendorware storage stuff, or basically anything at all that's not dead-in-the-water abandonware like FFS/LFS/RAIDframe. It seems to be maturing at about the same speed as Lustre, which is too fucking slow.
I don't know what the hell they _are_ working on, besides this stability stuff. If I had a Sun support contract I'd have opened at least five big fat bugs and would be pestering them monthly for patches. There are known annoying/unacceptable problems they are not fixing after over two years. When Solaris 11 ships it is still going to be rickety flakey bullshit. It's not exactly a disappointment, but it IS flakey bullshit. From ike at lesmuug.org Mon Jun 30 19:42:54 2008 From: ike at lesmuug.org (Isaac Levy) Date: Mon, 30 Jun 2008 19:42:54 -0400 Subject: [nycbug-talk] ZFS and firewire - conditions for a perfect storm In-Reply-To: References: <55CF45DB-B144-45B9-8AF2-AD12D479B1E7@lesmuug.org> Message-ID: <3A62ECCD-3074-4F40-8F68-4A1B2D9490F7@lesmuug.org> So, I think I'm coming to a modified marketing slogan for ZFS. "ZFS Likes Cheap Disks, especially SATA/PATA, not so hot for firewire, and who knows about USB", On Jun 30, 2008, at 4:25 PM, Miles Nordin wrote: >>>>>> "il" == Isaac Levy writes: > > il> [root at blackowl /usr/home/ike]# zpool export Z cannot unmount > il> '/Z/shared': Device busy > > maybe this is the freebsd version of 'no valid replicas', the generic > banging-head-against-wall message Solaris gives you when it's trying > to ``protect'' you from doing something ``dumb'' like actually fixing > your fucked-up array. > > you can try erasing zpool.cache and then 'import -f'. > > il> - Or, sometimes it just hangs like I described previously. Cool- thx for the heads-up on this approach, I'm learning a lot more about ZFS... (stuff I didn't necessarily want to know :) However, for the record here, I just tried unplugging a drive as before (to bring on a disk I/O hang), deleted the zpool.cache, and tried 'import -f' - and it's all just hung.
The OS keeps chugging along nicely though, (UFS2 on an internal disk). /me sighs, reboots, and starts fresh again... > > > I find 'zpool status' hangs a lot. A status command should never > never never cause disk I/O or touch anything that could > uninterruptable-sleep. Especially, a system-wide status command needs > to not hang because one pool is messed up, any more than it's > acceptable for failures in one pool to impact availability of the > whole ZFS subsystem (which AFAIK they correctly don't spill over, in > terms of stable/fast filesystem access to pools other than the one > with problems. but for 'zpool status', they do, so if you consider > the zpool command part of the ZFS subsystem then they do spillover.) > > il> + Again, and after digging around lists online, this one leads > il> me to believe that the only people who've done a great job > il> implementing firewire is Apple, (it's theirs to begin with). Oy- you are correct here Miles! On an Apple machine, using a firewire disk, after installing smartmontools, I can't get even a lick of info out of the firewire drive: plumb:~ ike$ smartctl -a disk8 smartctl version 5.38 [i386-apple-darwin9.3.0] Copyright (C) 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ Smartctl open device: disk8 failed: Operation not supported by device plumb:~ ike$ -- And using apple's diskutil, good stuff like SMART isn't supported: plumb:~ ike$ diskutil info disk8 Device Identifier: disk8 Device Node: /dev/disk8 Part Of Whole: disk8 Device / Media Name: WiebeTech Volume Name: Mount Point: Partition Type: GUID_partition_scheme Bootable: Not bootable Media Type: Generic Protocol: FireWire SMART Status: Not Supported Total Size: 931.5 Gi (1000204886016 B) (1953525168 512-byte blocks) Free Space: 0.0 B (0 B) (0 512-byte blocks) Read Only: No Ejectable: Yes Whole: Yes Internal: No OS 9 Drivers: No Low Level Format: Not Supported plumb:~ ike$ -- Wow. Firewire is kindof making me sad. 
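For anyone who wants to check a batch of disks for this, here is a minimal sketch, assuming Python 3 and assuming the `diskutil info` output has been captured as text with one `Key: Value` pair per line (the archive flattens it onto a single line). The names `parse_diskutil_info`, `smart_supported`, `bytes_to_gib` and `SAMPLE` are made up for illustration; `SAMPLE` only repeats fields quoted in this message, and its indentation is a guess. It also double-checks the size arithmetic: 1953525168 sectors of 512 bytes is the 1000204886016 B that diskutil reports, roughly 931.5 GiB.

```python
def parse_diskutil_info(text):
    """Parse `Key: Value` lines into a dict; lines without a colon are skipped."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as /dev/disk8 stay whole.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def smart_supported(info):
    """True only if a SMART status is present and is not 'Not Supported'."""
    status = info.get("SMART Status")
    return status is not None and status != "Not Supported"


def bytes_to_gib(n_bytes):
    """Convert a byte count to binary gigabytes (GiB, 2**30 bytes)."""
    return n_bytes / 2**30


# Hypothetical capture: field values are the ones quoted in this message.
SAMPLE = """\
   Device Node:              /dev/disk8
   Protocol:                 FireWire
   SMART Status:             Not Supported
   Total Size:               931.5 Gi (1000204886016 B) (1953525168 512-byte blocks)
"""

if __name__ == "__main__":
    info = parse_diskutil_info(SAMPLE)
    print("SMART usable:", smart_supported(info))
    print("Size in GiB:", round(bytes_to_gib(1953525168 * 512), 1))
```

This only reads what the OS already reports; it cannot get SMART data through a bridge that refuses to pass the commands.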
> > > I just tried it, and smartctl doesn't work for me over firewire on > Apple either. I'm using the smartctl in NetBSD pkgsrc and Mac OS > 10.5.3. I think it's a limitation of the firewire bridge chip, not > the OS's driver stack. well...it is a limitation fo the OS stack in > that there's no defined way to pass the commands through the bridge, > so the OS doesn't implement them, but the real limitation is in the > bridge chip and the standards that define how they should work. > > i think. It's odd that DVD burners ``just work'' i guess. but...i > bet, for example, those special commands one can send to Lite-On > drives to make them rpc1 so dvdbackukp works better, would not pass > through a firewire bridge. untested though. > > of course the error reporting stuff may be a different story, may > actually be firewire stack problems, but again I would expect the case > to interfere with error reporting and some cases to handle disks going > bad better than others. > > il> -- I believe for any future growth at home, I'll simply start > il> thinking towards using SATA and known good controllers, > il> (Areca, 3ware, Adaptec, etc...). > > from what I've heard/understood, be sure to get a battery because it's > necessary for correctness, not just for speed. Otherwise you need to > do RAID3 which means you need a filesystem that supports large sector > sizes which you don't have. Ah- well, it depends on the controller- whole other thing. I meant that I'd snag some fairly inexpensive and well supported SATA cards with lots of ports, and use them for ZFS volumes- and ditch firewire. ZFS doesn't seem to have these gross problems at all with the SATA stuff I've used- (Areca, Adaptec, 3Ware). And yeah I agree- don't skimp on the batteries for a given controller if you use it for hardware RAID :) > > > Another thing to worry about with this RAID-on-a-card crap is > controllers going bad. 
If I were using such a controller rather than > ZFS, I'd buy a spare controller and put it on the shelf (in case the > model which understands my RAID metadata goes out of production), and > I'd test the procedure for moving disks from one controller to another > BEFORE the controller breaks, and BEFORE putting any data on the > raidset. Buying cards to put on the shelf is actually a plan I've put in action several times in recent years- (after getting stuck with ancient and irreplaceable Compaq cards going bad...) A trend I like seeing recently, which changes this game, is that Supermicro and Tyan server motherboards are coming with 8 SATA ports onboard, with something like an LSI card built-in. For the 1u high- density boxes I tend to deploy for jobs, they get deployed in pairs or triples- and usually some component failure happens either immediately (warranty replacement) or well after the working life of the machines is past (3-4 yrs). I've rarely seen the machines/cards/etc fail in the middle space, but that's just my experiences... > > > il> Yeah, I think ZFS is the future too- and is simply a matter of > il> time and maturing. > > yeah, but it's really not maturing very quickly at all compared to > SVM, LVM2, ext3, HFS+, netapp/emc/vendorware storage stuff, or > basically anything at all that's not dead-in-the-water abandonware > like FFS/LFS/RAIDframe. It seems to be maturing at about the same > speed as Lustre, which is too fucking slow. I don't know what the > hell they _are_ working on, besides this stability stuff. If I had a > Sun support contract I'd have opened at least five big fat bugs and > would be pestering them monthly for patches. There are known > annoying/unacceptable problems they are not fixing after over two > years. When Solaris 11 ships it is still oging to be rickety flakey > bullshit. It's not exactly a disappointment, but it IS flakey > bullshit. Hrmph. Yeah, I do worry about things maturing fast enough to stay alive long term. 
With disks, buggy crap like this has to go away really FAST or else users will... Rocket- .ike