[nycbug-talk] Random Box Reboots

Mike Sawicki fifi at HAX.ORG
Wed Jul 27 13:20:59 EDT 2005


On Wed, Jul 27, 2005 at 01:15:39PM -0400, Hans Zaunere wrote:
> 
> 
> On Wednesday, July 27, 2005 1:02 PM, talk-bounces at lists.nycbug.org wrote:
> > On Wed, Jul 27, 2005 at 12:48:34PM -0400, Charles Sprickman wrote:
> > > On Wed, 27 Jul 2005, Hans Zaunere wrote:
> > > 
> > > > I'm stuck on what could be causing this and how to troubleshoot it;
> > > > the box has only mild load.  Any thoughts appreciated.
> > > 
> > > Just in case, I'd enable coredumps, maybe you'll get "lucky" and have
> > > something to look at: 
> > > 
> > > dumpdev="/dev/ad0s1b" in rc.conf (adjust to your swap partition)
> > > 
> > > Also make sure you've built a kernel with debug symbols:
> > > 
> > > makeoptions     DEBUG=-g        in kernel config
> > > 
> > > Lastly, for problematic hosts in the past I've put together a quick
> > > shell script that runs a whole mess of stuff (a full ps, vmstat,
> > > netstat, etc.) and writes it to a logfile.  I call it out of cron
> > > every five minutes.  That way if something does happen and there's
> > > no coredump I can see if anything odd was happening before the crash.
> > > 
> > 
> > I'd also advise you to stop using GENERIC on a production server.
> > Strip down the kernel to the bare minimum and make better use of
> > modules.  This could be hardware related.. but if it hums along for
> > multiple months I'd be surprised at that.
> 
> Yeah, I was thinking about changing the kernel - but with such small load,
> and stability for months without a problem - it seems that a kernel change
> would be too big.  But that could be it, I suppose.
>
> > You aren't using nullfs anywhere are you?
> 
> Yes, but only in read-only mode - why?  I know there are a number of
> stability issues, but in read-only, it should be better...
> 

Hrm.. not sure about that one.  If you enable the debugging Charles
recommended you'll know for sure when you analyze the dump.  I've
been using nullfs in 5.4-STABLE, and it's fine.. but I think they've
just given up on it completely under 4.x.  I wouldn't trust it.
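
For reference, the crash-dump setup Charles suggested comes down to roughly
the fragment below.  The dumpdev value and the makeoptions line are taken
from his message; the dumpdir line is an assumption on my part (I believe
/var/crash is where savecore puts dumps by default), so check rc.conf(5) on
your release before relying on it.  Also keep in mind that the dump device
generally needs to be at least as large as physical RAM.

```shell
# /etc/rc.conf fragment (sh syntax, since rc.conf is sourced by the rc scripts)
dumpdev="/dev/ad0s1b"   # from the thread: adjust to your swap partition
dumpdir="/var/crash"    # assumption: directory savecore(8) writes to at boot

# This line belongs in the kernel config file, not rc.conf
# (kept as a comment here so the fragment stays valid sh):
#   makeoptions     DEBUG=-g
```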

If you are going to rebuild with DEBUG anyway, strip out some crap
while you're at it =)
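
The periodic snapshot script Charles describes above could look something
like the sketch below.  This is not his script: the function name, the log
path, and the exact list of commands are all made up for illustration, so
adjust them to whatever you actually want to see after a reboot.

```shell
#!/bin/sh
# Hypothetical state logger.  The name snapshot(), the default log path,
# and the command list are illustrative assumptions, not from the thread.

LOGFILE="${LOGFILE:-/var/log/sysstate.log}"

# snapshot FILE: append a timestamped dump of system state to FILE.
snapshot() {
    out="$1"
    {
        echo "===== snapshot $(date '+%Y-%m-%d %H:%M:%S') ====="
        for cmd in "ps auxww" "vmstat" "netstat -an" "df -k"; do
            tool="${cmd%% *}"       # first word = the program to look for
            if command -v "$tool" >/dev/null 2>&1; then
                echo "--- $cmd ---"
                $cmd 2>&1           # unquoted on purpose: split into args
            else
                echo "--- $cmd (not found, skipped) ---"
            fi
        done
        echo
    } >> "$out"
}

# Only write to the real logfile when asked to, e.g. "sysstate.sh run".
if [ "${1:-}" = "run" ]; then
    snapshot "$LOGFILE"
fi
```

Called from /etc/crontab every five minutes, assuming the script is saved
as /usr/local/sbin/sysstate.sh (again, a made-up path), the entry would be
something like:

    */5  *  *  *  *  root  /usr/local/sbin/sysstate.sh run

The log grows without bound, so rotate it or trim it by hand now and then.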


--
Mike Sawicki (fifi at HAX.ORG)



