[nycbug-talk] Statistical Monitoring
matt at atopia.net
Tue Nov 4 15:09:17 EST 2008
> I'm running Nagios + pnp4nagios which takes the extra data that the
> nagios service checks pick up and makes RRD/Cacti graphs out of it.
> I did this to reduce the amount of polling which can skew results, and
> soaks up resources for those times when you really need the graphs.
> Also it's all wrapped up in one place to maintain.
Sounds cool, but I'm running a lot of my checks via check_by_ssh, so when
things get bogged down, I tend to get a lot of "plugin timeout" errors.
Technically, I could switch these to SNMP checks, and/or passive checks,
which would help a lot, but there are many things I want to graph that I
don't want to alert on -- such as each webserver's NIC traffic in and out,
disk I/O, etc. Would I create these as checks inside Nagios and simply
never set a warning or critical level for them? Or is it better to use
something different, since there are so many metrics that I want to graph
but not alert on?
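
To make the "check with no warning or critical level" idea concrete, here
is a rough sketch of what such a graph-only plugin might look like. It is
not a tested production plugin. It assumes a Linux host (counters are read
from /proc/net/dev), and the function names and the output text are made
up for illustration. The only parts taken from the Nagios plugin
conventions are the "text | perfdata" output line, the "c" counter unit,
and the exit codes 0 (OK) and 3 (UNKNOWN).

```python
#!/usr/bin/env python3
"""Graph-only Nagios plugin sketch: emits NIC byte counters as perfdata."""

OK = 0       # Nagios exit code: OK
UNKNOWN = 3  # Nagios exit code: UNKNOWN (used only when data is unavailable)


def parse_counters(text, iface):
    """Return (rx_bytes, tx_bytes) for iface from /proc/net/dev text, or None."""
    for line in text.splitlines():
        if ":" not in line:
            continue  # header lines carry no "iface:" prefix
        name, rest = line.split(":", 1)
        if name.strip() == iface:
            fields = rest.split()
            # Field 0 is received bytes, field 8 is transmitted bytes.
            return int(fields[0]), int(fields[8])
    return None


def plugin_output(iface, rx_bytes, tx_bytes):
    """Build the one-line plugin output: status text, then perfdata."""
    # Thresholds are left out on purpose, so the check can never
    # go WARNING or CRITICAL; it exists only to feed the graphs.
    perf = f"rx_bytes={rx_bytes}c tx_bytes={tx_bytes}c"
    return f"NIC OK - {iface} counters collected | {perf}"


def main(argv, proc_path="/proc/net/dev"):
    """Print the plugin line and return the exit code for Nagios."""
    iface = argv[1] if len(argv) > 1 else "eth0"  # hypothetical default
    try:
        with open(proc_path) as handle:
            counters = parse_counters(handle.read(), iface)
    except OSError:
        counters = None
    if counters is None:
        print(f"NIC UNKNOWN - no counters for {iface}")
        return UNKNOWN
    print(plugin_output(iface, *counters))
    return OK
```

Run as a script, it would end with something like sys.exit(main(sys.argv)).
Because the plugin returns OK whenever it can read the counters, the
service never alerts, and something like pnp4nagios can still pick up the
perfdata after the "|" for graphing.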