[nycbug-talk] Statistical Monitoring

pete pete at nomadlogic.org
Wed Nov 5 11:22:48 EST 2008


On Wed, 5 Nov 2008 11:03:51 -0500 (EST), Matt Juszczak <matt at atopia.net>
wrote:
>>>> Once you get ganglia up and running, all it will do is provide you 
>>>> with statistics and it will be up to you to figure out a way to 
>>>> collect and store them and then do something meaningful with them.
> 
> The problem I have is I want TONS of graphs.  I want to graph our load
> balancer, our firewall, our CPU usage for specific processes across
> servers (apache, memcache, mysql, etc.), memory usage (free/available),
> mysql statistics (threads running, queries running, long running queries,
> average query time, seconds behind master, etc.), and much much more.  If
> I have all of these statistics being reported (and graphed), then is this
> something that reliably, a pull method can perform well?  I've used SNMP a
> lot to gather basic statistics, but I doubt I'd be able to get SNMP to
> broadcast what the current queries per second are on the local MySQL
> server easily (I know its possible - there's an SNMP module for MySQL, but
> I doubt its trivial).  Wouldn't something like this be better as a script
> running on ALL servers to gather the statistics and push those statistics
> to a centralized daemon of sorts running on the server?
> 

heck, i'd look at it from the other perspective.  running a dedicated snmpd
on your server is going to be much more lightweight than running a
homegrown script written in an interpreted language like perl or python.
in my personal experience monitoring lots of gear, ranging from
switches/routers to load balancers and servers, i find that SNMP is the way
to go.  it is quite lightweight, and it's the only way you are gonna be able
to have consistent counters between switches and servers (for monitoring
network traffic, for example).
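
To illustrate why consistent counters matter: interface traffic counters such as IF-MIB's ifInOctets are 32-bit values that wrap around, so whatever does the graphing has to turn two polled samples into a rate the same way for every device. A minimal sketch in Python, assuming a single wrap at most between polls; the function name and the sample values are made up for illustration:

```python
def counter_rate(previous, current, interval_seconds, counter_bits=32):
    """Return the per-second rate between two samples of a wrapping counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction absorbs one wrap of the counter between the polls.
    delta = (current - previous) % modulus
    return delta / interval_seconds


# Two hypothetical ifInOctets samples taken 10 seconds apart; the counter
# wrapped past 2**32 between the polls.
print(counter_rate(4294967000, 704, 10))  # 100.0
```

The same function applies unchanged to a switch port or a server NIC, which is the consistency being argued for here.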

> But since I also need to graph things that are snmp-based (for instance,
> our load balancer information can only be obtained via snmp), my thoughts
> are that using cacti is most likely the best option, but I'd have to use
> the custom-graph-with-scripts option more often.  Or, like I asked,
> perhaps using ganglia to push the statistics, and then running a script on
> the cacti server to convert the ganglia data into graphs?

there is nothing to say that you can't write your own OID that lives on
your servers and runs a script - say, one showing how many active MySQL
connections you have at a given time.  that'll still be more lightweight
than running a perl/python daemon that you write, since you can use the
SNMP protocol to execute these queries.  you can still get the
customization you want - but gain the consistency of just using SNMP across
all your network devices as well, which i think is a huge win for long-term
support.
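
A rough sketch of what such a script could look like, written in Python only for illustration (any language the agent can execute would do). It assumes `mysqladmin` is on the PATH and can log in without prompting; the function names and the -1 fallback are choices made for this example, not anything snmpd requires:

```python
#!/usr/bin/env python3
import re
import subprocess


def parse_threads(status_line):
    """Extract the 'Threads: N' value from `mysqladmin status` output."""
    match = re.search(r"Threads:\s*(\d+)", status_line)
    if match is None:
        raise ValueError("no Threads field in status output")
    return int(match.group(1))


def report():
    """Return the current thread count, or -1 if it cannot be read."""
    try:
        result = subprocess.run(
            ["mysqladmin", "status"],
            capture_output=True, text=True, check=True, timeout=5,
        )
        return parse_threads(result.stdout)
    except (OSError, subprocess.SubprocessError, ValueError):
        # Assumed sentinel so the poller always receives one numeric line.
        return -1


if __name__ == "__main__":
    print(report())
```

With Net-SNMP, a script like this can be hooked in through an `extend` line in snmpd.conf (for example `extend mysql-threads /usr/local/bin/mysql_threads.py`, where both the name and the path are hypothetical), and its output is then readable over SNMP from the NET-SNMP-EXTEND-MIB tables.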

i've also found that 80% of the info that I am interested in is already
easily available via stock snmp configs - process counts, memory info,
network counters, cpu load, users logged in, etc.

just my two bits though..

-pete

-- 
Pete Wright
pete at nomadlogic.org
310.869.9459


