[nycbug-talk] Statistical Monitoring

Wed Nov 5 11:03:51 EST 2008

>>> Once you get ganglia up and running, all it will do is provide you 
>>> with statistics and it will be up to you to figure out a way to 
>>> collect and store them and then do something meaningful with them.

The problem I have is I want TONS of graphs.  I want to graph our load 
balancer, our firewall, our CPU usage for specific processes across 
servers (apache, memcache, mysql, etc.), memory usage (free/available), 
mysql statistics (threads running, queries running, long running queries, 
average query time, seconds behind master, etc.), and much, much more.  If 
I have all of these statistics being reported (and graphed), can a pull 
method reliably keep up with that volume?  I've used SNMP a 
lot to gather basic statistics, but I doubt I could easily get SNMP to 
report the current queries per second on the local MySQL server (I know 
it's possible - there's an SNMP module for MySQL - but I doubt it's 
trivial).  Wouldn't something like this work better as a script running 
on ALL servers that gathers the statistics and pushes them to a 
centralized daemon of sorts running on a central server?
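
To make the push idea concrete, here is a minimal sketch of such a 
per-server script.  Everything specific in it is an assumption for 
illustration only: the collector address, the JSON-over-UDP format, and 
the function names are hypothetical, and it reports only load averages 
(MySQL counters would be gathered in gather_stats() the same way).

```python
import json
import os
import socket
import time

# Hypothetical address of the central collector daemon (an assumption).
COLLECTOR = ("127.0.0.1", 9999)


def gather_stats():
    """Collect a few local statistics. Only load averages are read here;
    per-process CPU or MySQL counters would be added to this dict."""
    load1, load5, load15 = os.getloadavg()
    return {"load1": load1, "load5": load5, "load15": load15}


def encode_report(host, stats, now=None):
    """Serialize one report as a JSON datagram (format is illustrative)."""
    timestamp = int(now if now is not None else time.time())
    report = {"host": host, "time": timestamp, "stats": stats}
    return json.dumps(report, sort_keys=True).encode("utf-8")


def push(payload, addr=COLLECTOR):
    """Send one report to the collector over UDP, fire and forget."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, addr)


if __name__ == "__main__":
    push(encode_report(socket.gethostname(), gather_stats()))
```

Run from cron once a minute on each server, this would give the central 
daemon one datagram per host per interval.  UDP keeps the agent from 
blocking when the collector is down, at the cost of silently losing 
samples.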

But since I also need to graph things that are SNMP-based (for instance, 
our load balancer information can only be obtained via SNMP), my thinking 
is that cacti is most likely the best option, though I'd have to use the 
custom-graph-with-scripts option more often.  Or, like I asked, perhaps 
use ganglia to push the statistics, and then run a script on the cacti 
server to convert the ganglia data into graphs?
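
A rough sketch of what that conversion script might look like, under 
some assumptions: gmond is answering on its default TCP port 8649 and 
dumps its XML state on connect, and cacti calls the script as a data 
input method expecting space-separated "name:value" pairs on stdout.  
The function names, the default metric list, and the command-line 
handling are hypothetical, and it has not been checked against any 
particular ganglia or cacti version.

```python
import socket
import sys
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="127.0.0.1", port=8649):
    """Read the XML dump that gmond sends when a client connects."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def cacti_line(xml_bytes, hostname, wanted):
    """Return 'name:value' pairs for the wanted metrics of one host,
    in the order requested, skipping metrics the host does not report."""
    root = ET.fromstring(xml_bytes)
    fields = []
    for host in root.iter("HOST"):
        if host.get("NAME") != hostname:
            continue
        values = {m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")}
        fields = [f"{name}:{values[name]}" for name in wanted if name in values]
    return " ".join(fields)


if __name__ == "__main__":
    # Usage (hypothetical): ganglia2cacti.py <hostname> [metric ...]
    target = sys.argv[1]
    metrics = sys.argv[2:] or ["load_one", "mem_free"]
    print(cacti_line(fetch_gmond_xml(), target, metrics))
```

With this approach cacti still polls, but it polls only the local gmond 
rather than every server, and the SNMP-only devices such as the load 
balancer stay on cacti's normal SNMP path.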