[nycbug-talk] regular hardware troubleshooting/monitoring

David Rio Deiros driodeiros
Thu Jun 23 13:34:30 EDT 2005

On Thu, Jun 23, 2005 at 10:42:45AM -0400, George Georgalis wrote:
> On Thu, Jun 23, 2005 at 10:13:41AM -0400, Ray wrote:
> >On Wed, Jun 22, 2005 at 10:50:45PM -0700, David Rio Deiros wrote:
> >> I cannot see how to test the memory without rebooting the machine.
> recompile a kernel 20 times and pipe stderr/stdout to a file,
> compare file sizes... pretty darn effective, for a running machine

This method exercises most of the machine's hardware, but it does
not test ALL of the memory. Am I right?
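
For what it's worth, the repeated-build check quoted above could be
sketched roughly like this. It is only an illustration in Python: the
function name `distinct_output_sizes` and the stand-in command are my own
assumptions, not anything from George's setup, and a real run would
substitute the actual kernel build command. A run that produces a
different amount of output from the others is the hint that something
flaked.

```python
import subprocess
import sys


def distinct_output_sizes(cmd, runs=20):
    """Run cmd `runs` times and return the sorted distinct sizes (in bytes)
    of its combined stdout/stderr output."""
    sizes = set()
    for _ in range(runs):
        # Merge stderr into stdout, like piping both to one file.
        result = subprocess.run(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        sizes.add(len(result.stdout))
    return sorted(sizes)


# Hypothetical stand-in for the build; replace with the real build command.
STAND_IN_CMD = [sys.executable, "-c", "print('stand-in build output')"]
```

Calling `distinct_output_sizes(STAND_IN_CMD, runs=3)` should give a single
size when every run behaves the same; more than one size means the runs
diverged. It only touches the memory the build happens to use, which is
the limitation raised above.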

> >> Regarding the CPU, pretty much the same.... Well... you can actually
> >> run programs like cpuburn, but those are going to put your CPU at 0%
> >> idle, which is something you don't want on a production server.
> >
> >nice(1).
> and you may well learn the burden continuous context switching puts on
> your machine...

Oh yes.. 
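
For reference, the nice(1) suggestion could look roughly like this when
driven from a script. A minimal sketch, assuming a Unix-like system:
`run_niced` is a hypothetical helper name, and the child command is a
stand-in for a real burn-in tool such as cpuburn.

```python
import os
import subprocess
import sys


def run_niced(cmd, increment=19):
    """Run cmd with its niceness raised by `increment` (a higher value
    means lower scheduling priority) and return the CompletedProcess."""
    return subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        # Runs in the child just before exec, so only the child is reniced.
        preexec_fn=lambda: os.nice(increment),
    )


# Stand-in load: a child that just reports its own niceness.
REPORT_NICENESS = [sys.executable, "-c", "import os; print(os.nice(0))"]
```

`run_niced(REPORT_NICENESS).stdout` shows the niceness the child ended up
with. Lower priority lets production work win the CPU, though it does not
remove the context-switching cost mentioned above.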

> So how can I test/salvage? My guess is it's the south bridge, but short
> of investing $30 in Arctic Silver glue to see if the problem goes away
> (which I doubt because that chip doesn't really get hot), I'm not sure
> how to tell, ditto for the cpu, don't want to replace if it's not broke
> and that's got a nice new fan on it... so what is broke? and how can I
> tell?
> (I don't expect a useful answer here...)

Ok. :)
