[nycbug-talk] Runaway cron server
Matt Juszczak
matt at atopia.net
Sat Oct 31 21:01:50 EDT 2009
Hi all, Happy Halloween...
I've been having some issues with a runaway cron server. We've got cron jobs
set up, and I'm using a locking system to make sure no job runs
overlapping another (though this problem was occurring prior to the
locking system being put in place). After a day or two, our server load
spikes, the cron jobs stop working, and top shows:
6702 root 1 97 0 16292K 3312K RUN 3 13:50 4.79% cron
65338 root 1 96 0 16328K 3324K RUN 3 138:23 4.59% cron
69837 root 1 96 0 16328K 3324K RUN 3 116:05 4.59% cron
90642 root 1 96 0 16328K 3324K CPU2 2 37:39 4.59% cron
65729 root 1 96 0 16328K 3324K RUN 3 136:01 4.49% cron
79591 root 1 96 0 16328K 3324K RUN 0 80:51 4.49% cron
85363 root 1 96 0 16328K 3324K RUN 0 64:42 4.49% cron
90625 root 1 96 0 16328K 3324K CPU0 0 51:58 4.49% cron
82872 root 1 96 0 16328K 3324K RUN 3 50:16 4.49% cron
83551 root 1 96 0 16292K 3312K RUN 3 49:13 4.49% cron
80016 root 1 96 0 16328K 3324K RUN 1 79:37 4.39% cron
85758 root 1 96 0 16292K 3312K RUN 0 63:36 4.39% cron
90284 root 1 96 0 16328K 3324K RUN 2 52:45 4.39% cron
61636 root 1 96 0 16328K 3324K RUN 2 171:26 4.30% cron
And even more info:
s505# ps auxw | grep cron | wc
105 1464 10026
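A note on the count above: a plain `grep cron` can also match the grep process itself, along with any other command line that happens to contain "cron", so the figure may be slightly high. A minimal sketch of a stricter count follows. The function name `count_cron` is hypothetical and not part of the original post.

```sh
#!/bin/sh
# count_cron: hypothetical helper. Reads ps-style lines on stdin and prints
# how many of them mention cron.
# Writing the pattern as [c]ron means grep's own command line (which then
# contains the literal text "[c]ron") does not match the pattern.
count_cron() {
    grep -c '[c]ron'
}
```

It would be used as `ps auxw | count_cron`. Note that `grep -c` exits non-zero when the count is 0, which matters if the call sits in a script running under `set -e`.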
If I try to truss or ktrace one of the processes, it returns no output.
This behavior is reliable and occurs every single time. I'll restart the
cron server, and things will run fine for a little while, but will then
get to this point again.
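The locking system mentioned earlier is not shown in this post. For readers following along, here is a minimal sketch of one common approach, assuming a lock directory created with `mkdir` (which is atomic). The lock path and the function names `acquire_lock`, `release_lock` and `run_locked` are hypothetical, and this is not necessarily how the setup described here works.

```sh
#!/bin/sh
# Hypothetical lock location; override by setting LOCKDIR before calling.
LOCKDIR="${LOCKDIR:-/tmp/example-cron-job.lock}"

# acquire_lock: succeeds only for the first caller, because mkdir fails
# if the directory already exists.
acquire_lock() {
    if mkdir "$LOCKDIR" 2>/dev/null; then
        # Record the owner's PID to help spot stale locks by hand.
        echo "$$" > "$LOCKDIR/pid"
        return 0
    fi
    return 1
}

# release_lock: remove the lock directory and its contents.
release_lock() {
    rm -rf "$LOCKDIR"
}

# run_locked: run the given command only if the lock is free.
# Returns the command's exit status, or 75 if the lock was already held.
run_locked() {
    if acquire_lock; then
        "$@"
        status=$?
        release_lock
        return "$status"
    fi
    echo "skipped: lock held at $LOCKDIR" >&2
    return 75
}
```

A crontab entry would then call a wrapper script that sources these functions and runs something like `run_locked /path/to/job`. One limitation of this sketch: if the job is killed before `release_lock` runs, the lock directory stays behind and later runs are skipped until it is removed. FreeBSD's lockf(1) utility offers comparable functionality without a wrapper.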
Any ideas? I'm really stuck.
-Matt