[nycbug-talk] Stablity fixed??? Maybe???
ike at lesmuug.org
Mon Jul 4 17:03:06 EDT 2005
Your comment made me want to share an experience from when I worked
at a web hosting company.
On Jul 3, 2005, at 11:18 PM, Matt Juszczak wrote:
>> so you where not monitoring the system while you where doing the
>> you could also try running some combo of script+screen or syslog to
>> get data of system resource usage/interrupt usage etc. this may give
>> us more info into what exactly is going on your system when it stops
>> responding...has the box wedged or are there just a ton of requests
>> that are starving resources like sshd from responding...
> I think its the ton of requests response ... the box hasn't done it
> since I added some changes into postfix's config.
This makes me wonder: could your server be under deliberate,
malicious attack, with postfix choking as a result?
Here's why I come to that conclusion:
At the small web-hosting company where I used to work, we had over
1,000 hosted domains- lots of activity.
Mail became our worst nightmare (and time/money waster), because it
was most frequently the target of spacker attacks. (Wired made
up that word, and it fits: http://www.encyclopedia-online.info/Spacker)
There's serious money providing incentive for all the nastiest system
crackers to own boxes for the spammers...
Anyhow, we dealt with attacks of all shapes and sizes, but one seems
relevant to your situation here:
Email Subject string-overflow attacks. The attacker was sending
massive volumes of spam (a DDoS through botnets) to our box, each
message with over 256 characters in the Subject header.
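For anyone who wants to check their own incoming mail for this pattern, here is a minimal sketch in Python of a Subject-length check. It is not what we ran back then; the function name `subject_too_long` and the constant `SUBJECT_LIMIT` are made up for illustration, and the 256-character threshold is simply the figure from the attack described above.

```python
from email import message_from_bytes
from email.policy import compat32

# Assumption: 256 is the threshold from the attack described above,
# not a limit defined by any mail standard.
SUBJECT_LIMIT = 256


def subject_too_long(raw_message: bytes, limit: int = SUBJECT_LIMIT) -> bool:
    """Return True if the message's Subject header exceeds `limit` characters."""
    msg = message_from_bytes(raw_message, policy=compat32)
    subject = msg.get("Subject", "")
    # Under the compat32 policy a folded header keeps its line breaks;
    # drop them so continuation lines cannot hide the real length.
    unfolded = "".join(str(subject).splitlines())
    return len(unfolded) > limit


if __name__ == "__main__":
    sample = b"From: a@example.com\nSubject: " + b"A" * 300 + b"\n\nbody\n"
    if subject_too_long(sample):
        print("suspicious: overlong Subject header")
```

A check like this could sit in a filter in front of the delivery stage, or be run over captured messages when hunting for the cause of a hang.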
RESULT OF ATTACK:
Cyrus was our MTA of choice then, with Exim as the LTA (on FreeBSD
4.x back then). This attack was aimed at an esoteric flaw in the
mechanism Cyrus uses to hand off messages to Exim for local delivery:
a queue which used BerkeleyDB would explode, and the MTA would hang
and die, without much to go on in the logs.
INTERPRETED AIM OF ATTACK:
There were other reports that attackers were performing this attack
to try to get us to bring a fresh Cyrus box online to replace it,
hoping to take advantage of the fresh box while we were still
configuring it- and attempt to root the system altogether. This exact
situation happened to another ISP, and they were effectively
blackmailed- as the attacker didn't care whether email came in or out
for all the user accounts while they spammed (and any ISP's clients
sure do!!!). In their case, they had to take down the mail servers
and rebuild them offline, while dealing with a ton of support calls
from angry customers who wanted their email.
To find this problem, a few troubleshooting methods (out of many
things) were valuable:
ktrace was used to finally find that the MTA was locking the server
(we had no core dumps or anything else to go from).
Network sniffers (ettercap, back then) were used to scan all incoming
mail from a neighboring box, and we found the extra-long subject
headings.
Google was employed to find some esoteric notes from others who'd
faced similar attacks, and we contacted them immediately. They shyly
gave us info about how they resolved their attack, who attacked them,
etc., and that helped us out a lot.
We ripped out the part of Cyrus that talks to the LTA, the stuff which
uses BerkeleyDB, and replaced it with a different embedded DB
(skiplist), which was modified to mitigate this and a number of
other problems- and I believe that the Cyrus folks integrated a fix
in a later release.
All of that, just for the spackers.
Anyhow, just thought I'd share the story; it may help you, Matt, or
help someone in the future. Mail is the roughest stuff to manage on
the internet now IMHO- most attacked, most important... (which is why
I now personally like to hand it to mail-specific hosting vendors, so
I can focus on what I do... :) )