[nycbug-talk] network strangeness (resource starvation?)
Jim Brown
jpb
Sun Jul 31 10:17:55 EDT 2005
* Charles Sprickman <spork at bway.net> [2005-07-30 22:45]:
> Hey all,
>
> I've pursued this on other lists for a few years, and it's getting under
> my skin more and more. Maybe someone here can give me some pointers...
>
> Have a host (FBSD 4.9) that does lots of dns work - both queries from
> outside and other hosts inside doing a ton of lookups.
>
> We have run up against a few hurdles and cleared them. First it was
> ipfilter running out of state entries. Upped the size in "ip_state.h" in
> the ipfilter includes, and that helped. Eventually we hit another wall,
> so we relaxed the ipfilter rules to make them work for inbound/outbound
> without generating state entries. Since then, no problems reported by
> "ipfstat -s" that would indicate we're running out of resources there.
>
> One of the ongoing symptoms is that ssh sessions to the box will start
> *dropping* characters when udp traffic is really high. Even after we
> solved the problem of outgrowing the state table, the problem still
> remains.
>
> We've bumped a number of things, nmbclusters is way up there, and netstat
> -m shows that we're not hitting a peak there. However looking at full
> "netstat -s" stats after the box only being up for less than 12 hours
> shows this:
>
> 8297 dropped due to no socket
> 0 broadcast/multicast datagrams dropped due to no socket
> 31 dropped due to full socket buffers
>
> So that's a hint. I can look for whatever obscure sysctl variable to set
> the listen queue deeper. Not sure about the "no socket"...
>
> Lately the newest wrinkle is that the box will just go unresponsive.
> Pingable, but nothing on serial console, no ssh.
>
> So can things getting starved in udp-land cause other networking stuff to
> choke? Any pointers where else to look?
>
> Thanks,
>
> Charles
Hi Charles,
What does 'lots of dns work' mean? Can you give some I/O stats?
What DNS software are you running? What is the hw platform?
How many zones? Avg size of a zone?
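For the drop counters you quoted, it would help to see how fast they grow rather than the totals since boot. A rough sketch in Python, assuming you save the output of something like `netstat -s -p udp` to text at two points in time; the function names here are made up for illustration, and the parsing only assumes counter lines of the form "<number> <description>" as in your paste:

```python
import re

# Matches counter lines such as "8297 dropped due to no socket".
COUNTER_RE = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")


def parse_drop_counters(netstat_text):
    """Return {description: count} for each counter line mentioning 'dropped'."""
    counters = {}
    for line in netstat_text.splitlines():
        match = COUNTER_RE.match(line)
        if match and "dropped" in match.group(2):
            counters[match.group(2)] = int(match.group(1))
    return counters


def drop_deltas(before, after):
    """Return how much each counter grew between two samples."""
    return {name: count - before.get(name, 0) for name, count in after.items()}


if __name__ == "__main__":
    # Sample text copied from the figures quoted in this thread.
    sample = """
        8297 dropped due to no socket
        0 broadcast/multicast datagrams dropped due to no socket
        31 dropped due to full socket buffers
    """
    for name, count in parse_drop_counters(sample).items():
        print(f"{count:>8}  {name}")
```

Taking one sample during a quiet period and another while ssh is dropping characters, then passing both through `drop_deltas`, should show whether the "full socket buffers" count climbs along with the symptom.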
Some general thoughts:
- eliminate all other non-essential services
- re-nice named
- try redesigning your DNS service to include multiple servers
and load balance between them
- BIND (if that's what you are using) is a notorious memory hog.
However, it's still my favorite DNS server.
Increase memory to the limit.
I know it's not what you want to hear, but when I made the
switch from 4.10 to 5.4 I was *impressed* with the performance
under heavy load. I tested using:
X+KDE with multiple konsole sessions:
- multiple FTPs
- two different stress sessions, one memory, one disk
- a loop of 'make buildworld'
- a loop of 'make buildkernel'
- a loop of 'ls -alR /'
- ssh session to remote host
The desktop was still usable, and I didn't lose any characters in ssh
sessions to remote hosts. This was on an IBM T41 with 512 MB.
Best Regards,
Jim B.