[KLUG Members] Tuning qmail

Adam Williams members@kalamazoolinux.org
Wed, 11 Feb 2004 19:20:58 -0500


> I take care of a few qmail servers, one of which is my work's primary mail 
> server. It is a dual 700 MHz PIII with 1 GB of RAM, two 9 GB SCSI drives 
> and another pair of 72 GB SCSI drives in a RAID-1 format (software).

You've set "noatime" on the mail volume, right?  If not, that will help
the I/O load a lot.
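
One way to confirm the option is actually in effect is to look at the
options column of /proc/mounts.  A minimal sketch, assuming the mail
volume is mounted at /var/qmail (that path and the sample text are made
up for illustration, substitute your own):

```python
# Sketch only: SAMPLE and the /var/qmail mount point are hypothetical.
SAMPLE = """\
/dev/md0 / ext3 rw 0 0
/dev/md1 /var/qmail ext3 rw,noatime 0 0
"""


def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            return fields[3].split(",")
    return None  # mount point not found


def has_noatime(mounts_text, mount_point):
    """True if mount_point is listed and carries the noatime option."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "noatime" in opts


if __name__ == "__main__":
    for mp in ("/", "/var/qmail"):
        print(mp, "noatime" if has_noatime(SAMPLE, mp) else "atime updates on")
```

On a live box you would feed it open("/proc/mounts").read() instead of
the sample text.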

> Since the advent of mydoom, load on the poor thing hasn't dropped below 4.  
> It's rejected 30k messages already because of mydoom, and since it doesn't 
> appear to be letting up, 

Beautiful, isn't it?

> I need to find a way to make the server *usable* 
> again.  Spamassassin, qmail-scan, 

Yep, did someone think executing hundreds of lines of perl for every
mail message was a good idea?  Just dumb.  Hello, my name is "C"!

Ditch all that complicated crap.

Set up -
1.) SpamCop, support for this is built into almost all MDA/MTAs.  All it
takes is a DNS query per message.  We've found this blocks almost all
spam, and anything that comes through can be easily reported.  This is
Morrison's only anti-spam mechanism and we get almost none.
2.) noattach - this is a simple milter written in, you guessed it -
"C"!  It simply bounces out messages with attachments of certain types
(.pif, .scr, etc... - or whatever you set).
3.) clamav - this is an anti-virus tool - both the scanning daemon (yes
a daemon, no forking processes per message) and the milter are written
in - yes! - "C".  Fast.  1 & 2 cream most bogus messages, so clamav only
has to scan what is probably a legit message.
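
To show why item 1 costs only a DNS query: a DNS blocklist check
reverses the octets of the connecting IP, appends the list's zone, and
looks that name up.  A rough sketch, assuming the bl.spamcop.net zone
and treating any lookup failure as "not listed" (your MTA's built-in
support does this for you, this is just the idea):

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="bl.spamcop.net"):
    """Build the DNS name a blocklist check looks up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="bl.spamcop.net"):
    """True if the blocklist returns an A record for this address.

    Assumption: any resolver error (including a network failure) is
    treated as "not listed", so mail is not rejected when DNS is down.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address, used only to show the name.
    print(dnsbl_query_name("192.0.2.1"))
```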

#2 moved us from scanning/rejecting 3000+ messages a day to ~20.  And
anti-virus scanning is resource intensive.
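
The check in #2 is cheap because it only looks at attachment file
names, never the contents.  This is not noattach's code, just a sketch
of the same idea using Python's standard email package; the blocked
list and the sample message are made up:

```python
import os
from email import message_from_string
from email.message import EmailMessage

# Hypothetical block list, set to whatever extensions you want to bounce.
BLOCKED = {".pif", ".scr"}


def blocked_attachments(raw_message, blocked=BLOCKED):
    """Return the attachment file names whose extension is in blocked."""
    msg = message_from_string(raw_message)
    hits = []
    for part in msg.walk():
        name = part.get_filename()
        if name and os.path.splitext(name.lower())[1] in blocked:
            hits.append(name)
    return hits


def sample_message(filename=None):
    """Build a small test message, optionally with one attachment."""
    msg = EmailMessage()
    msg["Subject"] = "test"
    msg.set_content("hello")
    if filename:
        msg.add_attachment(b"MZ", maintype="application",
                           subtype="octet-stream", filename=filename)
    return msg.as_string()


if __name__ == "__main__":
    print(blocked_attachments(sample_message("doc.pif")))
    print(blocked_attachments(sample_message()))
```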

> and the software RAID pretty much 
> guarantee a sluggish system.

On a post-MMX processor the load introduced by software RAID is really
pretty negligible.

> Is there anything I can do to speed things up?  Some people say 'cut down 
> on your concurrent connections', but I'm rarely using more than 10 at a 
> time.   I don't know what else I can do beyond that, but man... The poor 
> thing is crushed from I/O...

First - "noatime".

You're using ext2/3?  How full is it?  After ~40% capacity ext2/3 starts
to lose performance.  Is it too radical to move to XFS?
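
A quick way to answer the "how full" question from a script, as a
sketch.  The 40% figure is just the rule of thumb above and the path
is whatever your mail volume is ("/" here only so it runs anywhere):

```python
import shutil


def percent_used(path):
    """Percentage of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:  # some pseudo-filesystems report zero size
        return 0.0
    return 100.0 * usage.used / usage.total


def past_threshold(path, threshold=40.0):
    """True if the filesystem is fuller than threshold percent."""
    return percent_used(path) > threshold


if __name__ == "__main__":
    path = "/"  # substitute the mail volume's mount point
    print(f"{path}: {percent_used(path):.1f}% used,",
          "past 40%" if past_threshold(path) else "under 40%")
```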

Can you move the journal for the mail volume to another set of physical
disks?  Moving the journal to a different SCSI channel than the
filesystem helps throughput (drives are cheap, and the journal doesn't
need to be very big).

Is /var/log its own volume?  If so, make sure that is "noatime"'d too.

Are these Adaptec SCSI adapters?  If so, have you made sure TCQ (tagged
command queuing) is enabled?

You made sure you set the stride option optimally when you created the
software RAID volumes? (Again assuming ext2/3).
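
For reference, the usual arithmetic for stride is the array's chunk
size divided by the filesystem block size.  A tiny worked sketch; the
64 KiB chunk and 4 KiB block are example numbers, not a recommendation,
and it only means something on an array that actually has a chunk size:

```python
def ext_stride(chunk_kib, block_kib=4):
    """Stride in filesystem blocks: chunk size / block size."""
    if chunk_kib <= 0 or block_kib <= 0:
        raise ValueError("sizes must be positive")
    if chunk_kib % block_kib:
        raise ValueError("chunk size should be a multiple of the block size")
    return chunk_kib // block_kib


if __name__ == "__main__":
    # Example numbers only: 64 KiB chunks, 4 KiB blocks.
    print(ext_stride(64, 4))
```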

If you're already using XFS, have you tried increasing the "logbufs"?

You have nscd enabled?

Do you process a lot of large messages?  If so, maybe you want to try
cranking up net.core.wmem_default & net.core.wmem_max, assuming you have
sufficient RAM.
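
Before changing those, it helps to see what they are set to now.
Sysctl names map onto files under /proc/sys with the dots turned into
slashes, so a read-only sketch looks like this (it returns None where
the file is missing or unreadable, e.g. inside some containers):

```python
def sysctl_path(name):
    """Map a sysctl name like net.core.wmem_max to its /proc/sys file."""
    return "/proc/sys/" + name.replace(".", "/")


def read_sysctl(name):
    """Return the current integer value, or None if it cannot be read."""
    try:
        with open(sysctl_path(name)) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


if __name__ == "__main__":
    for key in ("net.core.wmem_default", "net.core.wmem_max"):
        print(key, "=", read_sysctl(key))
```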

Increase the number of dirty pages permitted between bdflush
iterations.  This will make flushing dirty cache blocks to disk more
bursty and thus more efficient - this especially helps on SCSI systems
where controllers and disks can intelligently reorder operations.
Handing them operations in large chunks lets them flex this intelligence
better than just constantly dribbling operations out to them.

Disable fsync() operations in syslog on the maillog (usually by prefixing
/var/log/maillog in /etc/syslog.conf with a "-" character).  This can
also be a BIG win.  You probably don't want to do this on
/var/log/messages because you will potentially lose crash-related
information.
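
To make the "-" prefix concrete, here is a small sketch that rewrites
one syslog.conf line.  It assumes the classic two-column
"selector  action" layout and the /var/log/maillog path mentioned
above; the sample lines are made up, so check your own file before
editing anything:

```python
def disable_sync(line, log_path="/var/log/maillog"):
    """Prefix log_path with '-' on a 'selector action' line.

    Comments, blank lines, other targets, and lines that already carry
    the '-' prefix are returned unchanged.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return line
    parts = line.split()
    if len(parts) == 2 and parts[1] == log_path:
        idx = line.rfind(log_path)  # keep the original whitespace
        return line[:idx] + "-" + line[idx:]
    return line


if __name__ == "__main__":
    # Hypothetical syslog.conf lines for illustration.
    for sample in ("mail.*\t\t\t/var/log/maillog",
                   "*.info;mail.none\t/var/log/messages"):
        print(disable_sync(sample))
```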