[KLUG Members] Sendmail, LDAP, and solaris
Adam bultman
adamb at glaven.org
Fri Oct 27 23:49:33 EDT 2006
Good evening, Kluggers.
I have two issues that I would like input on. They are both related as
they both deal with LDAP.
Item 1: Solaris LDAP native problems
I have a few mail servers that are "LDAP Native", which is to say, they
have been configured to access LDAP servers for username, password, and
group information. If the user isn't in /etc/passwd, it'll query LDAP.
Great fun, and not that difficult to do.
These mail servers are configured with sendmail, procmail, and popper,
all of which query LDAP when doing anything. Sendmail is not configured
to be 'LDAP aware' (which I will explain later) but is querying Solaris,
which is then querying the LDAP server. Whenever there is a heavy
amount of traffic to the servers, LDAP will get sluggish on the server.
Users will be able to get their mail via POP, which requires LDAP, and
sendmail will process mail, which requires LDAP, but it's not very
'zippy'. For example, if you perform a "ps -ef" on the server, it will
happily list processes until it gets to a UID it needs to query LDAP for
- and it'll hang for a second or 5 until it gets a response - and then
it'll be happy until it hits the next LDAP user, etc. I've started
doing "ps -ef &" so that I don't have a hung terminal. I'm then free to
do other things while the results slowly poke by.
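
A minimal sketch for putting numbers on the "ps -ef" stall described above, assuming Python is available on the mail server. It times individual UID lookups through the system name service, which is the same path "ps" takes when it turns a UID into a username. The function names and the sample UID 5001 are made up for illustration.

```python
import pwd
import time


def time_uid_lookup(uid):
    """Resolve one UID through the system name service and time the call.

    Returns (name_or_None, elapsed_seconds). On an "LDAP native" host the
    lookup can end up at the directory server, so a sluggish LDAP server
    should show up here as a large elapsed time.
    """
    start = time.perf_counter()
    try:
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        name = None  # no passwd entry for this UID in any configured source
    elapsed = time.perf_counter() - start
    return name, elapsed


def slow_lookups(uids, threshold=1.0):
    """Return [(uid, name, elapsed)] for lookups slower than threshold seconds."""
    slow = []
    for uid in uids:
        name, elapsed = time_uid_lookup(uid)
        if elapsed > threshold:
            slow.append((uid, name, elapsed))
    return slow


if __name__ == "__main__":
    # UID 0 is normally local; 5001 is a hypothetical UID that would only
    # exist in the directory. Substitute real UIDs from the environment.
    for uid in (0, 5001):
        name, elapsed = time_uid_lookup(uid)
        print(f"uid={uid} name={name} took {elapsed:.3f}s")
```

If a name service cache daemon is running, repeated lookups of the same UID may be answered from its cache, so the first lookup of each UID is the interesting one.
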
Under extreme load, a few things happen:
1. The LDAP servers get a lot of (1000+) simultaneous connections that do
not go away - and are connections from the mail servers. Kill -9 slapd
will release the connections, but usually corrupts the LDAP database.
Load will be 'high' on the LDAP server, but not even close to what
you'd expect for a wedging server.
2. The mail servers' load goes up to around 30, mail will stop, and the
operating system will complain about unavailable LDAP servers.
3. The mail server, as a result, stops responding altogether and must be
rebooted from the power strip (the console, if you can believe it, even
works horribly - waiting up to 10 minutes for a 'return' to actually
show up. Seriously.)
I've had it happen twice where the mail server simply died; stopped
responding on all ports (even SSH) and only pings did anything. They
still dutifully logged to the syslog server, although that's not of
much use. Even more unbelievable, the mail servers (which are in
actuality "zones" running on the server) have managed to crash the
global zone - which isn't supposed to happen. That required a power
cycle of the whole box, which stops any other zones that happen to be on
the server. Bummer. You wouldn't think this should happen. We have
four LDAP slaves; one is an x86 Dell (running Solaris 10) and the others
are Netras (240) that have more than enough horsepower to handle LDAP
queries. The configuration looks fairly default, but I can't get in at
the moment to look for anything glaring. I cannot figure it all out.
Item number 2: Sendmail
Sendmail was believed to have been configured with LDAP so that sendmail
itself would query the LDAP servers for user information. However, this
turned out not to be true, as I dug into the configurations rather
deeply today and proved what I thought was the case: That sendmail was
ignoring any LDAP options in the CF file, and was simply querying the
operating system for user information (For those wondering:
LDAPDefaultSpec, ldapmh, and ldapmr(a?) were configured with hosts,
ports, and baseDNs, but no bindDN or bind password. Go figure. And no,
I didn't configure it.)
So, I've been trying to configure sendmail to be LDAP-aware - for real
this time. I have the LDAP book, the sendmail book, and the sendmail
companion book - so any questions I've had have been quickly answered
there. I have a test box set up with two LDAP zones and a sendmail zone
on it, and syslog shows that indeed, the sendmail server is querying the
LDAP servers whenever it needs to look up a user that isn't local.
My problem is that although I've proven that my test user exists in
ldap, and that a search against the LDAP database with the *exact*
filter and search terms as sendmail is *allegedly* using results in the
user's profile being pulled from LDAP, sendmail always gives me a "user
unknown".
Right now, the ldapmh and ldapmr(a?) (btw, I can't remember off the top
if there's an A in there, as I don't have the server configs or the
books in front of me - all at my desk at work) have the 'defaults',
except for my additional crap - baseDN, bindDN, host, password.
Using telnet, I try to deliver mail to the server. It takes any mail
from: user that is in the domain, but any time I try to send as a user
that exists in LDAP, it throws the user unknown error. And I don't know
if it's because the LDAP Routing maps are wrong, or if there's something
simply 'missing'. I've googled and tested for *hours* trying to see if
I had capitalization wrong, or if I had some other thing wrong, or a
config option wrong, etc - so I'm frustrated. The cause of my current
frustration is those LDAP maps - which use "%0" and "%s" in them. I can
find "documentation" on them - that is to say, examples all over the web
that talk about the default LDAP maps, but *nothing* that explains what
the hell %0 and %s are. I've looked through the sendmail books, and the
LDAP book, and haven't figured it out. The only thing I can think is
that %0 and %s are not what they should be - and that's why sendmail
isn't able to find any users in LDAP.
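
The telnet test described above can be scripted so it is repeatable while the maps are being changed. This is a minimal sketch using Python's standard smtplib; the function name, host, and addresses are hypothetical, and it only probes the RCPT TO step rather than delivering anything.

```python
import smtplib  # standard library; used when opening a real session


def probe_recipient(smtp, sender, recipient):
    """Issue MAIL FROM and RCPT TO on an open SMTP session, then reset.

    `smtp` is an smtplib.SMTP object (or anything with the same ehlo,
    mail, rcpt and rset methods). Returns (accepted, code, text).
    """
    smtp.ehlo()
    smtp.mail(sender)
    code, text = smtp.rcpt(recipient)
    smtp.rset()  # abandon the transaction so no message is queued
    if isinstance(text, bytes):
        text = text.decode("ascii", "replace")
    # 250 and 251 are the reply codes that mean the recipient was accepted.
    return code in (250, 251), code, text


# Example use against a test server (hypothetical host and addresses):
#
#   with smtplib.SMTP("mail.domain.com", 25, timeout=10) as session:
#       print(probe_recipient(session, "tester@domain.com",
#                             "ldapuser@domain.com"))
```

A "user unknown" rejection would come back as accepted=False together with the server's reply code and text, which can be lined up against the syslog entries for the same attempt.
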
I have sendmail loglevel all the way up (probably at 255, I can't
remember exactly right now) and LDAP is at the same level - 255. I can
watch the logs while I test commands, and I get 'fun' little errors
talking about fd 12 and errno=11, but nothing definitive. Looking up
the errors has netted me nothing useful. Despite the debug, LDAP
doesn't tell me what the queries are (it lists the baseDN, though) and
sendmail doesn't do much for me either, just that it's asking LDAP and
that the user isn't found.
I've messed with the LDAP user's profiles, but everything looks kosher.
mailHost is the local server, (mail.domain.com), the objectClasses are
correct, the user's addresses (Routing address and localaddress or
whatever) are both the same - local to that machine. In theory,
sendmail should say, "Yes, that's a local user, let me deliver it"
but it's not working. And I cannot figure it out. I'm wondering if %0
and %s are red herrings; I can't find anything as to what those should
be, so I'm wondering if they are just the email or something listed in
the rcpt to: command.
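
One way to test that guess is to expand the filter by hand and run the result through an LDAP search tool. The sketch below rests on an assumption that is not confirmed anywhere in this post: that sendmail replaces %s in an LDAP map filter with the lookup key as given, and %0 with the same key after backslash-escaping the LDAP filter metacharacters. The function name and the sample filter are illustrative only and should be replaced with the filter from the actual CF file.

```python
def expand_filter(template, key):
    """Expand an LDAP map search-filter template for one lookup key.

    Assumed behaviour (not verified against sendmail itself):
      %s -> the key exactly as given
      %0 -> the key with the characters * ( ) and backslash each
            prefixed by a backslash
    """
    escaped = "".join("\\" + ch if ch in "*()\\" else ch for ch in key)
    out = []
    i = 0
    while i < len(template):
        pair = template[i:i + 2]
        if pair == "%s":
            out.append(key)
            i += 2
        elif pair == "%0":
            out.append(escaped)
            i += 2
        else:
            out.append(template[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Sample filter for illustration; use the one from the real map.
    template = "(&(objectClass=inetLocalMailRecipient)(mailLocalAddress=%0))"
    print(expand_filter(template, "testuser@domain.com"))
```

If the key turns out to be the full address from the rcpt to: command, the attribute being matched would have to hold the full address as well, not just the local part. That is worth checking against the test user's entry before blaming %0 and %s.
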
The two problems are somewhat related - I'm wondering if sendmail
querying the LDAP servers would be 'lighter' than Solaris querying the
LDAP servers - but I need to figure out #2 before I can really test or
prove #1.
Any ideas would be appreciated; I'm going nuts over this. It's like a
chigger under my skin.
TIA,
Adam