[KLUG Members] Attempting to fix a server
Adam Bultman
adamb at glaven.org
Wed Aug 25 17:17:31 EDT 2004
Last week, I had a server tip over on me. I believe that the SCSI card
had pretty much up and died, and as a result we had some filesystem
corruption and a machine that could not boot.
At the time, the system would start to boot, get partway through, then
reboot and start over again.
Since then, the SCSI card has been replaced, and we are working on
mopping things up. So far, I've fixed the filesystems, made sure that
things are bootable, and have been trying to get things to boot normally.
At one point, it would fail when you tried to log in - the
error was something to the tune of 'unable to open the password store'.
However, that error has since gone away, and now the system will
boot most of the way, then fail. Depending on the kernel you choose,
it will either boot to a login prompt but fail any attempts at login,
start booting and freeze with a getpwnam error, or get kinda confused
on bootup.
The only thing I can get it to do with any regularity is to boot into
single user mode. In single user mode, I can check the filesystems and
make sure things are all there, and I've removed any scripts I don't
need running that have been causing problems (nfs, nfs mounts in fstab,
network), but I can't seem to crack this nut. I can't figure out why it
will boot but not have a list of usernames, and then kind of fail
silently. I'm guessing that my /etc/passwd is tanked, and that I'll
need to rebuild it somehow - does that sound like a reasonable guess?
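One way to test that guess from single user mode would be a quick
sanity check of /etc/passwd. Below is a minimal sketch in Python 3,
assuming the usual layout of seven colon-separated fields per line
(name, password, UID, GID, GECOS, home, shell). The function name
check_passwd is made up for illustration, and having Python on the box
at all is an assumption.

```python
def check_passwd(lines):
    """Return a list of (line_number, problem) for malformed passwd entries.

    `lines` is any iterable of strings, e.g. an open file object.
    This only checks the format; it does not verify that home
    directories or shells actually exist.
    """
    problems = []
    seen = set()  # user names already encountered
    for num, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            problems.append((num, "blank line"))
            continue
        fields = line.split(":")
        if len(fields) != 7:
            problems.append((num, "expected 7 fields, got %d" % len(fields)))
            continue
        name, _password, uid, gid = fields[:4]
        if not name:
            problems.append((num, "empty user name"))
        elif name in seen:
            problems.append((num, "duplicate user name %r" % name))
        seen.add(name)
        # UID and GID are expected to be plain decimal numbers.
        if not uid.isdigit():
            problems.append((num, "non-numeric UID %r" % uid))
        if not gid.isdigit():
            problems.append((num, "non-numeric GID %r" % gid))
    return problems
```

Calling check_passwd(open("/etc/passwd", errors="replace")) and printing
the result would list any lines that don't parse. An empty list only
says the format looks sane, not that the contents are right, and
NIS-style entries starting with + or - would be flagged even though
they can be legitimate.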
The server's job has since been taken over by another, but I'd like to
find out what went wrong on this thing, so at least I know what to check
for in the future.
Adam