[KLUG Members] IBM ServeRAID problems

Adam Williams members@kalamazoolinux.org
19 Dec 2001 13:22:15 -0500


>This is a response to a response I got about 3 weeks ago.  I've had no time
>to look into the details until very recently. I checked /proc/pci and
>/proc/interrupts and noticed that the onboard scsi controller and the
>ServeRAID controller share an IRQ (this is after various fiddlings with
>the BIOS):
>$ cat /proc/interrupts
>           CPU0       
>  0:    4718026          XT-PIC  timer
>  1:       1821          XT-PIC  keyboard
>  2:          0          XT-PIC  cascade
>  5:          0          XT-PIC  es1370
>  6:         43          XT-PIC  floppy
>  8:     143799          XT-PIC  rtc
> 10:      18285          XT-PIC  eth0
> 11:      55523          XT-PIC  aic7xxx, ips
> 12:       4969          XT-PIC  PS/2 Mouse
>NMI:          0 
>ERR:          0
>(It doesn't look like there are any memory address conflicts of any kind). 

Ok, nothing falls close?  My memory conflict was non-obvious.  The
cyclades port controller and the IPS had different **base** addresses but
the cyclades I/O slice was 16k, so it **overlapped** onto the IPS's
address space.  Took some very nasty BIOS fiddling to get around it.  
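
To make that kind of overlap easier to spot, here is a minimal Python
sketch of the check.  The base addresses and the IPS window size below are
made up for illustration (only the 16k cyclades slice comes from my setup),
and ranges_overlap is just a helper name I picked, so substitute whatever
your own /proc/pci or BIOS screens report.

```python
def ranges_overlap(base_a, size_a, base_b, size_b):
    """Return True if [base_a, base_a + size_a) intersects [base_b, base_b + size_b)."""
    return base_a < base_b + size_b and base_b < base_a + size_a


# Hypothetical values: distinct base addresses, but a 16k window on the
# first card reaches past the second card's base.
CYCLADES_BASE = 0xD0000          # assumed, not from the original system
CYCLADES_SIZE = 16 * 1024        # the 16k I/O slice
IPS_BASE = 0xD2000               # assumed, not from the original system
IPS_SIZE = 8 * 1024              # assumed window size

if ranges_overlap(CYCLADES_BASE, CYCLADES_SIZE, IPS_BASE, IPS_SIZE):
    print("conflict: windows overlap even though the base addresses differ")
else:
    print("no overlap")
```

The point is that comparing base addresses alone is not enough; you have
to compare base plus size for every pair of devices.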

Don't you LOVE the Intel/PC platform?  So incredibly robust and well
designed.....

>I read somewhere that a shared IRQ might cause hangs (even though in theory 
>it shouldn't; the problems I am having always occur during I/O

You're right, PCI should be able to share IRQs.  Depending upon chipsets
and how the specific drivers are coded, in actuality, it may or may not
work.  It also doesn't guarantee the best performance.
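
If you want to check for shared IRQs without eyeballing the table, a rough
Python sketch like the one below would do it.  It assumes the simple
layout shown in your output (IRQ number, per-CPU counts, controller type,
then a comma-separated device list); other kernels may lay out
/proc/interrupts differently.  The function name shared_irqs is my own
invention, and the sample text is a few lines copied from your listing.

```python
def shared_irqs(text):
    """Map IRQ number -> device list for every IRQ claimed by 2+ devices."""
    shared = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip the CPU header and the NMI/ERR rows
        fields = rest.split()
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1  # step over the per-CPU interrupt counts
        # fields[i] is the controller type (e.g. XT-PIC); devices follow it
        devices = [d.strip() for d in " ".join(fields[i + 1:]).split(",")]
        devices = [d for d in devices if d]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


SAMPLE = """\
           CPU0
  0:    4718026          XT-PIC  timer
 10:      18285          XT-PIC  eth0
 11:      55523          XT-PIC  aic7xxx, ips
 12:       4969          XT-PIC  PS/2 Mouse
NMI:          0
"""

print(shared_irqs(SAMPLE))  # {11: ['aic7xxx', 'ips']}
```

On a live box you would feed it open("/proc/interrupts").read() instead of
the sample string.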

>intensive operations like tape backups or opening large files, etc.). 
>The root partition is mounted on a disk on the onboard controller, and
>all other mounts (including /home) are on the ips controller.  Most
>peripherals (including tape, CDROM) are on the onboard controller.  So
>it looks to me like this is a good candidate for the source of the
>problems (they always occur when both controllers are doing intensive
>operations).  Unfortunately, I haven't been able to test my theory,

Yep, you probably develop a nice race condition on a spin-lock/mutex
somewhere.

>since I can't seem to set separate IRQs for aic7xxx and ips no matter
>what I do.  I can set the ips IRQ from the BIOS, but whatever I set it
>to, aic7xxx follows (or doesn't get loaded at all).  Am I missing
>something here?  Or is this even something I should be trying?  Any
>other suggestions?

Have you tried physically moving the card to a different PCI slot? (I'm
serious).  How motherboards allocate resources seems to vary widely. 
Your "on board" SCSI controller may simply be a PCI device.  My
motherboard seems almost to assign certain IRQs to certain slots no
matter what is stuck in them, if a previous one is empty, etc....  Also
some PCI slots can't bus-master,  which you really want on a SCSI and
network controller.  I've noticed the slots nearer to the "top" of the
motherboard (away from any ISA slots) seem to have a "higher" priority.
I'm no expert on PCI (by any stretch of the imagination) but I *know* I
have fixed weird problems by just moving the cards.