[KLUG Members] IBM ServeRAID problems

Mike Slack members@kalamazoolinux.org
Wed, 28 Nov 2001 20:46:22 -0800


I am getting some strange behavior from my IBM ServeRAID card.  Every once in a while the box locks up and the filesystems on the RAID channels become essentially inaccessible.  It seems to happen most often when I am using the DAT drive (also SCSI, but on a separate, on-board SCSI channel) or starting up VMware while other disk-I/O-intensive jobs are running.

Has anyone seen this kind of problem before?  Is it more likely a hardware problem or a problem in the ips driver?  Or something else, like SCSI termination?

TIA,
Mike

-- 
Mike Slack
mike@slacking.org
--
"If we knew what it was we were doing, it wouldn't
be called research, would it?" --Albert Einstein


Here is some system info:

RH 7.2, Kernel 2.4.9-13

-----
$ cat /proc/scsi/ips/1

IBM ServeRAID General Information:

        Controller Type                   : ServeRAID
        IO region                         : 0xe400 (256 bytes)
        Memory region                     : 0xde000 (8192 bytes)
        Shared memory address             : 0xc00de000
        IRQ number                        : 10
        BIOS Version                      : 4.80.26
        Firmware Version                  : 2.25.01
        Boot Block Version                :  96304
        Driver Version                    : 4.80.26 
        Max Physical Devices              : 45
        Max Active Commands               : 32
        Current Queued Commands           : 0
        Current Active Commands           : 0
        Current Queued PT Commands        : 0
        Current Active PT Commands        : 0

-----

$ cat /proc/scsi/scsi 
Attached devices: 
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: QUANTUM  Model: QM39100TD-SW     Rev: N491
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 02 Lun: 00
  Vendor: UMAX     Model: Astra 1200S      Rev: V2.9
  Type:   Scanner                          ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 03 Lun: 00
  Vendor: YAMAHA   Model: CRW4416S         Rev: 1.0f
  Type:   CD-ROM                           ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 04 Lun: 00
  Vendor: HP       Model: HP35480A         Rev: T503
  Type:   Sequential-Access                ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 05 Lun: 00
  Vendor: IOMEGA   Model: ZIP 100          Rev: J.03
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: SERVERAID        Rev: 1.00
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM      Model: SERVERAID        Rev: 1.00
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 02 Lun: 00
  Vendor: IBM      Model: SERVERAID        Rev: 1.00
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 15 Lun: 00
  Vendor: IBM      Model: SERVERAID        Rev: 1.00
  Type:   Processor                        ANSI SCSI revision: 02

-----

An excerpt from /var/log/messages:

Nov 28 18:50:50 linus kernel: st0: Error with sense data: Current st09:00: sense key Unit Attention
Nov 28 18:50:50 linus kernel: Additional sense indicates Not ready to ready change,medium may have changed
Nov 28 18:50:52 linus sshd(pam_unix)[2923]: session closed for user bhettiger
Nov 28 18:55:35 linus kernel: st0: Error with sense data: Current st09:00: sense key Unit Attention
Nov 28 18:55:35 linus kernel: Additional sense indicates Not ready to ready change,medium may have changed
Nov 28 18:58:32 linus kernel: st0: Error with sense data: Current st09:00: sense key Unit Attention
Nov 28 18:58:32 linus kernel: Additional sense indicates Not ready to ready change,medium may have changed
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 70000
Nov 28 19:09:41 linus kernel:  I/O error: dev 08:21, sector 0
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:41 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:41 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:42 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:42 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:42 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:42 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:42 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:42 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:42 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:42 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:09:42 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:09:42 linus kernel: SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 70000
Nov 28 19:09:42 linus kernel:  I/O error: dev 08:21, sector 8
Nov 28 19:09:42 linus kernel: SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 70000
Nov 28 19:09:42 linus kernel:  I/O error: dev 08:21, sector 524408
Nov 28 19:09:42 linus kernel: SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 70000
Nov 28 19:09:42 linus kernel:  I/O error: dev 08:21, sector 2883584
Nov 28 19:09:42 linus kernel: SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 70000
Nov 28 19:09:42 linus kernel:  I/O error: dev 08:21, sector 2905104
Nov 28 19:10:35 linus kernel: (ips0) Resetting controller.
Nov 28 19:11:45 linus kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 0 lun 0
Nov 28 19:11:55 linus kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 0 lun 0
Nov 28 19:11:55 linus kernel: (ips0) ips_issue val [0x101a].
Nov 28 19:11:55 linus kernel: (ips0) ips_issue semaphore chk timeout.
Nov 28 19:11:55 linus kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 0 lun 0
Nov 28 19:11:55 linus kernel: SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 70000
Nov 28 19:11:55 linus kernel:  I/O error: dev 08:21, sector 14012096
Nov 28 19:11:55 linus kernel:  I/O error: dev 08:21, sector 14012104
Nov 28 19:11:55 linus kernel:  I/O error: dev 08:21, sector 14012224
Nov 28 19:11:55 linus kernel:  I/O error: dev 08:21, sector 14012352
Nov 28 19:11:55 linus kernel:  I/O error: dev 08:21, sector 14012096
Nov 28 19:11:55 linus kernel:  I/O error: dev 08:21, sector 14120960
Nov 28 19:11:55 linus kernel:  I/O error: dev 08:21, sector 14121024
Nov 28 19:11:56 linus last message repeated 7 times
Nov 28 19:11:56 linus kernel:  I/O error: dev 08:21, sector 14127624
Nov 28 19:11:56 linus kernel:  I/O error: dev 08:21, sector 14127688
Nov 28 19:11:56 linus last message repeated 7 times
Nov 28 19:11:56 linus kernel:  I/O error: dev 08:21, sector 14127848
Nov 28 19:11:56 linus kernel:  I/O error: dev 08:21, sector 14127912
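
-----

For anyone reading the excerpt above, here is a minimal Python sketch for decoding two of the fields in it.  It assumes the usual Linux 2.4 conventions: "dev 08:21" is major:minor in hex, SCSI disks use block major 8 with 16 minors per disk, and the "return code" is a hex word laid out as driver/host/message/status bytes with host-byte names as in the kernel's scsi.h.  The function names are made up for illustration and are not part of any tool.

```python
# Assumed conventions (Linux 2.4): SCSI disks live on block major 8,
# and each disk owns 16 consecutive minor numbers (0 = whole disk).
SD_MAJOR = 8
MINORS_PER_DISK = 16

# Host-byte values as defined in the kernel's scsi.h (subset).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_dev(dev):
    """Map a 'major:minor' hex pair such as '08:21' to a /dev/sdXN name."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != SD_MAJOR:
        raise ValueError("not a SCSI disk on major 8: %s" % dev)
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "/dev/sd" + chr(ord("a") + disk)
    # Partition 0 means the whole disk, so no suffix in that case.
    return name + (str(part) if part else "")


def decode_return_code(code_hex):
    """Split a hex SCSI result word into its four bytes."""
    code = int(code_hex, 16)
    host = (code >> 16) & 0xFF
    return {
        "status": code & 0xFF,
        "msg": (code >> 8) & 0xFF,
        "host": HOST_BYTE_NAMES.get(host, "unknown (0x%02x)" % host),
        "driver": (code >> 24) & 0xFF,
    }


if __name__ == "__main__":
    print(decode_dev("08:21"))
    print(decode_return_code("70000"))
```

Under those assumptions, the failing device would be the first partition of the third SCSI disk, which is consistent with "host 1 channel 0 id 0" being the first ServeRAID logical drive after the two direct-access devices on scsi0, and the host byte reads as a generic controller-level error rather than a status reported by the drive itself.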