new aic7xxx bug, 2.4.13/6.2.4

new aic7xxx bug, 2.4.13/6.2.4

Post by Justin T. Gibb » Sun, 04 Nov 2001 00:00:12




>> I'm having troubles with the last few revisions of the new Gibbs aic7xxx
>> driver that until now has served me well. This is kernel
>> 2.4.13-preempt-lvmrc4 with the 6.2.4 scsi driver, but I saw it happen on
>> 2.4.12 with 6.2.1. I haven't tried older versions with this CD.

>Something similar here, but I have disks on this SCSI :-(

Actually not that similar.

Quote:>scsi0:A:0: parity error detected in Message-in phase. SEQADDR(0x1b6) SCSIRATE(
>0x0)

Here's your first clue.  Parity errors occur on "broken" SCSI busses.
Your cabling or terminator is bad, you have a bent pin, or just too
much noise in and around your cabling.

Quote:>scsi0: Unexpected busfree in Message-out phase
>SEQADDR == 0x166

When we tried to tell the target about the parity error during message-in
phase, the target was unable to see our message without parity errors.
So, it had to go bus free to terminate the transaction.  It is a bug
here that we didn't properly identify the active transaction and abort
it here due to some overly protective code, but that can be readily
fixed.

Quote:>scsi0:0:0:0: Attempting to queue an ABORT message

Because we failed to abort the command above, the mid-layer
timesout the command and we start recovery.

Quote:>scsi0: Dumping Card State while idle, at SEQADDR 0x7
>ACCUM = 0x33, SINDEX = 0x7, DINDEX = 0x8c, ARG_2 = 0x0
>HCNT = 0x0
>SCSISEQ = 0x12, SBLKCTL = 0x2
> DFCNTRL = 0x0, DFSTATUS = 0x29
>LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
>SSTAT0 = 0x5, SSTAT1 = 0xa
>STACK == 0x3, 0x105, 0x160, 0xe4
>SCB count = 20
>Kernel NEXTQSCB = 19
>Card NEXTQSCB = 19
>QINFIFO entries:  
>Waiting Queue entries:  
>Disconnected Queue entries: 7:5  
>QOUTFIFO entries:  
>Sequencer Free SCB List: 8 11 4 5 10 15 2 3 1 0 9 13 14 6 12  
>Pending list: 5
>Kernel Free SCB list: 7 12 0 4 10 6 9 14 3 1 8 11 15 13 2 18 17 16  
>DevQ(0:0:0): 0 waiting
>DevQ(0:1:0): 0 waiting
>(scsi0:A:0:0): Queuing a recovery SCB
>scsi0:0:0:0: Device is disconnected, re-queuing SCB
>Recovery code sleeping
>(scsi0:A:0:0): Abort Tag Message Sent
>(scsi0:A:0:0): SCB 5 - Abort Tag Completed.
>Recovery SCB completes
>Recovery code awake
>aic7xxx_abort returns 0x2002

And we abort the transaction successfully.

Quote:>scsi0:A:0: parity error detected in Message-in phase. SEQADDR(0x1b6) SCSIRATE(
>0x0)
>Kernel panic: HOST_MSG_LOOP with invalid SCB ff

We get another parity error in a state without a proper connection (probably
on the identify message after a reselection) and some more defensive
programming prevents us from handling it correctly. 8-(

The end result is that you need to fix your SCSI bus to be more reliable.
I will fix the issues you've discovered in parity error and unexpected
busfree handling in the next driver release.  If your bus is working
properly, you shouldn't see these errors.

--
Justin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

1. new aic7xxx bug, 2.4.13/6.2.4

[...]

FYI: I've found a similar result under totally different conditions:
2.4.13-ac7
Vendor: TOSHIBA  Model: DVD-ROM SD-M1502 Rev: 1012
Type:   CD-ROM                           ANSI SCSI revision: 02
via ide-scsi

cdrdao read-cd --device /dev/sr0 --driver generic-mmc --buffers 80 -n
--eject --paranoia-mode 0 toc
[...]
?: Input/output error.  : scsi sendcmd: retryable error
CDB:  BE 00 00 04 2C 67 00 00 1A F8 01 00
status: 0x0 (GOOD STATUS)
cmd finished after 20.101s timeout 20s
?: Input/output error.  : scsi sendcmd: retryable error
CDB:  BE 00 00 04 2E 43 00 00 1A F8 01 00
status: 0x0 (GOOD STATUS)
cmd finished after 20.101s timeout 20s
[...]
killed with ^c

locked the drive completely. Need to reboot to eject the cd...
I suspect some bad interference between DVD firmware, kernel
SCSI error handling and cdrdao. A plextor reader finally
succeeded on this job (wink :)

Maybe you're barking up the wrong tree.

Hans-Peter

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

2. Install system with serial terminal but no video card ??

3. aic7xxx freezes with kernel 2.4.13

4. 3310 SCSI host channel operating in asynch mode and narrow width

5. aic7xxx freezes with kernel 2.4.13 - solved

6. icmp in ipchains

7. Kernel panic: Loop 1 (aic7xxx under 2.4.13-ac[246])

8. Linux on Decstation 5000/240?

9. aic7xxx freezes with kernel 2.4.13

10. BUG() in asm/pci.h:142 with 2.4.13

11. Maybe IPv6 Routing Bug in 2.4.13 (::/0 doesnt forward)

12. installing new kernel 2.4.13

13. Error in ARM configuration (2.4.13)