I've seen these three times over the last two weeks...
* Setup
Kernel 2.4.20 + LVM 1.0.7 (both vfs-lock and lvm patches) +
ptrace patch.
An LVM array comprising 5 SCSI disks of various sizes and makes.
This computer with this array has been working flawlessly for
over 3 years. The last disk was added to the LVM array in early
February 2003. Between February 2003 and end of April 2003, this
server has been up all the time without a single reboot.
* The problem.
1. The SCSI subsystem craps out a cryptic message:
SCSI disk error : host 0 channel 0 id 2 lun 0 return code = 10000
I/O error: dev 08:20, sector 32932328
It's always "return code = 10000" and the problem always
happens with disk 0/0/2/0.
2. Once, the SCSI subsystem queued an ABORT, the abort failed, and
the drive "host 0 channel 0 id 2 lun 0" was taken offline.
3. LVM and ext3 crap out soon afterwards.
4. Sometimes, the crash is severe enough that the watchdog
triggers a machine reboot.
5. The drive stays stuck (seen neither in the BIOS SCSI scan
nor in the Linux SCSI scan, despite soft and hard resets)
until the machine is powered off and then back on.
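On the return code itself: as far as I can tell, the 2.4 SCSI mid-layer prints this value in hex, and the result word packs four bytes (status, message, host, driver). Under that assumption, 10000 means a host byte of 0x01, which the 2.4 SCSI headers name DID_NO_CONNECT. A minimal Python sketch of the decoding (decode_scsi_result is my own helper name, not a kernel API):

```python
# Host-byte codes as named in the Linux 2.4 SCSI headers.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
    0x0a: "DID_PASSTHROUGH",
    0x0b: "DID_SOFT_ERROR",
}


def decode_scsi_result(text):
    """Split a logged 'return code' into the four bytes of the result word."""
    result = int(text, 16)  # assumption: the log prints the value in hex
    host = (result >> 16) & 0xFF
    return {
        "status": result & 0xFF,         # raw SCSI status byte from the target
        "msg": (result >> 8) & 0xFF,     # SCSI message byte
        "host": host,                    # host adapter / low-level driver code
        "driver": (result >> 24) & 0xFF, # mid-layer driver byte
        "host_name": HOST_BYTE_NAMES.get(host, "unknown"),
    }


print(decode_scsi_result("10000"))
# {'status': 0, 'msg': 0, 'host': 1, 'driver': 0, 'host_name': 'DID_NO_CONNECT'}
```

If that reading is right, the error says the host adapter could not connect to the drive at all, which would fit item 5 better than a medium error reported by the drive itself.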
* Questions:
- What is this error 10000 which seems to be the source of the
trouble? Is my disk dying?
- I will run more tests tonight on this disk and will report my
findings later. I intend to run a read-only badblocks, maybe
followed by a non-destructive write badblocks run (if the
read-only test shows no problem). Any other ideas?
- Could it be a driver problem?
- And BTW, I find aic7xxx's habit of taking drives offline when
they show trouble extremely annoying: in this case it might be
justified since the drive seems to have locked up, but I've
seen it happen on CD-ROM drives which are busy retrying a read
of a bad sector. Taking a CD-ROM drive offline for failing to be
responsive during an I/O error recovery operation is very
heavy-handed :-)
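The read-only pass I have in mind is plain badblocks in its default mode. Just to spell out what such a pass amounts to, here is a rough Python sketch. It is not a substitute for badblocks (it goes through the page cache and never retries), and read_only_scan is a name I made up:

```python
import os
import sys


def read_only_scan(path, block_size=64 * 1024):
    """Read path front to back; return offsets of blocks that raised an I/O error."""
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)  # opened read-only: nothing is ever written
    try:
        # Seeking to the end gives the size of a regular file or a Linux block device.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(block_size, size - offset)
            try:
                os.pread(fd, length, offset)
            except OSError:
                bad_offsets.append(offset)  # note the failure and keep going
            offset += length
    finally:
        os.close(fd)
    return bad_offsets


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python scan.py /dev/sdc
    for off in read_only_scan(sys.argv[1]):
        print("read error at byte offset", off)
```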
* SCSI info:
- aic7xxx boot parameters:
aic7xxx=tag_info:{{,,,,8}}
The IBM drive (id 4) has a bug with tagged queuing, and any queue
depth higher than 8 gives me trouble.
- /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: SEAGATE Model: ST336752LW Rev: 0004
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 01 Lun: 00
Vendor: SEAGATE Model: ST336752LW Rev: 0002
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 02 Lun: 00
Vendor: SEAGATE Model: ST318451LW Rev: 0003
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 03 Lun: 00
Vendor: SEAGATE Model: ST318451LW Rev: 0002
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 04 Lun: 00
Vendor: IBM Model: DRVS18V Rev: 0140
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 05 Lun: 00
Vendor: SEAGATE Model: ST39103LW Rev: 0002
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 06 Lun: 00
Vendor: PIONEER Model: CD-ROM DR-U24X Rev: 1.01
Type: CD-ROM ANSI SCSI revision: 02
- /proc/scsi/aic7xxx/0
Adaptec AIC7xxx driver version: 6.2.8
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
Serial EEPROM:
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0xb8f5 0x7c5d 0x2807 0x0010 0x0300 0xffff 0xffff 0xffff
0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0x0250 0x9650
Channel A Target 0 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Goal: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Curr: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Channel A Target 0 Lun 0 Settings
Commands Queued 5822349
Commands Active 0
Command Openings 35
Max Tagged Openings 253
Device Queue Frozen Count 0
Channel A Target 1 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Goal: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Curr: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Channel A Target 1 Lun 0 Settings
Commands Queued 49712
Commands Active 0
Command Openings 45
Max Tagged Openings 253
Device Queue Frozen Count 0
Channel A Target 2 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Goal: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Curr: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Channel A Target 2 Lun 0 Settings
Commands Queued 31535
Commands Active 0
Command Openings 35
Max Tagged Openings 253
Device Queue Frozen Count 0
Channel A Target 3 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Goal: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Curr: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Channel A Target 3 Lun 0 Settings
Commands Queued 13690
Commands Active 0
Command Openings 253
Max Tagged Openings 253
Device Queue Frozen Count 0
Channel A Target 4 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Goal: 80.000MB/s transfers (40.000MHz, offset 15, 16bit)
Curr: 80.000MB/s transfers (40.000MHz, offset 15, 16bit)
Channel A Target 4 Lun 0 Settings
Commands Queued 12479
Commands Active 0
Command Openings 8
Max Tagged Openings 8
Device Queue Frozen Count 0
Channel A Target 5 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Goal: 80.000MB/s transfers (40.000MHz, offset 15, 16bit)
Curr: 80.000MB/s transfers (40.000MHz, offset 15, 16bit)
Channel A Target 5 Lun 0 Settings
Commands Queued 2678
Commands Active 0
Command Openings 253
Max Tagged Openings 253
Device Queue Frozen Count 0
Channel A Target 6 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Goal: 10.000MB/s transfers (10.000MHz, offset 8)
Curr: 10.000MB/s transfers (10.000MHz, offset 8)
Channel A Target 6 Lun 0 Settings
Commands Queued 5
Commands Active 0
Command Openings 1
Max Tagged Openings 0
Device Queue Frozen Count 0
Channel A Target 7 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 8 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 9 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 10 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 11 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 12 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 13 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 14 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 15 Negotiation Settings
User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
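For reference, my reading of the tag_info syntax: the outer braces hold one group per controller, and each inner group is a comma-separated list of tag depths indexed by target id, with empty entries left at the driver default. So {{,,,,8}} sets only target 4 (the IBM drive) on controller 0 to 8, which matches the "Max Tagged Openings 8" shown for Target 4 above. A small Python sketch of that reading (parse_tag_info is a made-up helper, and it assumes no controller group is omitted):

```python
import re


def parse_tag_info(option):
    """Return {controller: {target: depth}} for the explicitly set entries."""
    match = re.search(r"tag_info:\{((?:\{[^{}]*\},?)*)\}", option)
    if match is None:
        raise ValueError("no tag_info setting found in: " + option)
    depths = {}
    # Each inner {...} group describes one controller, in order.
    for controller, group in enumerate(re.findall(r"\{([^{}]*)\}", match.group(1))):
        per_target = {}
        for target, value in enumerate(group.split(",")):
            if value.strip():  # empty entry: keep the driver default
                per_target[target] = int(value)
        depths[controller] = per_target
    return depths


print(parse_tag_info("aic7xxx=tag_info:{{,,,,8}}"))
# {0: {4: 8}}
```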
* Log files
- First incident:
SCSI disk error : host 0 channel 0 id 2 lun 0 return code = 10000
I/O error: dev 08:20, sector 12052208
I/O error: dev 08:20, sector 12508912
I/O error: dev 08:20, sector 10387952
EXT3-fs error (device lvm(58,5)): ext3_get_inode_loc: unable to read inode block - inode=6717643, block=13434894
Aborting journal on device lvm(58,5).
Remounting filesystem read-only
I/O error: dev 08:20, sector 8815056
EXT3-fs error (device lvm(58,5)): ext3_get_inode_loc: unable to read inode block - inode=6619207, block=13238282
I/O error: dev 08:20, sector 10387952
EXT3-fs error (device lvm(58,5)): ext3_get_inode_loc: unable to read inode block - inode=6717649, block=13434894
I/O error: dev
...
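To tie the log lines to a device: 2.4 prints device numbers as hex major:minor, and the first SCSI disk major is 8 with 16 minors per disk. So 08:20 is minor 32, i.e. the third disk as a whole device, conventionally /dev/sdc, which agrees with id 2 if the disks were enumerated in id order. A short Python sketch, assuming 512-byte sectors (sd_name and byte_offset are my own helper names):

```python
import string

SD_MAJOR = 8          # first SCSI disk major number
MINORS_PER_DISK = 16  # sd reserves 16 minors per disk (whole disk + 15 partitions)
SECTOR_BYTES = 512    # assumption: logged sectors are 512-byte units


def sd_name(dev):
    """Map a hex 'major:minor' string such as '08:20' to an sd device name."""
    major, minor = (int(part, 16) for part in dev.split(":"))
    if major != SD_MAJOR:
        raise ValueError("not a device on the first SCSI disk major: " + dev)
    disk, partition = divmod(minor, MINORS_PER_DISK)
    name = "sd" + string.ascii_lowercase[disk]
    return name + (str(partition) if partition else "")


def byte_offset(sector):
    """Convert a logged sector number to a byte offset on the device."""
    return sector * SECTOR_BYTES


# Failing sectors from the first incident above.
sectors = [12052208, 12508912, 10387952, 8815056]
print(sd_name("08:20"))
for sector in sorted(sectors):
    print(sector, byte_offset(sector))
```

The failing sectors within this one incident are spread over nearly 2 GB rather than clustered, which to me looks more like the whole drive going away than a single bad spot, but that is only a guess.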