sata problem / "pdc-ultra: [warning] disk9 ATA timeout"

Post by gary arti » Sat, 13 Mar 2004 04:15:03



Hi,

I have 2 Promise 150 SATA cards (4x),
with 3 Maxtor 250 GB drives on each (6 total).
I'm running SuSE 9.0 (kernel 2.4.21-192-smp4G).

Problem: periodically (once every 1-2 months) my system hangs,
the last console messages I see are:

Mar 11 04:17:02 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1826b047
Mar 11 04:17:02 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:18:00 mysystem kernel: scsi : aborting command due to timeout :
pid 38985860, scsi2, channel 0, id 0, lun 0 Read (10) 00 18 26 b0 47 00 00 08 00
Mar 11 04:18:00 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:18:08 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x18feb047
Mar 11 04:18:08 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:19:05 mysystem kernel: scsi : aborting command due to timeout :
pid 38987973, scsi2, channel 0, id 0, lun 0 Read (10) 00 18 fe b0 47 00 00 08 00
Mar 11 04:19:05 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:19:18 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x128d5af7
Mar 11 04:19:18 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:20:16 mysystem kernel: scsi : aborting command due to timeout :
pid 38989382, scsi2, channel 0, id 0, lun 0 Read (10) 00 12 8d 5a f7 00 00 08 00
Mar 11 04:20:16 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:20:33 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1af405a7
Mar 11 04:20:33 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:21:31 mysystem kernel: scsi : aborting command due to timeout :
pid 38991681, scsi2, channel 0, id 0, lun 0 Read (10) 00 1a f4 05 a7 00 00 08 00
Mar 11 04:21:31 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:22:05 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x162005c7
Mar 11 04:22:05 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:23:03 mysystem kernel: scsi : aborting command due to timeout :
pid 38995650, scsi2, channel 0, id 0, lun 0 Read (10) 00 16 20 05 c7 00 00 08 00
Mar 11 04:23:03 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:23:50 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1c4005e7
Mar 11 04:23:50 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:24:49 mysystem kernel: scsi : aborting command due to timeout :
pid 39002296, scsi2, channel 0, id 0, lun 0 Read (10) 00 1c 40 05 e7 00 00 08 00
Mar 11 04:24:49 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:27:05 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x194006df
Mar 11 04:27:05 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:28:04 mysystem kernel: scsi : aborting command due to timeout :
pid 39020967, scsi2, channel 0, id 0, lun 0 Read (10) 00 19 40 06 df 00 00 08 00
Mar 11 04:28:04 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:32:24 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1146aaef
Mar 11 04:32:24 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:33:23 mysystem kernel: scsi : aborting command due to timeout :
pid 39054798, scsi2, channel 0, id 0, lun 0 Read (10) 00 11 46 aa ef 00 00 08 00
Mar 11 04:33:23 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:33:24 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1146aaef
Mar 11 04:33:24 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:34:23 mysystem kernel: scsi : aborting command due to timeout :
pid 39055439, scsi2, channel 0, id 0, lun 0 Read (10) 00 11 46 aa ef 00 00 08 00
Mar 11 04:34:23 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:34:24 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1146aaef
Mar 11 04:34:24 mysystem kernel: pdc-ultra:[warning] submit cam busy
Mar 11 04:35:23 mysystem kernel: scsi : aborting command due to timeout :
pid 39055440, scsi2, channel 0, id 0, lun 0 Read (10) 00 11 46 aa ef 00 00 08 00
Mar 11 04:35:23 mysystem kernel: pdc-ultra:[info] scsi abort success
Mar 11 04:35:23 mysystem kernel: scsi2 channel 0 : resetting for second half of retries.
Mar 11 04:35:23 mysystem kernel: SCSI bus is being reset for host 2 channel 0.
Mar 11 04:35:24 mysystem kernel: pdc-ultra:[warning] scsi reset channel5 OK
Mar 11 04:35:27 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1146aaef
Mar 11 04:35:27 mysystem kernel: pdc-ultra:[warning] submit cam busy

(hung)
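
For readers of the log above: the bytes printed after "Read (10)" in each abort message appear to be the rest of the SCSI command block, so they can be decoded to check which sectors the aborted read was asking for. The sketch below assumes the standard READ(10) layout (one flags byte, a 4-byte big-endian LBA, a group byte, a 2-byte transfer length, a control byte); the function name `decode_read10` is only illustrative.

```python
def decode_read10(cdb_hex):
    """Decode the 9 bytes logged after 'Read (10)' into (lba, block_count).

    Assumes the standard READ(10) layout with the opcode byte already
    consumed by the kernel's 'Read (10)' label.
    """
    b = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(b) != 9:
        raise ValueError("expected 9 bytes after the opcode")
    lba = int.from_bytes(b[1:5], "big")     # bytes 1-4: logical block address
    blocks = int.from_bytes(b[6:8], "big")  # bytes 6-7: transfer length in blocks
    return lba, blocks


# First abort message from the log above.
lba, blocks = decode_read10("00 18 26 b0 47 00 00 08 00")
print(hex(lba), blocks)  # prints: 0x1826b047 8
```

The decoded LBA matches the address in the preceding "ATA timeout" warning, so each abort is the SCSI layer giving up on the same 8-block read that the driver reported as timed out.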

Not sure where to look for a solution. The problem occurred before, and I
replaced the disk it was pointing to. I guess it's possible I have another
bad disk, but I'm suspicious. Any help is greatly appreciated.

Gary

 
 
 

Post by Davide Bianch » Sat, 13 Mar 2004 04:39:04



> pid 38985860, scsi2, channel 0, id 0, lun 0 Read (10) 00 18 26 b0 47 00 00 08 00
> pid 38987973, scsi2, channel 0, id 0, lun 0 Read (10) 00 18 fe b0 47 00 00 08 00
> pid 38989382, scsi2, channel 0, id 0, lun 0 Read (10) 00 12 8d 5a f7 00 00 08 00

It looks like one of your disks is dying.

> Not sure where to look for a solution. The problem occurred before, and I
> replaced the disk it was pointing to.

Maybe it's the controller or the cable. Can you replace them just to test?

Davide

--
| You probably wouldn't worry about what people think of you if you
| could know how seldom they do.   -- Olin Miller.
|
|

 
 
 

Post by gary arti » Sat, 13 Mar 2004 04:49:36


Thanks for the quick reply. See below...



>> pid 38985860, scsi2, channel 0, id 0, lun 0 Read (10) 00 18 26 b0 47 00 00 08 00
>> pid 38987973, scsi2, channel 0, id 0, lun 0 Read (10) 00 18 fe b0 47 00 00 08 00
>> pid 38989382, scsi2, channel 0, id 0, lun 0 Read (10) 00 12 8d 5a f7 00 00 08 00

> It looks like one of your disks is dying.

Are you fairly certain (via codes returned)?

>> Not sure where to look for a solution. The problem occurred before, and I
>> replaced the disk it was pointing to.

> Maybe it's the controller or the cable. Can you replace them just to test?

In the past I tried swapping the controllers, but the problem did _not_
follow. I didn't change the SATA cables, though...

> Davide

 
 
 

Post by Davide Bianch » Sat, 13 Mar 2004 05:30:49



> Are you fairly certain (via codes returned)?

Well, when you keep receiving errors always from the same disk
(scsi id, lun etc. are the same), it's usually a problem with the
disk, but if you changed the disk...
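
The "always the same disk" check can be done mechanically rather than by eye. The following is a minimal sketch, assuming the syslog lines look like the ones posted in this thread (including the driver's run-together "timeoutat"); the regular expression and the name `tally_timeouts` are illustrative, not part of any Promise tool.

```python
import re
from collections import Counter

# Matches e.g. "pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1826b047".
# \s* tolerates both "timeoutat" and "timeout at".
TIMEOUT_RE = re.compile(
    r"pdc-ultra:\s*\[warning\]\s+(disk\d+) ATA timeout\s*at LBA (0x[0-9a-fA-F]+)"
)


def tally_timeouts(lines):
    """Count ATA timeouts per disk and per (disk, LBA) pair."""
    per_disk = Counter()
    per_lba = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            disk, lba = m.group(1), int(m.group(2), 16)
            per_disk[disk] += 1
            per_lba[(disk, lba)] += 1
    return per_disk, per_lba


# A few lines copied from the original post.
sample = [
    "Mar 11 04:17:02 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1826b047",
    "Mar 11 04:17:02 mysystem kernel: pdc-ultra:[warning] submit cam busy",
    "Mar 11 04:32:24 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1146aaef",
    "Mar 11 04:33:24 mysystem kernel: pdc-ultra:[warning] disk9 ATA timeoutat LBA 0x1146aaef",
]

per_disk, per_lba = tally_timeouts(sample)
for (disk, lba), count in per_lba.most_common():
    print(disk, hex(lba), count)
```

Run over the full log from the first post, this kind of tally would show every timeout on disk9: seven different LBAs first, then 0x1146aaef four times as the same read is retried. Whether that pattern points at the drive, the cable, or something else is still a judgment call.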

> In the past I tried swapping the controllers, but the problem did _not_
> follow. I didn't change the SATA cables, though...

Then maybe it's just a cable.

Davide

--
| "There are two ways of disliking poetry; one way is to dislike it, the
| other is to read Pope."   -- Oscar Wilde
|
|

 
 
 

Post by gary arti » Sat, 13 Mar 2004 05:35:11




>> Are you fairly certain (via codes returned)?

> Well, when you keep receiving errors always from the same disk
> (scsi id, lun etc. are the same), it's usually a problem with the
> disk, but if you changed the disk...

I'll give it a try. It just seems strange that 2 disks on the same controller
would fail within a 2-3 month period. I thought it could be power-supply
related, since I have 7 disks installed. I'll try replacing the cable
first, then the disk. If you have any other analysis tools you recommend,
let me know. Thanks for your help!

>> In the past I tried swapping the controllers, but the problem did _not_
>> follow. I didn't change the SATA cables, though...

> Then maybe it's just a cable.

> Davide