Unrecoverable error writing SCSI disk

Post by Andrea Martinell » Sat, 06 Jul 1996 04:00:00



I have a serious problem with OpenServer 5.0.0Cl Enterprise.
This is my system:
        - Pentium 90 MHz, 24 MB RAM
        - SCSI controller AHA-2940, ID=7
        - HD Quantum Empire 1080S, ID=0
        - Tape Tandberg TDC 3800, ID=2
        - CD-ROM Pioneer DR-U124X, ID=4

Output from hwconfig -h :
device          address    vec  dma  comment
======          =======    ===  ===  =======
fpu                -        13   -   type=80387
serial        0x3f8-0x3ff    4   -   unit=0 type=Standard nports=1
serial        0x2f8-0x2ff    3   -   unit=1 type=Standard nports=1
floppy        0x3f2-0x3f7    6   2   unit=0 type=135ds18
console            -         -   -   unit=vga type=0 12 screens=68k
parallel      0x278-0x27a    7   -   unit=0
pci           0xcf8-0xd00    -   -   am=1 sc=1 buses=1
apm                -         -   -   PM v1.1
adapter      0xe800-0xe8ff  11   -   type=alad ha=0 bus=0 id=7 fts=sto
tape               -         -   -   type=S ha=0 id=2 lun=0 bus=0 ht=alad
disk               -         -   -   type=S ha=0 id=0 lun=0 bus=0 ht=alad
Sdsk               -         -   -   cyls=131 hds=255 secs=63 fts=stdb
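
As a quick sanity check on the Sdsk geometry line above, the capacity it implies can be worked out by hand. The sketch below is only illustrative: the 512-byte sector size is an assumption (hwconfig does not print it), and the function name is made up for the example.

```python
# Geometry reported on the Sdsk line of the hwconfig -h output above.
CYLINDERS = 131
HEADS = 255
SECTORS_PER_TRACK = 63
BYTES_PER_SECTOR = 512  # assumption: not shown in the hwconfig output


def capacity_bytes(cyls, heads, secs, sector_size=BYTES_PER_SECTOR):
    """Return the capacity implied by a cylinders/heads/sectors geometry."""
    return cyls * heads * secs * sector_size


total_sectors = CYLINDERS * HEADS * SECTORS_PER_TRACK
total_bytes = capacity_bytes(CYLINDERS, HEADS, SECTORS_PER_TRACK)
print(total_sectors, "sectors,", total_bytes, "bytes")
```

Under that assumption the result is roughly 1.08 GB, which is at least consistent with the drive model listed (Empire 1080S), so the geometry itself does not look wrong.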

During a restore, these messages are displayed on the console:
WARNING: alad: adapter 0 Error: Target Bus Phase Sequence Error (ha=0 bus=0 id=0 lun=0)

NOTICE: Sdsk: Unrecoverable error writing SCSI disk 0 dev 1/43 (ha=0 id=0 lun=0) block=70340

WARNING: alad: adapter 0 Error: Target Bus Phase Sequence Error (ha=0 bus=0 id=0 lun=0)

NOTICE: Sdsk: Unrecoverable error writing SCSI disk 0 dev 1/42 (ha=0 id=0 lun=0) block=4186

WARNING: alad: adapter 0 Error: Target Bus Phase Sequence Error (ha=0 bus=0 id=0 lun=0)

WARNING: err: Error log overflow
....
...

NOTICE: Stp: Error on SCSI tape 0 (ha=0 id=2 lun=0)
on: Drive or bus reset

My note: the block number is not always the same, and in some cases the
         block value is very low (48, 66, and so on). I checked these values
         and found that this area of the disk is used for the i-list!
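
Since the block numbers and minor device numbers vary from message to message, it may help to collect them from a saved copy of the console output and see how they are distributed. The following is a minimal sketch, assuming the messages keep exactly the format quoted above; the regular expression and function name are made up for the example and are not part of any SCO tool.

```python
import re
from collections import Counter

# Matches the "Sdsk: Unrecoverable error ..." NOTICE lines as quoted in this post.
# \s+ is used between fields so that a message wrapped over two lines still matches.
SDSK_RE = re.compile(
    r"Sdsk:\s+Unrecoverable\s+error\s+(?P<op>reading|writing)\s+SCSI\s+disk\s+(?P<disk>\d+)\s+"
    r"dev\s+(?P<major>\d+)/(?P<minor>\d+)\s+"
    r"\(ha=(?P<ha>\d+)\s+id=(?P<id>\d+)\s+lun=(?P<lun>\d+)\)\s+block=(?P<block>\d+)"
)


def parse_sdsk_errors(text):
    """Return one dict per Sdsk error message found in text."""
    errors = []
    for match in SDSK_RE.finditer(text):
        record = {}
        for key, value in match.groupdict().items():
            # Keep the operation as a string; every other field is numeric.
            record[key] = value if key == "op" else int(value)
        errors.append(record)
    return errors


# The two Sdsk messages quoted in this post.
SAMPLE = """\
NOTICE: Sdsk: Unrecoverable error writing SCSI disk 0 dev 1/43 (ha=0 id=0 lun=0) block=70340
NOTICE: Sdsk: Unrecoverable error writing SCSI disk 0 dev 1/42 (ha=0 id=0 lun=0) block=4186
"""

errors = parse_sdsk_errors(SAMPLE)
errors_per_minor = Counter(e["minor"] for e in errors)
for e in errors:
    print(e["op"], "dev %d/%d" % (e["major"], e["minor"]), "block", e["block"])
```

Errors spread across several minor devices and unrelated block numbers would point more toward the bus, cabling, or controller than toward one damaged area of the disk, though a tally like this is only a hint.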

I tried the following:

 - ran badtrk to fix bad blocks, but no bad blocks were found.
   I cannot enable AWR/ARR because the controller does not support this
   feature; I could not find it in the reference manual.

 - disabled synchronous mode on the controller and the tape, because I'm not
   sure that the tape supports synchronous mode (the tape is an old model).

 - checked the SCSI termination on the last device and the controller.

 - formatted the SCSI disk from the controller BIOS utility and reinstalled
   the software.

 - loaded the newer driver (ver. 1.3) for the controller.

The messages are also displayed when I'm not using the tape, but less frequently.

Any ideas?

Thanks in advance.
Andrea.

 
 
 

Unrecoverable error writing SCSI disk

Post by Christian Val » Sat, 06 Jul 1996 04:00:00



We just went through this type of problem. We checked (and changed) whatever you may think of
(SCSI terminator, SCSI cable, SCSI board, tape unit, disk, motherboard, etc.). We finally decided
to put the tape on a separate controller, in this case an AHA-1542CF, and that fixed the problem.

Our guess is that, at some point, when the tape unit reset itself, it also reset the SCSI
controller. Our tape backup unit is a QIC24 Archive 2525S.

Christian Valet




 
 
 

1. NOTICE: Sdsk: Unrecoverable error reading SCSI disk 1

Some background:
It's an AcerAltos 9000Pro with a Pentium Pro 200 MHz running OpenServer
5.0.4 with rs504c installed.

An "interesting" thing happened yesterday. All users using a specific
application got a bad case of "screen freeze". I tried and tried to
kill these PIDs but they just wouldn't die (they normally go away
very quickly). The only way I could think of to get logins back to
these users was the ole reboot (I know I haven't supplied much info
here, but is there a better way to kill "unkillable" processes? BTW,
other users were running fine. System response time to console
commands was good...).

So, I log in as root on the console and enter 'shutdown -y -g0'. It
gives all the right messages, but a minute or so after the "Shutdown
proceeding..." message it returns a command prompt. I try a few more
times with the same result. I've never had a SCO machine not be able
to shutdown, this is weird to me. I also tried 'haltsys' with no luck.
It didn't ever return a command prompt, but it didn't shut anything
down either.

At my wits' end, I go for the dreaded power button (I know, I know). When
the system restarts it runs fsck on the root filesystem and everything
looks peachy...for a while. Now users of this application (it's a
database application developed in house, Dataflex, if you must know)
are getting read errors. The console has several nasty messages, all
the same, as follows:

NOTICE: Sdsk: Unrecoverable error reading SCSI disk 1 dev 1/104 (ha=0
id=1 lun=0) block=2763006
    Medium error: Unrecovered read error

I go to maintenance mode and run 'fsck -ofull /dev/XX' on all
filesystems. No errors.

Can anyone recommend more steps to nail down the problem? I believe I
have isolated the problem to a specific file (the daily backup
reported a read error on this file too). Is this a sign of impending
hard disk doom or is it just a bad block or something? Any ideas would
be greatly appreciated. I'm not sure what other info you may want to
see, but I would be more than happy to post it.
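
One way to narrow this down further is to read the suspect file in fixed-size chunks and note which byte ranges fail. The sketch below is a generic illustration, not an OpenServer utility: the function name and chunk size are made up, and it assumes the kernel reports the medium error to the reading process as an I/O error. Data already held in the buffer cache may read back fine even if the disk block is bad, so a clean run does not prove the file is healthy.

```python
import os


def find_unreadable_chunks(path, chunk_size=64 * 1024):
    """Return (offset, length) pairs for the chunks of path that fail to read."""
    bad = []
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                # Seek explicitly so that one failed read does not stop
                # the scan of the chunks that follow it.
                os.lseek(fd, offset, os.SEEK_SET)
                os.read(fd, length)
            except OSError:
                bad.append((offset, length))
            offset += length
    finally:
        os.close(fd)
    return bad


# Hypothetical usage; the path is a placeholder, not a file named in this thread:
# print(find_unreadable_chunks("/path/to/suspect/file"))
```

If the failures always fall in the same byte range, that fits a single bad spot on the medium; failures that move around between runs would suggest something other than one bad block.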

Thanks in advance.

Scott Roberts

2. Solaris 7 x86 doesn't support Diamond Viper v330. Help!

3. 5.0.4 Upgrade as SCSI disks configure into kernel, flashes SCSI unrecoverable and hangs

4. REPOST + Begging: Multiple Apps on UNIX?

5. Unrecoverable Disk Error

6. 2.2.5 crashing with Kernel Debug

7. ASUS P54NP4, SCO 3.2v4.2, and unrecoverable SCSI errors

8. Real-Time Scheduler BUG??

9. SCSI disk drive write errors using Linux with an UltraStor 24F

10. SCSI disk error : host 1 channel 0 id 4 - L440GX , Adaptec AIC7xx and RedHat SCSI Errors

11. write disk error. how to scan disk?

12. SCSI disk from XENIX system with 1542 controller has strange errors on OS5 with 2940UW SCSI

13. Error message on SCSI disk, Adaptec EZ-SCSI 4.0, SCO UNIX 5.02