RAID Level 1 errors, need advice...

Post by Dan Warren » Thu, 22 Apr 1999 04:00:00



Can someone decipher these messages for me?  My RAID 1 device has worked
smoothly for 6 months but suddenly has given me errors and I don't
completely understand what they mean.  Is one of my drives failing?  Is
it just bad sectors?  Why doesn't the RAID device just mark the sectors
bad and keep going instead of disabling the drive?  Interestingly, it
seems to disable just after the SCSI tape backup starts. ???  Thank you
in advance

Dan Warren

Here is the setup:
Pentium 350 / 128 MB SDRAM
Adaptec 2940UW
2 Seagate ST34520W 4.5 GB SCSI-2 hard drives
Red Hat 5.2 (2.0.36)
Raidtools-0.5beta1

Errors:
Apr 21 00:05:10 andromeda kernel: scsi0: MEDIUM ERROR on channel 0, id 15, lun 0, CDB: Read (10) 00 00 41 af 45 00 00 26 00
Apr 21 00:05:10 andromeda kernel: Current error sd08:11: sense key Medium Error
Apr 21 00:05:10 andromeda kernel: Additional sense indicates Unrecovered read error
Apr 21 00:05:10 andromeda kernel: scsidisk I/O error: dev 08:11, sector 4304682, absolute sector 4304745
Apr 21 00:05:10 andromeda kernel: RAID1: Disk failure on 08:11, disabling device. Operation continuing on 1 devices
Apr 21 00:05:10 andromeda kernel: raid1: 09:00: rescheduling block 2152341
Apr 21 00:05:10 andromeda kernel: md: updating raid superblock on device 08:01, sb_offset == 4441856
Apr 21 00:05:10 andromeda kernel: md: updating raid superblock on device 08:11, sb_offset == 4441856
Apr 21 00:05:10 andromeda kernel: raid1: 09:00: redirecting sector 2152341 to another mirror
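
For anyone reading along, the numbers in these messages can be cross-checked with a little arithmetic. The sketch below is only an illustration and rests on several assumptions that are not stated in the post: that the kernel prints device numbers as hex major:minor, that major 8 is the SCSI disk driver with 16 minors per disk and major 9 is md, that the logged CDB bytes are the nine bytes following the READ(10) opcode, and that the raid1 "block" is counted in 1 KiB units. The helper names are made up for this example.

```python
def describe_dev(dev):
    """Map a kernel 'MM:mm' string (assumed hex major:minor) to a device name."""
    major, minor = (int(part, 16) for part in dev.split(":"))
    if major == 8:                      # assumed: SCSI disks, 16 minors per disk
        disk, part = divmod(minor, 16)
        name = "/dev/sd" + chr(ord("a") + disk)
        return name + (str(part) if part else "")
    if major == 9:                      # assumed: md (software RAID) devices
        return "/dev/md%d" % minor
    return "unknown (major %d, minor %d)" % (major, minor)


def decode_read10(tail_hex):
    """Decode the bytes after the opcode of a READ(10) CDB as logged.

    Assumed layout of the tail: flags, 4-byte big-endian LBA, reserved,
    2-byte big-endian transfer length, control.
    """
    b = bytes.fromhex(tail_hex.replace(" ", ""))
    lba = int.from_bytes(b[1:5], "big")
    length = int.from_bytes(b[6:8], "big")
    return lba, length


if __name__ == "__main__":
    for dev in ("08:01", "08:11", "09:00"):
        print(dev, "->", describe_dev(dev))

    lba, length = decode_read10("00 00 41 af 45 00 00 26 00")
    print("read of", length, "sectors starting at LBA", lba)

    sector, absolute = 4304682, 4304745   # values taken from the log
    print("failing sector inside the read:", lba <= absolute < lba + length)
    print("partition offset in sectors:", absolute - sector)
    print("sector as 1 KiB block:", sector // 2)
```

If those assumptions hold, the failed member is the first partition of the second SCSI disk, the unreadable sector falls inside the 38-sector read that was in flight, and the block that raid1 rescheduled is the same location expressed in 1 KiB units.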

Post by Nick Brown » Thu, 22 Apr 1999 04:00:00


One of your drives has "failed" (bad blocks, possibly) and is no longer
participating in the RAID set.  In other words, you are running on one
engine.  Divert to the nearest available airfield.

The reason it gives you the message is the same reason a cockpit
light comes on when one engine fails: because the system thinks you
might be interested in fixing the problem.

I assume you have the kernel sources.  Even if you're not a C whiz, try
looking in drivers/block/raid1.c, and read the code (and even the
comments) near the messages you're getting.  It's another form of
documentation not available to binary-only OS users!
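
As a rough sketch of that kind of search: the snippet below looks for a fragment of the logged message in a source file and prints a few lines of context around each hit. The /usr/src/linux prefix and the function name find_message are assumptions made for the example, and the exact message string in the source may be worded or split differently from what the log shows, so a shorter fragment may be needed.

```python
import os


def find_message(path, needle, context=3):
    """Return (line_number, snippet) pairs for lines that contain needle."""
    with open(path, errors="replace") as f:
        lines = f.read().splitlines()
    hits = []
    for i, line in enumerate(lines):
        if needle in line:
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            hits.append((i + 1, "\n".join(lines[lo:hi])))
    return hits


if __name__ == "__main__":
    # Assumed location of the kernel tree; adjust to wherever the sources live.
    src = "/usr/src/linux/drivers/block/raid1.c"
    if os.path.exists(src):
        for lineno, snippet in find_message(src, "Disk failure"):
            print("%s:%d\n%s\n" % (src, lineno, snippet))
    else:
        print("source file not found:", src)
```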

Good luck.  (Most of us run on one engine all the time.  We'd be in the
sea by now.)


> Can someone decipher these messages for me?  My RAID 1 device has worked
> smoothly for 6 months but suddenly has given me errors and I don't
> completely understand what they mean.  Is one of my drives failing?  Is
> it just bad sectors?  Why doesn't the RAID device just mark the sectors
> bad and keep going instead of disabling the drive?  Interestingly, it
> seems to disable just after the SCSI tape backup starts. ???  Thank you
> in advance

--
---------------------------------------------------------------
Nick Brown, Strasbourg, France (Nick(dot)Brown(at)coe(dot)fr)

Protect yourself against Word 95/97 viruses, free - check out
 http://www.veryComputer.com/
---------------------------------------------------------------