Cause of HD Block Corruption

Post by Layne Z » Thu, 12 Jun 1997 04:00:00



I need some help.

Twice now (last night and in early May) I've had problems with a block
on my HD going 'bad'.

The bad blocks have been widely scattered, but both times they've been in
the inode tables of one of the 13 groups that my root partition is divided
into.
This simultaneously screws up the reading of several files and has
resulted in kernel panics when I try to boot my machine.

The specific error messages have been (the LBAsect/sector and inode/block
numbers have varied):
   hda: read-intr: status=0x59  {DriveReady SeekComplete DataRequest Error}
   hda: read-intr: error=0x40  {UncorrectableError} LBAsect=98407 sector=98344
   end request: I/O error, dev 03:01, sector 98344
   Kernel panic: EXT2-fs panic (device 03:01): ext2_read_inode: unable to
   read i-node block - inode=12027 block=49172
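
The numbers in the error messages above can be cross-checked against each
other. The following is only a sketch, not taken from the kernel source: the
bit names not seen in the messages, the 1 KiB block size, the 8192 blocks per
group and the first data block of 1 are assumptions (the usual ext2 defaults
for a small partition), and the helper names are made up for illustration.

```python
# ATA status and error register bits. The labels that appear in the posted
# messages are taken from them; the remaining labels are assumptions.
STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "WriteFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}
ERROR_BITS = {
    0x80: "BadSector",
    0x40: "UncorrectableError",
    0x20: "MediaChanged",
    0x10: "SectorIdNotFound",
    0x08: "MediaChangeRequested",
    0x04: "DriveStatusError",
    0x02: "TrackZeroNotFound",
    0x01: "AddrMarkNotFound",
}

SECTOR_SIZE = 512        # bytes per disk sector
BLOCK_SIZE = 1024        # assumed ext2 block size
BLOCKS_PER_GROUP = 8192  # assumed (8 * BLOCK_SIZE, the usual default)
FIRST_DATA_BLOCK = 1     # assumed (normal for 1 KiB blocks)


def decode(value, names):
    """Return the labels of all bits set in a register value."""
    return [label for bit, label in names.items() if value & bit]


def partition_start(lba_sector, partition_sector):
    """Absolute sector at which the partition begins."""
    return lba_sector - partition_sector


def sector_to_block(partition_sector):
    """Filesystem block that contains a partition-relative sector."""
    return partition_sector * SECTOR_SIZE // BLOCK_SIZE


def block_group(block):
    """Zero-based block group that contains a filesystem block."""
    return (block - FIRST_DATA_BLOCK) // BLOCKS_PER_GROUP


if __name__ == "__main__":
    print(decode(0x59, STATUS_BITS))
    print(decode(0x40, ERROR_BITS))
    print(partition_start(98407, 98344))
    print(sector_to_block(98344))
    print(block_group(sector_to_block(98344)))
```

Under those assumptions, sector 98344 maps to block 49172, which matches
the block number in the panic message; the partition starts 63 sectors into
the disk; and the block falls in group 6 when groups are counted from zero.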

Given the bad block is in the inode table itself, I can't seem to repair
the problem in-situ.  I have been able to effect repairs by
  - boot with emergency boot disk
  - mount /dev/hda1  and  /dev/hda2
  - run e2fsck on the damaged /dev/hda1 - noting the corrupted file(s)
  - using 'cp -dpR', copy all the remaining files and sub-dir.s from the
damaged root partition on /dev/hda1  onto a temporary sub-dir on
/dev/hda2
  - wipe clean and re-establish a clean filesystem on /dev/hda1 using
mke2fs
  - BTW, both times - the (varying) bad block on /dev/hda1 has been
'fixed' by this, ie the badblocks utility no longer identifies anything
wrong with it
  - re-establish or recreate the affected (now non-existent) files
  - copy all the files/sub-dirs back from /dev/hda2 to /dev/hda1
  - run 'lilo -r /mnt/hda'
  - reboot
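
The steps above can be sketched as a dry-run plan that only builds and
prints the commands and executes nothing. The mount point /mnt/hda2 and the
directory name rootsave are hypothetical; /mnt/hda is taken from the lilo
step. The sketch also assumes e2fsck is run before the damaged partition is
mounted and that the partition is unmounted before mke2fs, which the list
does not spell out.

```python
def repair_plan(damaged="/dev/hda1", spare="/dev/hda2",
                root_mnt="/mnt/hda", spare_mnt="/mnt/hda2",
                save_dir="rootsave"):
    """Return the repair steps as a list of argument lists (dry run only)."""
    backup = f"{spare_mnt}/{save_dir}"  # hypothetical temporary directory
    return [
        ["e2fsck", damaged],                      # note the corrupted files
        ["mount", damaged, root_mnt],
        ["mount", spare, spare_mnt],
        ["mkdir", backup],
        ["cp", "-dpR", f"{root_mnt}/.", backup],  # save what is readable
        ["umount", root_mnt],
        ["mke2fs", damaged],                      # wipe and recreate
        ["badblocks", "-v", damaged],             # re-check the partition
        ["mount", damaged, root_mnt],
        ["cp", "-dpR", f"{backup}/.", root_mnt],  # copy everything back
        # The lost files still have to be recreated by hand at this point.
        ["lilo", "-r", root_mnt],
    ]


if __name__ == "__main__":
    for command in repair_plan():
        print(" ".join(command))
```

Printing the plan first makes it easy to review the device names before
typing anything destructive from the emergency boot disk.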

This has fixed the problem but
  - it's a major pain
  - this has happened twice now in a 5 week period
  - as best as I can remember 5 weeks back <s>, I wasn't doing any
single operation to blame this on
  - previous time the corruption occurred in the 8th group on the root
partition, affecting files in the /dev sub-dir
  - this time it occurred in the 6th group, affecting files in
/etc/rc.d/rc2.d

I'm wondering
  - has anyone had a similar experience, and can you take a guess as to
what is CAUSING this
  - failing that, is there any simpler means of fixing the problem

System
  - Dell Pentium 90, two hard drives
  - RedHat V4.1 on master HD
  - Windows 95 (don't ask <s>) on slave HD
  - SoundBlaster sound card
  - 4x CD
  - 33.6 internal USR Sportster modem
  - CMS internal tape drive
  - ask and you shall receive <s>

TIA, Layne
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Internet Services Tech, Unix/Programming Wiz
Work: Internet Wizards, 212 Railroad Ave N, Kent, WA 98032

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 
 
 

Cause of HD Block Corruption

Post by Dick Altenber » Sat, 14 Jun 1997 04:00:00



Layne,
   I have had this exact same problem and was able to get around it by
using the badblocks utility in write mode to fix it.  After a while,
though, there got to be more and more blocks that I was unable to fix,
and finally I had to get rid of the hard drive.  I'm pretty sure that it
is a hardware problem with the drive.  Do you have a Seagate HD?  The
drive that I had the problem with was a 540 MB Seagate IDE drive.
         Dick Altenbern

 
 
 
