2.0.31-pre7 et al: EXT2-fs panic bad inode number: 0

Post by Andy Burge » Fri, 22 Aug 1997 04:00:00



Kernel panic: EXT2-fs panic (device 08:03): ext2_read_inode: bad inode number: 0
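
For readers unsure which disk "device 08:03" refers to: the two fields are the major and minor device numbers in hexadecimal. Major 8 is the SCSI disk (sd) driver, which assigns 16 minors per disk, so 08:03 is the third partition of the first SCSI disk. A minimal sketch of that decoding follows; the function name decode_sd_device is made up for illustration, and it deliberately handles only major 8.

```python
import re

# Major number 8 is the Linux SCSI disk (sd) driver; each disk owns 16 minors.
SD_MAJOR = 8
MINORS_PER_DISK = 16


def decode_sd_device(message):
    """Map a 'device MM:mm' field (hex major:minor) to a /dev/sdXN name.

    Hypothetical helper for illustration only. Returns None when the
    field is missing or the major number is not the sd driver.
    """
    match = re.search(r"device ([0-9a-fA-F]{2}):([0-9a-fA-F]{2})", message)
    if match is None:
        return None
    major = int(match.group(1), 16)
    minor = int(match.group(2), 16)
    if major != SD_MAJOR:
        return None
    # Minor 0 of each group is the whole disk; 1..15 are its partitions.
    disk, partition = divmod(minor, MINORS_PER_DISK)
    name = "/dev/sd" + chr(ord("a") + disk)
    return name + (str(partition) if partition else "")


panic = ("Kernel panic: EXT2-fs panic (device 08:03): "
         "ext2_read_inode: bad inode number: 0")
print(decode_sd_device(panic))
```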

This is my /usr partition. This almost always happens during heavy SCSI
bus activity (often at 2AM when the incremental backup starts). It has
happened with 2.0.31pre5, 6 and 7, and with 2.0.30 and 2.0.29. I believe
it started somewhere in the 2.0.2x series. I am downloading those kernels
now in an effort to find out. It happens every few days. I searched
DejaNews for this message and found one post with a kernel patch. That
patch looked to be incorporated in the 2.0.30 kernel so I upgraded from
2.0.29. It seemed that the frequency went down to every few weeks at that
point. However, the problem was worsened by my recently consolidating my
full-time PPP connection and UUCP connections onto this machine. It also
gets a small (100MB/day) newsfeed.

This is a 486-66 VLB with a BusLogic BT445S controller, HP 2GB disk and
DAT tape, 32MB parity RAM, WD8013 ethernet.

I never see any other error messages. I ran badblocks without error. I
made it happen in a few minutes with 10 simultaneous 10MB bonnies. The
second time I tried that (after a reboot) the machine locked up solid
(no disk activity).

My questions:

1) Can I supply more info to help track this down? Is it possibly a
hardware problem? (the message is always exactly the same) Any
particular kernel version to try?

2) How can I make the machine reboot when this happens? Regardless of
the value in /proc/sys/kernel/panic, the machine never reboots by itself.
In fact all attempts at soft rebooting fail: df, shutdown, sync, telinit
and ctrl-alt-del all hang. My usual method is to kill crond, pppd and inn
(in an effort to close as many open files as possible) and then hit the
reset button. Do I need to use the software watchdog to force a reboot?
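
For reference, /proc/sys/kernel/panic holds the number of seconds the kernel waits after a panic before rebooting, and 0 means it does not reboot on its own. The sketch below only reads and interprets that value; the function names are invented for illustration, and it does not address why this particular panic leaves the machine hung whatever the setting is.

```python
def read_panic_timeout(path="/proc/sys/kernel/panic"):
    """Return the panic reboot timeout in seconds as stored at `path`.

    The default path is the one mentioned in the post; the parameter
    exists so the function can be pointed at another file when testing.
    """
    with open(path) as handle:
        return int(handle.read().strip())


def describe_panic_timeout(seconds):
    """Give a plain-language reading of the timeout value (illustrative)."""
    if seconds == 0:
        return "no automatic reboot after a panic"
    if seconds > 0:
        return "reboot %d seconds after a panic" % seconds
    # Negative values are not covered here; their meaning depends on the kernel.
    return "behaviour depends on the kernel version"


if __name__ == "__main__":
    try:
        print(describe_panic_timeout(read_panic_timeout()))
    except OSError as error:
        print("could not read the panic timeout:", error)
```

Writing a new value needs root and is normally done by echoing a number into the same file.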

Many thanks for any help.
Andrew Burgess

 
 
 

Post by Andrew Hoddinot » Sat, 23 Aug 1997 04:00:00


Quote: Andy Burgess writes:

    Andy> Kernel panic: EXT2-fs panic (device 08:03): ext2_read_inode:
    Andy> bad inode number: 0 This is my /usr partition. This almost
    Andy> always happens during heavy SCSI bus activity (often at 2AM
    Andy> when the incremental backup starts) Its happened with
...

I have had similar problems with one of our servers since upgrading to
2.0.30 with 2.0.31-pre patches applied (I'm not sure which, I'm afraid
-- RedHat 2.0.30-3 source package recompiled with mainly default
configuration except that ncr53c7xx driver is not modular). We have 10
user machines running _exactly_ the same kernel without fault and two
further servers running otherwise identically configured but SMP
kernels.

Only this one machine displays the problem -- it is the only non-SMP
machine to actually have a SCSI disk in, but both the SMP machines do
and are seemingly fault-free -- and anyway it's not the SCSI disk that
is getting corrupted, although I too get the feeling it happens
during/after backups.

I have been getting regular corruption of the root and /var
partitions, with the kernel complaining of file table overruns, bad
inode numbers, bad block numbers etc. After it had happened a couple
of times I downed the server, fsck'd the disk, and copied the entire
contents onto a spare disk with cpio. A couple of weeks later and the
errors started again, now almost daily (lucky this is only our primary
NIS, NFS and DNS server and not something important, eh ;-).

Following a power failure last weekend the machine rebooted unattended
with no problems, but a couple of hours later it failed again, this time
beyond recovery -- e2fsck -y generates a lost+found directory with
several thousand numbered inodes in it, and where files are preserved
with names intact in a subdirectory, the files often contain the wrong
things.

I have reconstructed the entire machine (except the /home SCSI disk
which seems fine even now :-) from scratch and so far there are no
problems, but it would be really nice to track down what went wrong.

If there are any gurus out there who can suggest some sensible
diagnostics to run I still have a pristine copy of the dead /var
partition -- I happened to have another partition of identical size to
hand so I ran the e2fsck on a dd'd copy and the original crashed disk
is just as it was :-) I looked at the ext2ed docs, but I'm afraid
things were a bit beyond me and I was more concerned with getting the
systems up again before everyone came in to work the following morning.
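
The approach described above, running repairs only on a copy so that the damaged original stays untouched, can be sketched as a plain byte-for-byte copy. This is an illustrative stand-in for dd, not a replacement for it: the function name and the example paths are hypothetical, and unlike dd with conv=noerror it simply stops at the first read error.

```python
def copy_image(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path byte for byte and return the bytes copied.

    Illustrative only. src_path would be the damaged partition and
    dst_path a spare partition or image file at least as large.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of the source device or file
            dst.write(chunk)
            copied += len(chunk)
    return copied


# Hypothetical usage; both paths are placeholders, not taken from the post:
# copy_image("/dev/damaged_partition", "/spare/var.img")
# e2fsck would then be run against the copy, never the original.
```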

Suggestions please?

--
Andrew Hoddinott                    Advanced Rendering Technology Ltd.
Software Engineer               Mount Pleasant House, 2 Mount Pleasant

http://www.art.co.uk/                 Tel. +44 1223 563854 Fax. 516520

P.S. It may just be that it is the last thing to try and write the
disk before the whole shebang goes down, but _all_ of the crashes are
preceded by syslogd complaining bitterly that many of its log files /
directories don't exist any more -- so it might conceivably be part of
the problem.

 
 
 

1. PROBLEM: ext2-fs panic (inode=2, block=16777477)

   I run Linux on one of my PCs, and it's usually on all the time. I
usually have no problems with it, except that one time, after it had been
on for a few days, I went to shut it down and got:

/sbin/halt: Can not exec, wrong architecture.

   I looked around a bit, and *everything* was screwed, empty dirs,
"cannot find .", very large file sizes, etc.  One of the only things that
did work was a sync, so I synced the filesystem and turned the computer off.

   Now when I reboot it I get:

Kernel panic: ext2-fs panic: /dev/hda2 Cannot read Inode
inode=2, block=16777477.

   Then the system just sits there.  I have tried booting from floppy and
mounting the partition, but it does the same (but doesn't report the
inode).  I have also used the setup util to mount the drive; it also
hangs, but I can break out to a shell. Any attempted access to the disk
then locks up the system (due to incomplete mounting).  So, in short, I
cannot mount it AT ALL.
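
   The block number in the panic message above is easier to read in hexadecimal. The sketch below just splits the value into its highest set bit and the remainder; the function name is made up for illustration, and the arithmetic by itself says nothing about what caused the bad value.

```python
def split_block_number(block):
    """Return (hex string, highest set bit as a value, remainder)."""
    highest = 1 << (block.bit_length() - 1)
    return hex(block), highest, block - highest


# Block number reported in the panic message quoted above.
print(split_block_number(16777477))
# ('0x1000105', 16777216, 261), i.e. 2**24 + 261
```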

   Re-installing the system, yes, that's what I will do, but I have some
important files on that system (particularly source code that I don't want
to have to reprogram) that I need to retrieve if possible.  Is there any
way that this can be fixed or my files can be retrieved?  Even
partially?  Where can I get the utils, etc?

   Any help would be GREATLY APPRECIATED!  Please e-mail me the response,
post it if you wish, but at least copy me in email. I'd like to get this
fixed ASAP if that's even possible.  Thanks!

