1. When good filesystems go bad...
(Apologies if offtopic; I couldn't figure out where else this should
go.)
To start: Running SuSE 7.0 with kernel 2.4.3 on a T-bird 850 with a VIA
686 chipset, using ReiserFS for all partitions. About 2 weeks ago, had
some sort of odd machine-freeze which left a big chunk of zeroes at the
end of /var/log/messages. Didn't think too much about it, just rebooted
and the machine seemed fine.
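For anyone who wants to check a log for the same symptom, here is a minimal Python sketch that counts the NUL bytes at the end of a file. The function name trailing_nul_bytes and the chunk size are my own choices for illustration, not part of any existing tool.

```python
import os
import sys


def trailing_nul_bytes(path, chunk_size=65536):
    """Return the number of consecutive NUL (zero) bytes at the end of a file."""
    count = 0
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        # Walk backwards through the file one chunk at a time.
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            chunk = f.read(step)
            stripped = chunk.rstrip(b"\x00")
            count += len(chunk) - len(stripped)
            if stripped:
                # Hit a non-NUL byte, so the run of zeroes ends here.
                break
    return count


if __name__ == "__main__":
    # Usage: python script.py /var/log/messages [more files...]
    for log_path in sys.argv[1:]:
        print(log_path, trailing_nul_bytes(log_path))
```

A nonzero count on a text log is a hint that the last writes never made it to disk properly, which is worth following up before the filesystem gets worse.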
Now ReiserFS is a good thing: no need to fsck after a crash, great
performance, etc. Except when it goes bad. So, what happens when you
get a string of kernel oopses related to ReiserFS? My plan was to
restore from backup, until I realized I didn't have the KDE2 binaries
or libs backed up, and slurping ~40 MB of RPMs across a dialup
connection is painful.
Booted from the SuSE rescue CD, then ran reiserfsck. It exited with an
error to the effect of "couldn't open journal." Ground my teeth,
grabbed my laptop, and used it to download the latest versions of the
ReiserFS tools from http://www.namesys.com/. Stuck the binaries of the
tools and their manpages on a floppy, then ran reiserfsck from the
floppy.
"Wow, look at all the errors it's outputting! Things must have been
seriously damaged." Rebooted the machine. Problems persisted. Read the
man page again and realized that reiserfsck is totally different from
e2fsck: by default it only reports errors and repairs nothing. Tried
again from the rescue system, using the -x option to reiserfsck ("Fix
fixable errors"). This time, the output indicated that there were
problems on /var and /. Yikes. Rebooted again, trusting that all was
well.
Nope. Booted the rescue system for the 3rd time and ran reiserfsck with
the --rebuild-tree option ("Try to rebuild filesystem from scratch").
Yep, there were bad problems with /. Finally, everything is working
right!
...Or not. Booting normally coughed up "/dev/null: Not found". AARGH!
Booted the rescue system for the 4th time and combed through /dev for
mangled or missing devices. /dev/null and /dev/dsp* all existed, but as
character devices with major and minor numbers of 0. Used mknod to
recreate them with the correct numbers.
Finally, the system boots normally. I'm not sure if I trust it though,
and will restore from the aforementioned backup once I've gotten those
KDE2 RPMs and put them somewhere safe.
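Combing through /dev by hand is tedious, so here is a rough Python sketch of the same check: it lists device nodes whose major and minor numbers are both 0, which is the damage described above. The function names is_suspect_device and find_suspect_devices are hypothetical, and the assumption that a 0,0 node is always bogus is mine; treat the output as a list of things to look at, not a verdict.

```python
import os
import stat


def is_suspect_device(mode, rdev):
    """True for a character or block device whose major and minor are both 0."""
    is_dev = stat.S_ISCHR(mode) or stat.S_ISBLK(mode)
    return is_dev and os.major(rdev) == 0 and os.minor(rdev) == 0


def find_suspect_devices(dev_dir="/dev"):
    """Return sorted paths under dev_dir that look like mangled device nodes."""
    suspects = []
    for name in sorted(os.listdir(dev_dir)):
        path = os.path.join(dev_dir, name)
        try:
            # lstat so that symlinks are examined themselves, not followed.
            st = os.lstat(path)
        except OSError:
            continue  # entry vanished or is unreadable; skip it
        if is_suspect_device(st.st_mode, st.st_rdev):
            suspects.append(path)
    return suspects


if __name__ == "__main__":
    for suspect in find_suspect_devices():
        print(suspect)
```

On Linux, the allocated device list gives /dev/null as character device 1,3 and /dev/dsp as 14,3, so after removing a bad node, something like "mknod /dev/null c 1 3" (as root) recreates it.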
The practical upshot:
0. Don't Panic
1. Make Backups Frequently
2. reiserfsck is not as straightforward as e2fsck
3. Even with kernel 2.4.3, there may still be problems on the VIA 686
chipset.
Hope this helps someone....
--
Matt G|There is no Darkness in Eternity/But only Light too dim for us to see
Brainbench MVP for Linux Admin / Workin' in a code mine, hittin' Ctrl-Alt
http://www.brainbench.com / Workin' in a code mine, whoops!
-----------------------------/ I hit a seg fault....
2. Stylewriter II and lpr
3. bad partition on good disk won't mount or fsck (bad magic number/superblock)
4. Corrupted .bash: How to recover?
5. Filesystem semantics protecting meta data ... and users data
6. How to restore partition table with gpart?
7. How do you add a bad block to e2fs bad block list?
8. Volume Manager
9. ROOTVG going bad -HELP
10. How to remove a hdisk which is not belong to rootvg but has a rootvg label
11. hd3 outside of rootvg is bad news
12. About mirror rootvg and the contents in the rootvg automatically synchronize
13. Replacing bad rootvg disk...