FreeBSD 4.4 - STABLE lockup

Post by mic.. » Fri, 05 Oct 2001 18:17:17



Hello,

This morning I ran into a very annoying problem
on a machine I had upgraded to 4.4-STABLE yesterday.
I describe it here in case somebody has seen something similar,
or in case it is useful to someone.

I was writing to a DOS floppy when the machine locked solid.
The last message logged by syslog is
Oct  4 09:15:58 niobe /kernel: fd0c: hard error writing fsbn 19 (ST0 40<abnrml>
Note that the floppy was mounted, so this may be a remnant of the infamous
bug about writing to write-protected floppies, which I thought had been
eliminated. But the machine did not panic.

After that the screen remained normal, but the keyboard and mouse did not
respond. The machine answered ping, but I could not log in over the network
to reboot it and had to hit the reset button.

Unfortunately the reboot went very badly: fsck was not able to preen the
root partition and I was asked to run fsck manually. But superblocks
0 and 32 were corrupted with BAD MAGIC NUMBER, something I had never seen
before. Of course I had no record of the positions of the alternate superblocks.
After an hour of reading man pages, I discovered that newfs has an option
-N which goes through the motions of creating a filesystem without writing
anything to disk. I ran
newfs -N /dev/da0s1a
which showed that one copy of the superblock was at 65558.
Fortunately
fsck -b 65558 /dev/da0s1a
worked and offered to repair the superblocks. A trick worth
remembering! After that the machine was able to reboot.

Conclusion: I suspect there is a bug in the filesystem management
at present (since the introduction of dirpref?). Note that
softupdates is not the culprit, since root was the only partition
running with softupdates disabled. Incidentally, I fscked all the
other partitions (softupdates enabled) manually without any problem.

--
Michel Talon

 
 
 


Post by Roy Shimmy » Sat, 06 Oct 2001 07:22:38




I had a similar lockup, but mine did not occur while doing anything with a
floppy. I was updating Perl modules through CPAN, and the machine completely
froze: no mouse or keyboard input. I got less information from my
experience than you did, because there didn't seem to be any evidence of
it, not even with the 'last' command. The only other time my computer
crashed before that, it was reported when I issued the 'last' command.

An interesting note about this crash is that I was able to reproduce it.
I don't know which factors are directly responsible, but this is what I
was running both times: setiathome on both CPUs, XFree86
4.1 with icewm, Netscape, Gaim, rxvt, make clean, and a Perl module
install via CPAN. I immediately uninstalled setiathome, mostly because I
decided it wasn't really important for me to help find aliens at the
expense of every free CPU cycle. I haven't had any crashes since, but I
haven't tried the remaining combination of running programs because I
really didn't want to see my computer crash like that again. It felt
like I was using a Mac or something. Very bad. I had never seen FreeBSD
crash until I updated to 4.4, so I feel this is something serious.
--
Free your ass and your mind will follow.