2.4.18-pre8 - Good news and bad news...

Post by Michael H. Warfiel » Thu, 07 Feb 2002 13:50:06



All,

        I've been trying to work around a problem since 2.4.16 where
the kernel would Oops on my gateway system with the EIP = 0010:5a5a5a5a.
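A side note on reading that EIP: a value made of one repeated byte usually means the CPU jumped through a pointer that held a fill pattern rather than a real address. The sketch below only checks for that shape. The pattern table is an assumption, not something stated in this post: as far as I recall, 0x5a and 0xa5 are the poison bytes the 2.4 slab allocator writes when slab debugging is enabled. The helper names (parse_eip, repeated_byte, describe) are made up for illustration.

```python
# Hypothetical helper for eyeballing an Oops EIP such as "0010:5a5a5a5a".

# Assumption: poison bytes used by the 2.4 slab allocator with slab debugging on.
ASSUMED_PATTERNS = {
    0x5A: "assumed slab poison byte",
    0xA5: "assumed slab end-of-poison byte",
}


def parse_eip(text):
    """Split 'segment:offset' (both hex) into two integers."""
    segment, _, offset = text.partition(":")
    return int(segment, 16), int(offset, 16)


def repeated_byte(value, width=4):
    """Return the byte if all `width` bytes of value are identical, else None."""
    data = value.to_bytes(width, "big")
    return data[0] if len(set(data)) == 1 else None


def describe(text):
    """Give a one-line guess about what an EIP value looks like."""
    _, offset = parse_eip(text)
    byte = repeated_byte(offset)
    if byte is None:
        return "no repeated-byte pattern"
    label = ASSUMED_PATTERNS.get(byte, "unknown fill pattern")
    return "repeated byte 0x%02x (%s)" % (byte, label)
```

Calling describe("0010:5a5a5a5a") flags the repeated 0x5a byte. It says nothing about which subsystem freed or never initialised the memory; that still needs the full Oops trace.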

        Several people had excellent suggestions, most of which had
no effect beyond eliminating the innocent (which is what I would hope
for anyways).  My thanks to all.

        One person asked if I was using DevFS.  I was, since I also work
on and test the Computone Multiport drivers and need to have that working
with DevFS.  They suggested disabling DevFS which I did and which had
absolutely no effect on the 5a5a5a5a problem.

        After seeing a post from Alan Cox about 2.4.18-pre7, I compiled
that up and started the gateway on it.  After it ran for a day (which
previous versions had NOT done) I left it to run for the week I was
in New York for LinuxWorld Expo.  OK, so I'm a DAMN IDIOT who likes to
live dangerously.  My SO knew how to reboot the gateway in case it went
*up, which it did almost a week later.  It was set up to reboot to
a safe kernel (2.2.20) till I could get back and autopsy the corpse.
No problem...  :-)

        The good news...  The 5a5a5a5a Oops seems to be gone.  The Oops
in 2.4.18-pre7 was different and took a LOT longer to blow.  I didn't
try to reproduce it, since 2.4.18-pre8 was out.

        Now for the bad news...

        I built a new kernel with 2.4.18-pre8 and the latest FreeS/WAN
(1.95).  The only reason I mention FreeS/WAN is that this is the only
kernel mod from the stock tree and patches.  I also turned DevFS back
on, so I could test that.  I discovered that 2.4.18-pre8 would only
come up about (actually exactly) 50% of the time.  If it was a clean
reboot, I would get an Oops in the scheduler pretty early during
initialization.  The EIP in the Oops seemed different every time.
If it had to fsck the file systems, it wouldn't generate an Oops and
would boot, but I had trouble with a site which was logging into my
gateway over PPP and the Computone board.  There seemed to be no traffic
after the initial "CONNECT" (which occurs with CF unasserted).  Problems
just seemed to be bizarre and unpredictable beyond the "every other boot"
weirdness.

        A little more investigation indicated that the every-other-boot
Oops was occurring EXACTLY when the system was mounting "other filesystems"
which, in this case, meant DevFS and usbdevfs.  That's when I remembered
re-enabling DevFS in the build.  I re-disabled DevFS, rebuilt the kernel,
and then the system booted and my remote site managed to log in successfully,
first time, no problem.  This was after MULTIPLE failures with DevFS enabled.
(Hours of time shot.)

        Sooo...

        Good news...  My reported 5a5a5a5a Oops appears to have evaporated
with the changes that went in around 2.4.18-pre7.  Congrats and thanks!

        Bad news...  DevFS SEEMS to have problems.  Since I disabled it
early in the 2.4.18-pre series and could never get 2.4.17 stable, I have
no idea where the problem was introduced.  I did NOT see this in 2.4.16.

        I'm not looking for suggestions this time around, just reporting
observations.  My gateway is now running 2.4.18-pre8 and I'll report any
Oops if and when it occurs and deal with that then.  I may see if the
DevFS problem exists on my other systems, but my high traffic gateway
has been the only system to exhibit some of these problems to date.

        Mike
--

  /\/\|=mhw=|\/\/       |  (678) 463-0932   |  http://www.veryComputer.com/
  NIC whois:  MHW9      |  An optimist believes we live in the best of all
 PGP Key: 0xDF1DD471    |  possible worlds.  A pessimist is sure of it!
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://www.veryComputer.com/
Please read the FAQ at  http://www.veryComputer.com/

 
 
 

2.4.18-pre8 - Good news and bad news...

Post by Daniel Phillip » Thu, 07 Feb 2002 17:40:10



>    After seeing a post from Alan Cox about 2.4.18-pre7, I compiled
> that up and started the gateway on it.  After it ran for a day (which
> previous versions had NOT done) I left it to run for the week I was
> in New York for LinuxWorld Expo.  OK, So I'm a DAMN IDIOT who likes to
> live dangerously.  My SO knew how to reboot the gateway in case it went
>*up, which it did almost a week later.  It was set up to reboot to
> a safe kernel (2.2.20) till I could get back and autopsy the corpse.
> No problem...  :-)

>    The good news...  The 5a5a5a5a Oops seems to be gone.  The Oops
> in 2.4.18-pre7 was different and took a LOT longer to blow.  I didn't
> try to reproduce it, since 2.4.18-pre8 was out.

and the post from Alan was?

--
Daniel

 
 
 

1. Very High Load on Disk Activity in 2.4.18 (and 2.4.18-pre8)

Hello,

I'm experiencing a strange effect. As soon as there is some higher disk
activity (untarring the linux kernel is enough, which really should be no
problem) the system load gets really high (sometimes over 10) but the CPU
is 100% idle (reported by top).
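For anyone chasing the same symptom: on Linux the load average counts tasks in uninterruptible sleep (state D, typically waiting on disk I/O) as well as runnable ones, so a high load with an idle CPU is at least consistent with many tasks blocked on I/O. Below is a minimal sketch for counting them, assuming the usual /proc/<pid>/stat layout of "pid (comm) state ...". The function names are hypothetical, and this does not diagnose the poster's actual problem.

```python
import glob


def task_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces
    or ')', so split on the last ')' rather than on whitespace.
    """
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def uninterruptible_tasks(proc_root="/proc"):
    """Return the PIDs (as strings) of tasks currently in state D."""
    blocked = []
    for path in glob.glob(proc_root + "/[0-9]*/stat"):
        try:
            with open(path) as handle:
                state = task_state(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited, or the line was unreadable
        if state == "D":
            blocked.append(path.split("/")[-2])
    return blocked


def report():
    """Print the load average next to the current D-state task list."""
    with open("/proc/loadavg") as handle:
        print("loadavg:", handle.read().strip())
    print("tasks in D state:", uninterruptible_tasks())
```

Running report() a few times during the untar should show whether the load figure tracks the number of D-state tasks.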

Usually the system freezes for several minutes (while still replying to
pings) and is fully usable again once the disk I/O has finished.

There are no error messages in any of the log files, and nothing seems to be
wrong except for these _really_ annoying 3-10 minute freezes under high
activity.

I really have no idea what to do. I'm very sure that my SCSI wiring
and the other hardware are OK. There are no error messages or warnings in
"dmesg" or the normal system logs. I ran various hardware tests; all passed.

I read about similar problems in earlier versions of the 2.4.x kernels, but
these are reported to be fixed. Are there any known problems in the Mylex
drivers or the kernel's VM (and hopefully workarounds) that can cause this
behaviour?

Thanks in advance, and sorry for my bad English.

regards

jan schreiber

More info about the machine:

- Mylex AcceleRAID 352 (flashed to latest firmware)
- Intel Pentium III 1.2 GHz, 512 MB RAM, Intel i815 chipset
- 10 x 36 GB SCSI drives connected (8 on channel 0, 2 on channel 1), no other
devices connected
- Built one RAID-5 drive out of 9 disks; the 10th disk is a spare
- three primary partitions (512 MB swap, 2 GB system, rest data partition
(approx. 270 GB))
- Running SuSE 8.0 Pro
- Kernel 2.4.18 (from SuSE; built a new vanilla 2.4.18 and pre8 with the same
results)
- using reiserfs 3.6 on all partitions (except swap, of course)

With kind regards
cionix GmbH

Jan Schreiber
Geschäftsführer | chief executive officer

_____________________________________
cionix GmbH
Bredower Str. 45       D-14612 Falkensee

Tel: [+49] ?3322.2336-10
Fax: [+49] ?3322.2336-11


_____________________________________


2. exabyte tape drive problems

3. USB lockup in 2.4.18-pre8 <-- typo 2.4.19-pre8

4. ISDN / Network

5. 2.4.18-pre8 + 2.4.17-pre8-ac3 + rmap12c + XFS Results

6. Newbie Question

7. Good news, bad news

8. Promise 2300+ driver question

9. Linux Kernel Crash - Vanilla 2.4.18/Redhat 2.4.18-5 (2nd try =) )

10. Linux Kernel Crash - Vanilla 2.4.18/Redhat 2.4.18-5

11. what is the difference between 2.4.18-14 and 2.4.18-17.8.0

12. Linux 2.4.18-pre8

13. 2.4.18-pre8-mjc: compile errors