Strange problem: init taking 99.9% of CPU + cannot reboot

Post by umask 0 » Thu, 04 Nov 1999 04:00:00



Hi there,

My company is running a web server on a dual-processor
Pentium II 400 MHz box with 256 MB of RAM.  The O/S is
SuSE Linux 6.0 with a 2.2.9 kernel.

This morning I got a call from my manager saying that our
"web site was down."  Fearing the worst, I ssh'ed to the
machine and found the following:

        1. `top' showed that init was taking 99.9% of the CPU
        2. cat /proc/kmsg (as root) showed a bunch of
           'Out of memory for . . .' messages
        3. `free' showed about 8K RAM available
        4. /var/log/console had a few 'Unable to load interpreter'
           messages
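
For anyone who wants to capture the same picture in one pass, here is a
minimal sketch of checks 1 and 3.  The helper names `mem_free_kb' and
`top_cpu' are made up for illustration, and the sketch assumes a
/proc/meminfo that has a `MemFree:' line reported in kB; I have not
verified this on every kernel version.

```shell
#!/bin/sh
# Hypothetical helper: print the MemFree value (in kB) from a
# meminfo-style file.  Defaults to /proc/meminfo; a different path
# can be passed as the first argument.
mem_free_kb() {
    awk '$1 == "MemFree:" { print $2 }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: list the five processes using the most CPU,
# sorted on the %CPU column (third field) of `ps aux'.
top_cpu() {
    ps aux | sort -k3,3nr | head -5
}
```

Calling `mem_free_kb' and then `top_cpu' from a root shell should give
roughly what `free' and `top' showed me, in a form that is easy to
append to a file for later comparison.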

After su'ing to root, I stopped the web server (Apache 1.3.4) and
started killing processes with SIGTERM one by one, but the amount of
free memory did not increase significantly.  /etc/inittab looked
OK.  I then tried to "kick" init with `/sbin/telinit q' and
`/sbin/telinit u', but to no avail.  Finally, I decided to reboot,
but that didn't work either!  I tried these:

        /sbin/shutdown -r now
        /sbin/reboot
        /sbin/telinit 6

but the box just kept on going.  I ended up calling the NOC where
the box is located and asking them to toggle the switch :-(

Another funny thing is that when I looked at the files in /var/log
later in the day, they had no entries for the period from Oct 28 13:xx
until the moment of reboot.  It almost looked like someone edited them . . .

Being an aspiring sysadmin, I would truly appreciate any input on
these symptoms.  My main questions are

        1. What can prevent things like /sbin/reboot from working?
           Was it because I was only su'ed to root instead of
           being logged in from the console?
        2. What does the 'Unable to load interpreter' message mean?
        3. Does anyone know of any memory leak problems in Apache 1.3.4,
           especially in the `rotatelogs' program?

A few more details (sorry for such a long explanation:-) :

        1. I am not the only person who knows the root password and
           there's nothing I can do about it
        2. I inherited this host as it is and was told not to make
           any drastic changes to it
        3. The host had blatant security holes: I did my best
           to fix them but I am sure there are more (I was told
           _not_ to fix at least one because it was a "feature")

Once again, all comments and suggestions are very much welcome.
Please reply to the newsgroup or directly to


Many thanks in advance,
Sergey

 
 
 

1. gimp-0.99.9: cannot enter text

As soon as I click on the text-button, gimp crashes with a
SIGSEGV.

The gdb output is:

Program received signal SIGSEGV, Segmentation fault.
0x80bafb7 in text_create_dialog (text_tool=0x83724f0) at text_tool.c:479
479       font_infos[0] = font_info[0]->foundries;

Everything else seems to work fine.

-Norbert

2. help with web server and dynamic ip address please

3. can't reboot, init high CPU

4. Dsl-200

5. Use of GIMP-0.99.9 scripts

6. windows 2000 box and sun ultra 1 box with solaris 8

7. Can't login to .99.9 please help

8. secondary IDE controller problems....

9. Warning ! Bug in keyboard driver of 99.9

10. X331+pgcc: compiled 99.9% ok

11. Help compiling 99.9 kernel.

12. 3c503 and 99.9 kernel - assistance (compiling) needed

13. 99.9 kernel bounces logins why ??