Hi there,
My company is running a web server on a dual-processor
Pentium II 400 MHz box with 256 MB of RAM. The OS is
SuSE Linux 6.0 with a 2.2.9 kernel.
This morning I got a call from my manager saying that our
"web site was down." Fearing the worst, I ssh'ed to the
machine and found the following:
1. `top' showed that init was taking 99.9% of the CPU
2. cat /proc/kmsg (as root) showed a bunch of
'Out of memory for . . .' messages
3. `free' showed only about 8 KB of RAM free
4. /var/log/console had a few 'Unable to load interpreter'
messages
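In case anyone wants to compare numbers on their own box, here is a
minimal sketch of the free-memory check from item 3 as a script. It
assumes a Linux /proc/meminfo that carries a "MemFree:" line, and the
function name mem_free_kb is made up for illustration; it is not a
standard tool and not the exact command I ran.

```shell
#!/bin/sh
# Hypothetical helper: print the MemFree value (in kB) from
# meminfo-style text read on stdin. Assumes a "MemFree: <n> kB" line.
mem_free_kb() {
    awk '$1 == "MemFree:" { print $2; exit }'
}

# Report free memory on this host if /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    echo "MemFree: $(mem_free_kb < /proc/meminfo) kB"
fi
```

Reading from stdin keeps the helper usable on a saved copy of
/proc/meminfo as well as on the live file.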
After su'ing to root, I stopped the web server (Apache 1.3.4) and
started killing processes with SIGTERM one by one, but the amount of
free memory did not increase significantly. /etc/inittab looked
OK. I then tried to "kick" init with `/sbin/telinit q' and
`/sbin/telinit u', but to no avail. Finally, I decided to reboot,
but that didn't work either! I tried each of these:
    /sbin/shutdown -r now
    /sbin/reboot
    /sbin/telinit 6
but the box just kept on going. I ended up calling the NOC where
the box is located and asking them to toggle the switch :-(
Another odd thing: when I looked at the files in /var/log later in
the day, they had no entries for the period from Oct 28 13:xx until
the reboot. It almost looked as if someone had edited them . . .
Being an aspiring sysadmin, I would truly appreciate any input on
these symptoms. My main questions are:
1. What can prevent things like /sbin/reboot from working?
Was it because I was only su'ed to root instead of
being logged in from the console?
2. What does the 'Unable to load interpreter' message mean?
3. Does anyone know of any memory leak problems in Apache 1.3.4,
especially in the `rotatelogs' program?
A few more details (sorry for such a long explanation :-):
1. I am not the only person who knows the root password and
there's nothing I can do about it
2. I inherited this host as it is and was told not to make
any drastic changes to it
3. The host had blatant security holes: I did my best
to fix them but I am sure there are more (I was told
_not_ to fix at least one because it was a "feature")
Once again, all comments and suggestions are very much welcome.
Please reply to the newsgroup or to me directly.
Many thanks in advance,
Sergey