Ever since I upgraded to Solaris 2.5 our system has had processes that
just decide suddenly to go into a loop, consuming most of the CPU
resources. The machine in question has about 10,000 users, of whom at
any one time there are about 100 logged on. The looping processes make
the machine very slow and eventually almost unusable. I have to
constantly monitor the system and kill them. We have had no answers
from Sun as yet, although the problem started at the beginning of
March. The system is not used for X, so almost all the processes are
text based.
The processes all have certain characteristics. They are invoked from a
C program with a system call. They are all programs that require
keyboard input and access the network, such as "telnet". When I truss
them they are mostly looping on a read that returns "0" (zero) each
time. The processes only seem to go rogue when there are more than 400
processes running on the system in total. I have seen these runaway
processes on other, less heavily used systems, but much more rarely.
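
One possible reading of the truss output, offered as an assumption and
not a confirmed diagnosis: a read() that returns 0 means end-of-file,
which on a terminal can follow a hangup, and a program that retries
instead of stopping will spin. A minimal sketch of a read loop that
treats 0 as end-of-file is below; drain_input() is a hypothetical
helper written for illustration, not code from any of the affected
programs.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until end-of-file.
 * Returns the total number of bytes read, or -1 on a real error. */
long drain_input(int fd)
{
    char buf[512];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n > 0) {            /* got data: count it and keep reading */
            total += n;
            continue;
        }
        if (n == 0)             /* end-of-file or hangup: stop, do not retry */
            return total;
        if (errno == EINTR)     /* interrupted by a signal: safe to retry */
            continue;
        return -1;              /* any other error: give up */
    }
}
```

A program that instead calls read() again after getting 0 would show
the same pattern in truss that I am seeing.
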
Could you please send any replies to me via email, as my news server
is not functioning correctly at the moment. I will summarise responses
if I get any and anyone else is interested.
--
+-------------------------+--------------------------------------------+
| Computer Centre ISD | PGP Public Key : |
| PO Box 1 Belconnen | Phone : (06)201 5512, (06)201 5500 |
| ACT Australia 2617 | Fax : (06)201 5502, (06)201 5501 |
+-------------------------+--------------------------------------------+
| born again and again and again and again and again buddhist |
| well... in a previous life anyway :-) |
+----------------------------------------------------------------------+