linux stability problem

Post by Paul H. Hargrove » Thu, 03 Apr 1997 04:00:00



[Sorry to those who see this twice.  It went to c.o.l.d.a by mistake
the first time, so I am reposting it here.]

[snip]

Quote:> This looks bad. Without any limits set, the system should swap
> for a long time and then kill the process. Unless this poster
> failed to wait long enough, page tables ate all the memory.

[snip]

Has anyone yet verified that the problem is one of page tables filling
all of physical memory?  If so, I'd consider this a serious bug for which
we should try to find solutions.  Note that I am trying to initiate
discussion; I am not volunteering to do the work.  I really don't know
enough about the mm subsystem in Linux to do much.

The first thing that comes to mind is to abandon lazy allocation.
However, I have no intention of starting that thread again.  Besides,
even without lazy allocation, if I have very little physical memory and a
lot of swap space then it is still possible to overrun physical memory
with page tables before virtual memory becomes overcommitted.  So let's
look for other solutions.
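
To help anyone who wants to verify the page-table theory, here is a quick
toy program (my own sketch, not something from the original report): on
i386 each page table covers 4 MB of address space, so touching a single
byte in every 4 MB allocation makes the kernel set up a new page-table
page while only one data page per chunk is ever used.

#include <stdio.h>
#include <stdlib.h>

#define STEP (4UL * 1024 * 1024)   /* one i386 page table covers 4 MB */

int main(void)
{
    unsigned long chunks = 0;

    for (;;) {
        char *p = malloc(STEP);

        if (p == NULL) {
            printf("malloc failed after %lu chunks\n", chunks);
            return 1;
        }
        p[0] = 1;   /* fault in one data page and, usually, one new page-table page */
        chunks++;
    }
}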

Another possibility is to make the page tables swappable.  Note the
distinction between swapping and paging.  What I am suggesting is that
in the rare case when all of physical memory is taken up with page
tables select a process (a sleeping one if possible) and write ALL of
its page tables to swap space and reclaim the memory those page tables
previously occupied.  Since this is (hopefully) a rare condition there
shouldn't be too many concerns about the granularity.  I doubt there is
really a need for LRU paging of page tables.  The page tables for the
kernel would, of course, not be swappable.

A third possibility which occurs to me is to add per-process and/or
per-user resource limits for page-table space.  The problem I see with
this is that many things need to be changed if we add a new limit to the
rlimit mechanism (struct rlimit and the RLIMIT_* constants), such as the
(u)limit built-in shell commands.  Of course, if this were done with a
similar but separate mechanism we'd not need to change a bunch of stuff.
This also doesn't cure the problem, since the only way this mechanism
would be 100% effective on a machine with many users would be to make
the limit unusably small.
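
For reference, this is roughly what the existing per-process mechanism
looks like today; the RLIMIT_DATA call below is real, but a limit on
page-table space would need a new RLIMIT_* value of its own plus matching
support in the shell's ulimit built-in:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    /* cap this process's data segment at 16 MB; children inherit it */
    rl.rlim_cur = 16 * 1024 * 1024;
    rl.rlim_max = 16 * 1024 * 1024;
    if (setrlimit(RLIMIT_DATA, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }

    /* a limit on page-table space would need a new RLIMIT_* constant here */
    return 0;
}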

The simplest solution which comes to mind is just to kill a process when
we are unable to allocate a page for the page tables.  This could either
be the process which uses the most memory, the most page-table space
(not always the same as most memory), or the process which needed the
page.  As I understand it, this is what is done now when a "normal" page
is needed for a process and cannot be allocated.
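
To make the choice of victim concrete, here is a rough user-space sketch
of the "most page-table space" policy; the structure and the numbers are
invented for illustration, this is not kernel code:

#include <stdio.h>

/* hypothetical per-process accounting, not a real kernel structure */
struct proc_info {
    int pid;
    unsigned long rss_pages;        /* resident data pages */
    unsigned long page_table_pages; /* pages holding page tables */
};

/* pick the process with the most page-table pages */
static int pick_victim(const struct proc_info *tab, int n)
{
    int victim = -1;
    unsigned long worst = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (tab[i].page_table_pages > worst) {
            worst = tab[i].page_table_pages;
            victim = i;
        }
    }
    return victim;
}

int main(void)
{
    struct proc_info tab[] = {
        { 101, 2000,  40 },   /* big RSS, modest page tables */
        { 102,  300, 900 },   /* sparse mappings, huge page tables */
        { 103,  500,  60 },
    };
    int v = pick_victim(tab, 3);

    printf("would kill pid %d\n", tab[v].pid);
    return 0;
}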

Comments, corrections and cash are welcome :-)
--
Paul H. Hargrove               All material not otherwise attributed

 
 
 

linux stability problem

Post by Joerg Senekowitsc » Fri, 04 Apr 1997 04:00:00




>The simplest solution which comes to mind is just to kill a process when
>we are unable to allocate a page for the page tables.  This could either
>be the process which uses the most memory, the most page tables space
>(not always the same as most memory), or the process which needed the
>page.  As I understand it this is what is done now when a "normal" page
>is needed for a process.

Hm, this sounds vaguely familiar. Isn't that what AIX does when it runs
out of memory? Randomly whacking processes left and right, mostly killing
the innocent? I've been puzzled more than once when jobs that were alive
just seconds ago were suddenly missing from our AIX box :-(

There must be a better solution.

Joerg

 
 
 

linux stability problem

Post by Greg Walk » Sun, 06 Apr 1997 04:00:00



    >> The simplest solution which comes to mind is just to kill
    >> a process when we are unable to allocate a page for the
    >> page tables.  This

    Joerg> Hm, this sounds vaguely familiar. Isn't that what AIX
    Joerg> does when it runs out of memory? Randomly whacking
    Joerg> processes left and

    Joerg> There must be a better solution.

It seems linux should have a certain small amount of swap space
permanently reserved for the superuser. That way the sysadmin
can recover the system when a user program malloc()'s out of
control.
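
Something along these lines is what I have in mind, purely as a sketch;
none of these names exist in the kernel:

#include <stdio.h>

#define ROOT_RESERVED_SWAP_PAGES 1024   /* e.g. 4 MB of swap held back for root */

/* purely illustrative check in the allocation path, not real kernel code */
static int may_use_swap_page(unsigned int uid, unsigned long free_swap_pages)
{
    if (free_swap_pages > ROOT_RESERVED_SWAP_PAGES)
        return 1;          /* plenty left: anyone may take a page */
    return uid == 0;       /* only the superuser may dip into the reserve */
}

int main(void)
{
    printf("%d\n", may_use_swap_page(1000, 500));  /* 0: ordinary user refused */
    printf("%d\n", may_use_swap_page(0, 500));     /* 1: root still gets a page */
    return 0;
}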

 
 
 

linux stability problem

Post by stephen farrel » Sun, 06 Apr 1997 04:00:00






>     >> The simplest solution which comes to mind is just to kill
>     >> a process when we are unable to allocate a page for the
>     >> page tables.  This

>     Joerg> Hm, this sounds vaguely familiar. Isn't that what AIX
>     Joerg> does when it runs out of memory? Randomly whacking
>     Joerg> processes left and

>     Joerg> There must be a better solution.

> It seems linux should have a certain small amount of swap space
> permanently reserved for the superuser. That way the sysadmin
> can recover the system when a user program malloc()'s out of
> control.

i think the problem with this proposal is that there will be many
processes owned by root (e.g., daemons) that will be simultaneously
competing for this space with the interactive process you *want* to
get it.

so, if you reserved 5% of the memory, then the offending user process
would effectively be squeezing the root processes out of the 95% and
into the 5%, with no net improvement.

i think if it were this easy it would have been solved ages ago.  NT
doesn't do much better, btw; at least in my experience it's about the
same -- grinds like a mother*er, extremely unresponsive, but will
eventually allow the process to be killed.  (well, this is what i get
with linux; i guess others posting in this thread have had different
results... solaris was the most frustrating b/c it simply commits all
memory and then no one can do anything, period).

imho, the best solution would be to move some kind of ulimit utility
into the kernel, like quota support on filesystems.  then su could run
some utility (or just echo > /dev/proc/ulimits) and set the various
ulimit stuff.  i also think that for this to be useful, it should
allow precedence to certain users, or, better yet, groups.  admins
(and myself when i'm running as a normal user on my own computer)
should be allowed to use almost all cpu time and virtual memory, but
joe user (i.e., untrusted users) should not.  on freebsd, out of the
box they have a bunch of ulimit settings like max processes, max cpu
time per process, etc.  i find this extremely annoying, since my box
is basically a single-user machine 95% of the time, and i want all of
the resources it's got.  (i know this is supposed to be configurable,
but having to set environments in shells is too much of a pain, and i
don't understand how it works -- can't a user just run a different
shell?)
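
something like this is what i'm picturing for the /dev/proc/ulimits
idea -- the file, its one-line format, and the resource name are all
invented here, nothing like it exists today:

#include <stdio.h>

int main(void)
{
    /* hypothetical interface: "uid resource value-in-kb", one rule per line */
    FILE *f = fopen("/proc/ulimits", "w");

    if (f == NULL) {
        perror("/proc/ulimits");
        return 1;
    }
    /* cap uid 1003 (joe user) at 8192 kb of virtual memory */
    fprintf(f, "1003 vmem 8192\n");
    fclose(f);
    return 0;
}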

 
 
 

linux stability problem

Post by Albert D. Cahalan » Sun, 06 Apr 1997 04:00:00




>> It seems linux should have a certain small amount of swap space
>> permanently reserved for the superuser. That way the sysadmin
>> can recover the system when a user program malloc()'s out of
>> control.

> i think the problem with this proposal is that there will be many
> processes owned by root (e.g., daemons) that will be simultaneously
> competing for this space with the interactive process you *want* to
> get it.

Sick idea: Put a rescue shell in the kernel. When active, all other
processes stop. Only the rescue shell gets scheduled. The rescue
shell never runs out of memory or swap because it uses preallocated
kernel memory. Obviously this would be a config option! I'm sure
it would waste a bit of RAM, but it could be very useful.

Quote:> so, if you reserved 5% of the memory, then the offending user process
> would effectively be squeezing the root processes out of the 95% and
> into the 5%, with no net improvement.

That is not exactly right. Give root RAM as needed, even if
other processes must get killed.

Quote:> i think if it were this easy it would have been solved ages ago.

It is not easy unless you have the money to waste on excess RAM.
Normal people need overcommitment and cannot reserve extra RAM.

Quote:> imho, the best solution would be to move some kind of ulimit utility
> into the kernel like quota support on filesystems.  then su could run
> some utility (or just echo > /dev/proc/ulimits) and set the various
> ulimit stuff.  i also think that for this to be useful, it should also
> allow precedence to certain users, or, better yet, groups.

Right now Linux does not track memory use on a per-user basis
(there is some per-process tracking).  I'd like to have the kernel
request quota information from a daemon whenever the UID changes
to something new. Then there would be no need to worry about
forgetting a limit somewhere, such as on procmail.
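
To make the daemon idea concrete, the exchange might carry something like
the structures below. None of this exists; it is only a sketch of what the
kernel would ask for the first time it sees a UID.

#include <stdio.h>

/* hypothetical query the kernel would send when it first sees a UID */
struct quota_request {
    unsigned int uid;
};

/* hypothetical reply from the daemon, cached by the kernel afterwards */
struct quota_reply {
    unsigned int uid;
    unsigned long max_vm_kb;    /* total virtual memory for this user */
    unsigned long max_rss_kb;   /* total resident memory for this user */
    unsigned long max_procs;    /* total processes for this user */
};

int main(void)
{
    struct quota_reply sample = { 1003, 65536, 16384, 64 };

    printf("uid %u: vm %lu kb, rss %lu kb, %lu procs\n",
           sample.uid, sample.max_vm_kb, sample.max_rss_kb, sample.max_procs);
    return 0;
}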
--
Albert Cahalan <acahalan at cs.uml.edu> My address may be mangled to
avoid junk email. Please check it if you wish to respond to a news post.
 
 
 

1. linux stability problems

I wrote a program to test the OS stability and I have seen that Linux is
not very stable.
I would like someone to explain to me why the system crashed.

The program is:

#include <stdlib.h>

int main(void)
{
    int *i;

    while (1) {
        /* allocate 400 bytes each pass and never free them */
        i = (int *) malloc(100 * sizeof(int));
    }
}

The system runs out of memory, does not kill the stupid program I wrote,
and then becomes unrecoverable.
