Ongoing 2.4 VM suckage


Post by Richard B. Johnson » Sat, 04 Aug 2001 04:00:10




> This just in: Linux 2.4 VM still useless.

> I have 2 GB main memory and 4GB swap on a 2-way intel machine running a
> variety of 2.4 kernels (we upgrade every time we have to reboot), and we
> have to power cycle the machine weekly because too much memory usage + too
> much disk I/O == thrash for hours.

> Gosh, I guess it is silly to use all of the available RAM and I/O
> bandwidth on my machines.  My company will just go out of their way to
> do less work on smaller sets of data.

Are you sure it's not just some user code with memory leaks? I use
2.4.1 on an embedded system with no disks, and therefore no swap. It
computes large FFT arrays to make spectrum-analyzer pictures, and it has
never had any problems with VM, or in fact any problems that could be
blamed on the operating system.

Try 2.4.1 and see if your problems go away. If not, you probably
have user-mode leakage.
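
One way to check for a user-mode leak is to sample the suspect process's
resident set size over time and see whether it only ever grows. A minimal
sketch, assuming the "VmRSS:" line that /proc/<pid>/status reports in kB;
the function name rss_kb and the one-minute interval are made up for
illustration:

```sh
#!/bin/sh
# rss_kb: read a /proc/<pid>/status-style listing on stdin and print
# the resident set size in kB (the number on the "VmRSS:" line).
rss_kb() {
    awk '$1 == "VmRSS:" { print $2; exit }'
}

# Hypothetical usage: log the RSS of process $PID once a minute.
# (date +%s is a GNU extension; any timestamp format will do.)
#   while sleep 60; do
#       echo "`date +%s` `rss_kb < /proc/$PID/status`"
#   done
```

A figure that climbs steadily while the workload stays the same points at
the program rather than at the VM.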

Cheers,
* Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

    I was going to compile a list of innovations that could be
    attributed to Microsoft. Once I realized that Ctrl-Alt-Del
    was handled in the BIOS, I found that there aren't any.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://www.veryComputer.com/
Please read the FAQ at  http://www.veryComputer.com/

Post by Richard B. Johnson » Sat, 04 Aug 2001 05:00:16


[SNIPPED...]


> My processes are not small.  They are huge.  They take up nearly all
> available memory.  And then when a lot of file I/O kicks in, they get
> swapped out in favor of RAM, then the thrashing starts, and the box goes
> to la la land.

> Are you saying that I can expect any userland process to be able to take
> the box down?

Not if you enable user quotas.

> Shit, why don't I just go back to DOS?

Because 640k doesn't hack it.

Seriously, it doesn't do any good to state that something sucks. You
need to point out the specific problem that you are experiencing.
"going to la la land.." is not quite technical enough. In fact, you
imply that the machine is still alive because of "disk thrashing".
If, in fact, you are a member of the Association for Computing Machinery
(so am I), you should know all this. Playing Troll doesn't help.

If you suspend (^Z) one of the huge tasks, does the thrashing stop?
When suspended, do you still have swap-file space?
Are you sure you have managed the user quotas so that the sum of
the user's demands for resources can't bring down the machine?
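
Per-process resource limits are one way to approximate this. As a rough
sketch (the helper name run_capped is invented here, and "ulimit -v" is a
shell extension found in bash and dash rather than something POSIX
guarantees), a job can be started with its address space capped, so that
an oversized malloc() fails in that job instead of pushing the whole box
into swap:

```sh
#!/bin/sh
# run_capped KB CMD [ARGS...]
# Run CMD in a subshell whose virtual address space is limited to KB
# kilobytes. The limit is inherited by CMD and its children only; the
# calling shell is unaffected.
run_capped() {
    limit_kb=$1
    shift
    (
        ulimit -v "$limit_kb" || exit 1
        exec "$@"
    )
}

# Hypothetical usage: cap a large job at 1 GB of address space.
#   run_capped 1048576 /tmp/try
```

The cap applies per process, so the sum over all jobs still has to be
chosen with the machine's RAM and swap in mind.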

Cheers,
* Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

    I was going to compile a list of innovations that could be
    attributed to Microsoft. Once I realized that Ctrl-Alt-Del
    was handled in the BIOS, I found that there aren't any.


Post by Richard B. Johnson » Sat, 04 Aug 2001 06:10:09




> > Seriously, it doesn't do any good to state that something sucks. You
> > need to point out the specific problem that you are experiencing.
> > "going to la la land.." is not quite technical enough. In fact, you
> > imply that the machine is still alive because of "disk thrashing".
> > If, in fact, you are a member of the Association for Computing Machinery
> > (so am I), you should know all this. Playing Troll doesn't help.

> > If you suspend (^Z) one of the huge tasks, does the thrashing stop?
> > When suspended, do you still have swap-file space?
> > Are you sure you have managed the user quotas so that the sum of
> > the user's demands for resources can't bring down the machine?

> Anyone having observed this mailing list over the last year knows the
> problem I'm talking about.  kswapd can get itself into a state where it
> consumes 100% CPU time for hours at a stretch.  During this time, the
> machine is unusable.  There is no way to kill or suspend a task because
> the shells aren't getting scheduled and they can't accept input.  During
> this time, the disks aren't running of course.  Leading up to this, the
> disks do run.  Then when kswapd steps in, they stop, or the throughput
> falls to a trickle.

> Here's a nice trick to pull on any Linux 2.4 box.  Allocate all of the RAM
> in the machine and keep it.  Now, thrash the VM by e.g. find / -exec cat
> {} \;  Watch what happens.  The kernel will try to grow and grow the disk
> cache by swapping your process out to disk.  But, there may not be enough
> room for your process and all the cache that the kernel wants, so the
> machine goes into this sort of soft-deadlock state where kswapd goes away
> for a lunch break.

Well I don't have any such problems here. I wrote this script
from your instructions. I don't know if you REALLY wanted all
the file content to go out to the screen, but I wrote it explicitly.

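#!/bin/bash
#
# Allocate roughly all of physical RAM in one process and hold it, then
# churn the page cache by reading every file on the system.
#
# 2.4-style /proc/meminfo: first number on the "Mem:" line is total RAM in bytes.
SIZ=`head --lines 2 /proc/meminfo | grep Mem | cut -d' ' -f3`
cat >/tmp/try.c <<EOF
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char *cp;

    cp = malloc($SIZ);
    if (cp == NULL) {
        perror("malloc");
        return 1;
    }
    memset(cp, 0x55, $SIZ);    /* touch every page so it is really allocated */
    pause();                   /* hold the memory until signalled */
    return 0;
}
EOF
gcc -Wall -O2 -o /tmp/try /tmp/try.c
/tmp/try &
find / -exec cat {} \;
# end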

Script started on Thu Aug  2 16:50:47 2001
# ps -laxw | grep pause
   140     0    16     1  -1 -20    712    40 pause       S < ?   0:00 (bdflush) te
     0     0  7433     1   9   0 321056 137808 pause       S    1  0:02 /tmp/try
     0     0  7631  7626  19   0    844   240             R   p0  0:00 grep pause
# cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  328048640 326234112  1814528        0  4255744 173219840
Swap: 1069268992 189992960 879276032
MemTotal:       320360 kB
MemFree:          1772 kB
MemShared:           0 kB
Buffers:          4156 kB
Cached:         169160 kB
Active:           9616 kB
Inact_dirty:     36012 kB
Inact_clean:    127688 kB
Inact_target:      780 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       320360 kB
LowFree:          1772 kB
SwapTotal:     1044208 kB
SwapFree:       858668 kB
# exit
exit

Script done on Thu Aug  2 16:51:26 2001

As you can see, it was quite successful in writing to a real-RAM-sized
region of virtual memory, then swapping it out so other stuff could run.

You can also do:
    for(;;)
       memset(cp, 0x55, $SIZ);
... and have a very slow, but usable, system. This was from the
root account with no quotas.
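
The SIZ line in the script depends on the old "Mem:" summary line. A
sketch of the same calculation from the "MemTotal:" line instead, which
also appears in the listing above; the function name is an assumption,
and the result is in bytes so it can be handed straight to malloc():

```sh
#!/bin/sh
# memtotal_bytes: read /proc/meminfo-style text on stdin and print the
# total RAM in bytes, computed from the "MemTotal: <n> kB" line.
# %.0f is used instead of %d because some awk versions clamp %d at 2^31-1.
memtotal_bytes() {
    awk '$1 == "MemTotal:" { printf "%.0f\n", $2 * 1024; exit }'
}

# Usage on a live system:
#   SIZ=`memtotal_bytes < /proc/meminfo`
```

For the listing above this gives 320360 * 1024 = 328048640, the same
number as the first field of the "Mem:" line.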

Perhaps there are problems with buffers that I don't have because
I use SCSI disks for everything.

Cheers,
* Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

    I was going to compile a list of innovations that could be
    attributed to Microsoft. Once I realized that Ctrl-Alt-Del
    was handled in the BIOS, I found that there aren't any.


Post by Jakob Østergaard » Sat, 04 Aug 2001 06:50:12




> > Well I don't have any such problems here. I wrote this script
> > from your instructions. I don't know if you REALLY wanted all
> > the file content to go out to the screen, but I wrote it explicitly.

> [snip]

> > Script started on Thu Aug  2 16:50:47 2001
> > # ps -laxw | grep pause
> >    140     0    16     1  -1 -20    712    40 pause       S < ?   0:00 (bdflush) te
> >      0     0  7433     1   9   0 321056 137808 pause       S    1  0:02 /tmp/try
> >      0     0  7631  7626  19   0    844   240             R   p0  0:00 grep pause
> > # cat /proc/meminfo
> >         total:    used:    free:  shared: buffers:  cached:
> > Mem:  328048640 326234112  1814528        0  4255744 173219840
> > Swap: 1069268992 189992960 879276032
>                              ^^^^^^^^^
> You still have almost 1GB of swap left.  I mean use all the memory in your
> box, RAM + swap.

> As I said, I expect degraded performance but not a complete meltdown.

What ?

You fill up mem and you fill up swap, and you complain the box is acting funny ??

This is a clear case of "Doctor it hurts when I ..."  -  Don't do it !

I'm interested in hearing how you would accomplish graceful performance
degradation in a situation where you have used up any possible resource on the
machine.  Transparent process back-tracking ?   What ?

--
................................................................

:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:

Post by Jeffrey W. Bake » Sat, 04 Aug 2001 07:00:11



> You fill up mem and you fill up swap, and you complain the box is
> acting funny ??

The kernel should save whatever memory it needs to do its work.  It isn't
my problem, from userland, to worry that I take the last page in the
machine.  If the kernel needs pages to operate efficiently, it had better
reserve them and not just hand them out until it locks up.

> This is a clear case of "Doctor it hurts when I ..."  - Don't do it !

> I'm interested in hearing how you would accomplish graceful
> performance degradation in a situation where you have used up any
> possible resource on the machine.  Transparent process back-tracking ?
> What ?

Gosh, here's an idea: if there is no memory left and someone malloc()s
some more, have malloc() fail?  Kill the process that required the memory?
I can't believe the attitude I am hearing.  Userland processes should be
able to go around doing whatever the * they want and the box should stay
alive.  Currently, if a userland process runs amok, the kernel goes into
self-*ing mode for the rest of the week.

-jwb


Post by Rik van Riel » Sat, 04 Aug 2001 07:10:13



> Gosh, here's an idea: if there is no memory left and someone
> malloc()s some more, have malloc() fail?  Kill the process that
> required the memory? I can't believe the attitude I am hearing.
> Userland processes should be able to go around doing whatever the
> * they want and the box should stay alive.

If you have a proposal on what to do when both ram
and swap fill up and you need more memory, please
let me know.

Until then, we'll kill processes when we exhaust
both memory and swap ;)

cheers,

Rik
--
Executive summary of a recent Microsoft press release:
   "we are concerned about the GNU General Public License (GPL)"



Post by Jeffrey W. Bake » Sat, 04 Aug 2001 07:10:13



> Hmm.  What about the OOM process killer?  Shouldn't that kick in?

You'd think so!  Maybe it stepped in and killed the kernel :)  You can't
get any information out of the machine in this state because it isn't
classic thrashing: the disks aren't running, and regular processes are
getting run VERY infrequently.

-jwb


Post by Pavel Zaitse » Sat, 04 Aug 2001 07:20:12



> Gosh, here's an idea: if there is no memory left and someone malloc()s
> some more, have malloc() fail?  Kill the process that required the memory?
> I can't believe the attitude I am hearing.  Userland processes should be
> able to go around doing whatever the * they want and the box should stay
> alive.  Currently, if a userland process runs amok, the kernel goes into
> self-*ing mode for the rest of the week.

A userland process should be suspended once it reaches a certain rate of
swaps per second. It may resume after a short while. A userland process,
if written properly, will see that malloc is failing and inform the user.
If it is behaving badly, it will be suspended.
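
As a userspace approximation of that policy (not something the kernel
does), a watchdog could sample a process's major page fault count and
stop the process for a while when the rate gets too high. A sketch,
assuming major faults (field 12 of /proc/<pid>/stat per proc(5)) are a
usable proxy for swap activity; the helper names and the thresholds in
the usage comment are hypothetical:

```sh
#!/bin/sh
# majflt: read one /proc/<pid>/stat line on stdin and print the number
# of major page faults (field 12), i.e. faults that needed disk I/O.
# Everything up to the last ")" is stripped first, which removes the
# pid and command name, so field 12 becomes field 10.
majflt() {
    sed 's/^.*) //' | awk '{ printf "%.0f\n", $10 }'
}

# over_limit BEFORE AFTER SECONDS LIMIT
# Succeed if the fault count grew faster than LIMIT faults per second.
over_limit() {
    [ $(( ($2 - $1) / $3 )) -gt "$4" ]
}

# Hypothetical policy loop body for one process $PID:
#   a=`majflt < /proc/$PID/stat`; sleep 5; b=`majflt < /proc/$PID/stat`
#   if over_limit "$a" "$b" 5 100; then
#       kill -STOP "$PID"; sleep 30; kill -CONT "$PID"
#   fi
```

Of course the watchdog itself has to get scheduled for this to help.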
p.

--
Take out your recursive cannons and shoot!
110461387
http://www.veryComputer.com/
http://www.veryComputer.com/


Post by Jeffrey W. Bake » Sat, 04 Aug 2001 07:20:14



> If you have a proposal on what to do when both ram
> and swap fill up and you need more memory, please
> let me know.

> Until then, we'll kill processes when we exhaust
> both memory and swap ;)

I'm telling you that's not what happens.  When memory pressure gets really
high, the kernel takes all the CPU time and the box is completely useless.
Maybe the VM sorts itself out but after five minutes of barely responding,
I usually just power cycle the damn thing.  As I said, this isn't a
classic thrash because the swap disks only blip perhaps once every ten
seconds!

You don't have to go to extremes to observe this behavior.  Yesterday, I
had one box where kswapd used 100% of one CPU for 70 minutes straight,
while user processes all ran on the other CPU.  All RAM and half swap was
used, and I/O was heavy.  The machine had been up for 14 days.  I just
don't understand why kswapd needs to run and run and run and run and run
...
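
To put a number on how much CPU kswapd is burning, its accumulated CPU
time can be sampled from /proc/<pid>/stat. A sketch, assuming the field
layout documented in proc(5) (utime and stime are fields 14 and 15, in
clock ticks); cpu_ticks is a hypothetical helper name:

```sh
#!/bin/sh
# cpu_ticks: read one /proc/<pid>/stat line on stdin and print
# utime + stime, i.e. the total CPU time the process has used, in
# clock ticks. Everything up to the last ")" is stripped first so a
# command name containing spaces cannot shift the field numbers;
# after that, fields 14 and 15 are fields 12 and 13.
cpu_ticks() {
    sed 's/^.*) //' | awk '{ printf "%.0f\n", $12 + $13 }'
}

# Hypothetical usage: ticks used by kswapd during a 10 second window.
#   a=`cpu_ticks < /proc/$KSWAPD_PID/stat`; sleep 10
#   b=`cpu_ticks < /proc/$KSWAPD_PID/stat`; echo $((b - a))
```

A difference close to 10 times the tick rate means the thread spent
essentially the whole window on a CPU.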

I'm very familiar with what should happen on a Unix box when user
processes get huge.  On my FreeBSD and Solaris machines, everything goes
to shit for a few minutes and then it comes back.  Linux used to work that
way too, but I can't count on the comeback in 2.4.

-jwb


Post by Jakob Østergaard » Sat, 04 Aug 2001 07:30:12




> > You fill up mem and you fill up swap, and you complain the box is
> > acting funny ??

> The kernel should save whatever memory it needs to do its work.  It isn't
> my problem, from userland, to worry that I take the last page in the
> machine.  If the kernel needs pages to operate efficiently, it had better
> reserve them and not just hand them out until it locks up.

Sure, I agree,  to an extent.

If I start 50 CPU-bound jobs on my one-processor machine, I don't want the
kernel to tell me "no, you probably didn't mean to do that, I'll kill 40 of
your jobs so the others will go faster".    Same with resource usage - it's not
the kernel's job to implement that kind of policy - you have ulimits for
limiting your users, and if it's your own machine you should have enough
knowledge to know that deliberately using up every resource in the machine is
going to cause a resource shortage.

It is possible that there is a real problem and the kernel doesn't operate
efficiently in your case - I won't argue with that.   But you cannot expect
your system to perform very well if you use up all resources - maybe if you
hit a real bug in your case, and if someone fixes it, the kernel will operate
efficiently under those circumstances - but userspace will *not* operate
very well if you want the OOM killer to regularly kill "production" jobs etc.

At least, you must be doing another kind of production than what I'm used to
  :)


> > This is a clear case of "Doctor it hurts when I ..."  - Don't do it !

> > I'm interested in hearing how you would accomplish graceful
> > performance degradation in a situation where you have used up any
> > possible resource on the machine.  Transparent process back-tracking ?
> > What ?

> Gosh, here's an idea: if there is no memory left and someone malloc()s
> some more, have malloc() fail?

Actually, having malloc() fail is not that simple  :)
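
Part of the reason is overcommit: by default Linux can hand out more
address space than it can back, so malloc() may succeed and the shortage
only shows up later, when the pages are actually touched. A small sketch
that decodes the policy value in /proc/sys/vm/overcommit_memory; the
function name is made up, and mode 2 (strict accounting) is an assumption
about the kernel in use, since not every kernel version offers it:

```sh
#!/bin/sh
# describe_overcommit MODE
# Translate the number found in /proc/sys/vm/overcommit_memory into words.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit" ;;
        1) echo "always overcommit" ;;
        2) echo "strict accounting" ;;
        *) echo "unknown mode: $1" ;;
    esac
}

# Usage on a live system:
#   describe_overcommit "`cat /proc/sys/vm/overcommit_memory`"
```

Only under strict accounting can a program rely on malloc() returning
NULL as its out-of-memory signal.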

> Kill the process that required the memory?

Yes, you're perfectly right here.  If there's a critical shortage the OOM
killer should strike.

However - it should only strike the offending process (detecting that is hard
enough).  And it should not be possible for an attacker or untrusted user to
cause the OOM killer to kill anything but his own jobs.

> I can't believe the attitude I am hearing.  Userland processes should be
> able to go around doing whatever the * they want and the box should stay
> alive.

No offense was intended.

But if these things were really so simple, they would have been in the kernel
for ages.

I'm tempted to say:  Well your ideals seem to correlate well with the general
ideals of the LKML wrt. VM and OOM - it'd be great if you could post a patch to
fix it all properly     :)

We all want:  Perfect performance in both normal and resource-starved cases,
an OOM killer that strikes fairly when necessary and only when necessary,  a
userspace that's not just fool-proof but "very fool-proof", etc. etc.

>  Currently, if a userland process runs amok, the kernel goes into
> self-*ing mode for the rest of the week.

We know.

What is your suggestion for tackling this problem ?

--
................................................................

:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:

Post by BERECZ Szabolc » Sat, 04 Aug 2001 08:00:12




> > I'm telling you that's not what happens.  When memory pressure
> > gets really high, the kernel takes all the CPU time and the box
> > is completely useless. Maybe the VM sorts itself out but after
> > five minutes of barely responding, I usually just power cycle
> > the damn thing.  As I said, this isn't a classic thrash because
> > the swap disks only blip perhaps once every ten seconds!

this is exactly what's happening here. (as I wrote in the 'kswapd eats the
cpu without swap' mail)

the network works here too, but I can't do anything. I can't even switch
between VTs.
but it's really hard to reproduce.

Bye,
Szabi


Post by Jeremy Linton » Sat, 04 Aug 2001 08:50:06


> I'm telling you that's not what happens.  When memory pressure gets really
> high, the kernel takes all the CPU time and the box is completely useless.
> Maybe the VM sorts itself out but after five minutes of barely responding,
> I usually just power cycle the damn thing.  As I said, this isn't a
> classic thrash because the swap disks only blip perhaps once every ten
> seconds!

> You don't have to go to extremes to observe this behavior.  Yesterday, I
> had one box where kswapd used 100% of one CPU for 70 minutes straight,
> while user processes all ran on the other CPU.  All RAM and half swap was
> used, and I/O was heavy.  The machine had been up for 14 days.  I just
> don't understand why kswapd needs to run and run and run and run and run

    Actually, this sounds very similar to a problem I see on a somewhat
regular basis with a very memory hungry module running in the machine.
Basically the module eats up about a quarter of system memory. Then a user
space process comes along and uses a big virtual area (about 1.2x the total
physical memory in the box). If the user space process starts to write to a
lot of the virtual memory it owns, then the box basically slows down to the
point where it appears to have locked up, disk activity goes to 1 blip every
few seconds. On the other hand if the user process is doing mostly read
accesses to the memory space then everything is fine.

    I can't even break into gdb when the box is 'locked up' but before it
locks up I notice that there is massive contention for the pagemap_lru_lock
(been running a hand rolled kernel lock profiler) from two different
places... Take a look at these stack dumps.

Kswapd is in page_launder.......
#0  page_launder (gfp_mask=4, user=0) at vmscan.c:592
#1  0xc013d665 in do_try_to_free_pages (gfp_mask=4, user=0) at vmscan.c:935
#2  0xc013d73b in kswapd (unused=0x0) at vmscan.c:1016
#3  0xc01056b6 in kernel_thread (fn=0xddaa0848, arg=0xdfff5fbc, flags=9) at process.c:443
#4  0xddaa0844 in ?? ()

And my user space process is desperately trying to get a page from a page
fault!

#0  reclaim_page (zone=0xc0285ae8) at /usr/src/linux.2.4.4/include/asm/spinlock.h:102
#1  0xc013e474 in __alloc_pages_limit (zonelist=0xc02864dc, order=0, limit=1, direct_reclaim=1) at page_alloc.c:294
#2  0xc013e581 in __alloc_pages (zonelist=0xc02864dc, order=0) at page_alloc.c:383
#3  0xc012de43 in do_anonymous_page (mm=0xdfb88884, vma=0xdb45ce3c, page_table=0xc091e46c, write_access=1, addr=1506914312) at /usr/src/linux.2.4.4/include/linux/mm.h:392
#4  0xc012df40 in do_no_page (mm=0xdfb88884, vma=0xdb45ce3c, address=1506914312, write_access=1, page_table=0xc091e46c) at memory.c:1237
#5  0xc012e15b in handle_mm_fault (mm=0xdfb88884, vma=0xdb45ce3c, address=1506914312, write_access=1) at memory.c:1317
#6  0xc01163dc in do_page_fault (regs=0xdb2d3fc4, error_code=6) at fault.c:265
#7  0xc01078b0 in error_code () at af_packet.c:1881
#8  0x40040177 in ?? () at af_packet.c:1881

The spinlock counts are usually on the order of 1 million spins to get the
lock!!!!!!

                                                        jlinton


Post by jlna.. » Sat, 04 Aug 2001 22:10:08




> > I'm telling you that's not what happens.  When memory pressure
> > gets really high, the kernel takes all the CPU time and the box
> > is completely useless. Maybe the VM sorts itself out but after
> > five minutes of barely responding, I usually just power cycle
> > the damn thing.  As I said, this isn't a classic thrash because
> > the swap disks only blip perhaps once every ten seconds!

> What kind of workload are you running ?

> We could be dealing with some weird artifact of
> virtual page scanning here, or with a strange
> side effect of recent VM changes ...

Rik,
    FWIW, I am seeing this sort of thing too, though I am running a 2.4.5
kernel so I am a little out of date.  It's a large machine with 2G of RAM,
4G of swap, and 2 CPUs.  You don't have to actually use all the memory either.
When my process gets to about 1.5G and starts doing lots of file I/O, the
machine can just disappear for several minutes.  I will be sshed in and
I cannot even get my shell to give me a new prompt when I hit return.  It
will eventually sort itself out, but it might take 15 minutes.  I will try
to get a more recent kernel installed, but the machine is not under my
control, so I don't get to decide when that happens.
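
A cheap way to measure how long the machine "disappears" is to leave a
loop appending a timestamp to a file every second and look for holes in
the log afterwards. A sketch, assuming one epoch-seconds timestamp per
line; max_gap and the log path are hypothetical:

```sh
#!/bin/sh
# max_gap: read one epoch-seconds timestamp per line on stdin and print
# the largest gap, in seconds, between two consecutive lines. With a
# logger that writes once a second, a large gap means the logger was
# not being scheduled for about that long.
max_gap() {
    awk 'NR > 1 && $1 - prev > max { max = $1 - prev }
         { prev = $1 }
         END { printf "%.0f\n", max + 0 }'
}

# Hypothetical logger, left running in the background
# (date +%s is a GNU extension):
#   while sleep 1; do date +%s; done >> /tmp/heartbeat.log &
# Afterwards:
#   max_gap < /tmp/heartbeat.log
```

That gives a figure to report alongside the kernel version instead of
"several minutes".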

Thanks,

Jim