2.4.16 memory badness (reproducible)

Post by Hugh Dickins » Thu, 13 Dec 2001 04:10:06

> So I don't know if it's a symptom or a cause, but modify_ldt seems to be
> triggering the problem. Not being a kernel hacker, I leave the analysis
> of this to those who are.

> home[1029]:/home/orf% free
>              total       used       free     shared    buffers     cached
> Mem:       1029772     967096      62676          0     443988      98312
> -/+ buffers/cache:     424796     604976
> Swap:      2064344          0    2064344

> modify_ldt(0x1, 0xbffff1fc, 0x10)       = -1 ENOMEM (Cannot allocate memory)

I believe this error comes not from a (genuine or mistaken) shortage
of free memory, but from a shortage or fragmentation of vmalloc's virtual
address space.  Does the patch below (against 2.4.17-pre4-aa1, since I think
that's what you tried last; easily adaptable to other trees), which doubles
vmalloc's address space (on your 1GB or larger machine), make any difference?
Perhaps there's a vmalloc leak and this will only delay the error.


--- 1704aa1/arch/i386/kernel/setup.c    Tue Dec 11 15:22:53 2001

  * 128MB for vmalloc and initrd
-#define VMALLOC_RESERVE        (unsigned long)(128 << 20)
+#define VMALLOC_RESERVE        (unsigned long)(256 << 20)
 #define MAXMEM         (unsigned long)(-PAGE_OFFSET-VMALLOC_RESERVE)
 #define ORDER_DOWN(x)  ((x >> (MAX_ORDER-1)) << (MAX_ORDER-1))

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


2.4.16 memory badness (reproducible)

Post by Holger Lubitz » Wed, 19 Dec 2001 23:40:06

Andrea Arcangeli proclaimed:

> He always gets vmalloc failures, which is way too suspect. If the VM
> memory balancing was the culprit he should get failures with all the
> other allocations too. So it has to be a problem with a shortage of the
> address space available to vmalloc, not a problem with the page
> allocator.

Leigh pointed me to your post in reply to another thread (modify_ldt
failing on highmem machine).

Is there any special vmalloc handling on highmem kernels? I only run
into the problem if I am using high memory support in the kernel. I
haven't been able to reproduce the problem with 896M or less, which
strikes me as slightly odd. Why does _more_ memory trigger "no memory"?

The problem is indeed not VM specific: the last -ac kernel shows the
problem too (and that one still has the old VM, doesn't it?)



1. Kernel "memory leak"? [2.4.16]

Hi, I have a 2.4.16-based SUSE system, which is running the
default SUSE 2.4.16 SMP-64GB kernel build, with:
        - 756M physical RAM
        - SMP (2x PIII)
        - Oracle 9.0.1
        - Oracle I/O mostly through raw partitions on IDE drives
        - Oracle configured to buffer heavily

Every night, an Oracle job runs for about two hours that hammers
the drives and memory quite thoroughly.  Oracle finishes and the
box mostly quiesces at that point.

For the last several nights, I've noticed the swap usage steadily
going up.  Thinking that Oracle was just holding buffers open,
I cycled all the Oracle processes, but that did not fix it.  In fact,
according to top and ps, there wasn't anything very memory-intensive
on the box, yet about 250M of non-buffer, non-cache memory was being
used by processes.  By this figure I mean used memory -
buffers - cache, i.e. 111,176KB in the report below:

             total       used       free     shared    buffers     cached
Mem:        772920     660940     111980          0       6972     542792
-/+ buffers/cache:     111176     661744
Swap:       774640          0     774640

(In this report I see a lot of cache being used; I'm fine with that, I
just did a lot of file I/O).

So I got on my serial console and did an "init 1", which shut down all user
processes except a single bash.  I also turned off swap.  I did a "free",
and hmm, the "used" figure (used - buffers - cache) was still over
200MB, but the box wasn't doing a darn thing!  I wish I had saved that
report.  The box had been up about 10 days at this point.  Several hundred MB
were used for cache, but that doesn't bother me as much as the massive
memory usage while running next to nothing.

Is this abnormal?

Well, I went back to init 5 and rebuilt my kernel with the same config,
except that I turned off 64GB support (PAE), as I don't need it with only
756M.  Installed the new kernel, and rebooted...

Just to experiment, I rebooted into run level 1, and observed the
"used - buffers - cache" figure to be about 9MB.  Nice and lean, sounds good.
I then bounced over to run level 5, built a kernel and hammered at
Oracle, then went back to run level 1, and watched the "used - buffers -
cache" figure climb by several MB.  Repeating the above steps bumped
that figure up by several MB each time, even though only one
shell was running.

Is the kernel somehow leaking memory?  I understand using the buffer
cache aggressively as a policy decision, but reporting memory as used
when in fact it can't be accounted for seems wrong.

Is this a known problem, or a problem with my understanding of what is
going on?

