hammer: MAP_32BIT


Post by Ulrich Drepper » Sat, 10 May 2003 09:40:08




To allocate stacks for the threads in nptl we currently use MAP_32BIT to
make sure we get addresses below 4GB, which makes context switches
faster.  But once that part of the address space is used up we have to
fall back to not using the flag.  This means we have to make two mmap()
calls: one with MAP_32BIT and, if it fails, another one without.
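
A minimal sketch of the two-call pattern described above. The helper name
alloc_thread_stack() is hypothetical and this is not the actual nptl code;
MAP_32BIT only exists on x86-64 Linux, so it is assumed to degrade to a
no-op flag elsewhere.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* MAP_32BIT is specific to x86-64 Linux; treat it as a no-op elsewhere. */
#ifndef MAP_32BIT
#define MAP_32BIT 0
#endif

/* Hypothetical helper (not the nptl implementation): try a low mapping
 * first, then fall back to an unrestricted one.
 * Returns MAP_FAILED if both attempts fail. */
void *alloc_thread_stack(size_t size)
{
    const int prot = PROT_READ | PROT_WRITE;
    const int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *mem;

    assert(size > 0);

    /* First attempt: restrict the search to the low address range. */
    mem = mmap(NULL, size, prot, flags | MAP_32BIT, -1, 0);
    if (mem != MAP_FAILED)
        return mem;

    /* Second attempt: the low range is used up, accept any address. */
    return mmap(NULL, size, prot, flags, -1, 0);
}
```

The second mmap() is only reached after the first one has failed, which is
the expensive case the message complains about.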

It would be much better if there were also a MAP_32PREFER flag with
the appropriate semantics.  The failing mmap() calls seem to be quite
expensive, so programs with many threads are punished heavily.

--
--------------.                        ,-.            444 Castro Street
Ulrich Drepper \    ,-----------------'   \ Mountain View, CA 94041 USA
Red Hat         `--' drepper at redhat.com `---------------------------

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 


Post by Andi Kleen » Sat, 10 May 2003 12:00:34



> It would be much better if there were also a MAP_32PREFER flag with
> the appropriate semantics.  The failing mmap() calls seem to be quite
> expensive, so programs with many threads are punished heavily.

That's just an inadequate data structure. It does a linear search of the
VMAs, and you probably have a lot of them. Before you add kludges like this,
better fix the data structure for fast free-space lookup.

MAP_32BIT currently limits mappings to the first 2GB only. That's needed
because most programs use it to allocate modules for the small code model,
and that only supports 2GB (the poster child for that is the X server). For
your application 4GB would be better, but adding another MAP_32BIT_4GB or so
would be quite ugly. I considered making the address where mmap starts
searching (TASK_UNMAPPED_BASE) settable using a prctl.

In some vendor kernels it's already settable via /proc/pid/mapped_base, but
that is quite costly to change. A settable base would probably give you the
best of both: just set it to a low value for the thread stacks and then reset
it to the default.

I guess that would be the better solution for your stacks.

-Andi

Post by mi.. » Sat, 10 May 2003 13:30:15


Andi Kleen writes:


 > > It would be much better if there were also a MAP_32PREFER flag with
 > > the appropriate semantics.  The failing mmap() calls seem to be quite
 > > expensive, so programs with many threads are punished heavily.
 >
 > That's just an inadequate data structure. It does a linear search of the
 > VMAs, and you probably have a lot of them. Before you add kludges like this,
 > better fix the data structure for fast free-space lookup.
 >
 > MAP_32BIT currently limits to the first 2GB only. That's needed because
 > most programs use it to allocate modules for the small code model and that
 > only supports 2GB (poster child for that is the X server) But for your
 > application 4GB would be better. But adding another MAP_32BIT_4GB or so
 > would be quite ugly. I considered making the address where mmap starts searching
 > (TASK_UNMAPPED_BASE) settable using a prctl.

I have a potential use for mmap()ing in the low 4GB on x86_64.
Sounds like your MAP_32BIT really is MAP_31BIT :-( which is too limiting.
What about a more generic way of indicating which parts of the address
space one wants? The simplest that would work for me is a single byte
'nrbits' specifying the target address space as [0 .. 2^nrbits-1].
This could be specified on a per-mmap() basis or as a settable process attribute.
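
To make the proposed 'nrbits' semantics concrete, here is a small sketch of
the range check such a limit implies. The function fits_in_nrbits() is a
hypothetical illustration, not an existing kernel or libc interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: true if the byte range [addr, addr + len) lies
 * entirely inside the target address space [0 .. 2^nrbits - 1]. */
bool fits_in_nrbits(uint64_t addr, uint64_t len, unsigned nrbits)
{
    uint64_t last;

    assert(len > 0);
    assert(nrbits >= 1 && nrbits <= 64);

    last = addr + (len - 1);            /* address of the last byte */
    if (last < addr)
        return false;                   /* range wraps around 2^64 */
    if (nrbits == 64)
        return true;                    /* the whole space is allowed */
    return last <= (UINT64_C(1) << nrbits) - 1;
}
```

With nrbits = 31 this describes what MAP_32BIT is reported to do in this
thread, and with nrbits = 32 the full low 4GB.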

/Mikael

Post by Andi Kleen » Sat, 10 May 2003 13:50:10



> I have a potential use for mmap()ing in the low 4GB on x86_64.

Just use MAP_32BIT

> Sounds like your MAP_32BIT really is MAP_31BIT :-( which is too limiting.
> What about a more generic way of indicating which parts of the address
> space one wants? The simplest that would work for me is a single byte
> 'nrbits' specifying the target address space as [0 .. 2^nrbits-1].
> This could be specified on a per-mmap() basis or as a settable process attribute.

On x86-64 an mmap extension for that would be fine, but on i386 you get
problems because mmap64() already maxes out the argument limit and you
cannot add more.

You could only implement it with a structure in memory pointed to by an
argument, which would be ugly.

prctl is probably better. You really want [start; end], right?

Pity that task_struct is already so bloated, so every new entry hurts.

-Andi


Post by Andi Kleen » Sat, 10 May 2003 14:20:11



> Andi Kleen writes:


>  > > I have a potential use for mmap()ing in the low 4GB on x86_64.

>  > Just use MAP_32BIT

> Will that be corrected to use the full 4GB space? 2GB is too small.

That would break the X server.

But what you can do is to use mmap(0x1000, ....) and free the memory
again if the result lies above 4GB. If you pass a nonzero value
as the first argument but not MAP_FIXED, it'll use the address argument
as the starting point for the search.
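
A rough sketch of that workaround. The helper name map_below_4gb() is made
up for illustration, and the hint behaviour is the one described in this
message; other kernel versions may treat a non-MAP_FIXED address hint
differently, so the result is always checked.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define LIMIT_4GB (UINT64_C(1) << 32)

/* Hypothetical helper: ask for a mapping near a low hint address and give
 * it back if it does not end at or below 4GB.
 * Returns NULL if no suitable mapping was obtained. */
void *map_below_4gb(size_t len)
{
    void *p;

    assert(len > 0);

    /* Without MAP_FIXED the first argument is only a hint. */
    p = mmap((void *)0x1000, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Reject and release the mapping if any part lies above 4GB. */
    if ((uint64_t)(uintptr_t)p + len > LIMIT_4GB) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```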

-Andi

Post by H. Peter Anvin » Sat, 10 May 2003 19:40:08




In newsgroup: linux.dev.kernel


> MAP_32BIT currently limits to the first 2GB only. That's needed because
> most programs use it to allocate modules for the small code model and that
> only supports 2GB (poster child for that is the X server) But for your
> application 4GB would be better. But adding another MAP_32BIT_4GB or so
> would be quite ugly. I considered making the address where mmap starts searching
> (TASK_UNMAPPED_BASE) settable using a prctl.

MAP_31BIT would have been a better name...

        -hpa
--

"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

Post by Ulrich Drepper » Sat, 10 May 2003 19:50:10




> That's just an inadequate data structure. It does a linear search of the
> VMAs, and you probably have a lot of them. Before you add kludges like this,
> better fix the data structure for fast free-space lookup.

If you mean the code in arch_get_unmapped_area(), yes, this needs
fixing.  In fact, Ingo already has a patch which brings the
performance of thread creation back to what we had in September/October.

> In some vendor kernels it's already in /proc/pid/mapped_base, but that is
> quite costly to change. That would probably give you the best of both, Just
> set it to a low value for the thread stacks and then reset it to the default.

> I guess that would be the better solution for your stacks.

Are you sure this is the best solution?  It means the mmap region for
restricted 31/32-bit addresses and the one for normal, unrestricted
mappings are contiguous.  This removes a lot of freedom in deciding where
the unrestricted mappings are best located, and it would give programs
using threads a very different memory layout.  Not that it should
make any difference; but I can already hear /them/ scream that this
breaks applications.

My kernel-uninformed opinion would be to keep the settings separate.

Oh, and please rename MAP_32BIT to MAP_31BIT.  This will save nerves on
all sides.


Post by H. Peter Anvin » Sat, 10 May 2003 20:20:11




In newsgroup: linux.dev.kernel


> > I have a potential use for mmap()ing in the low 4GB on x86_64.

> Just use MAP_32BIT

> > Sounds like your MAP_32BIT really is MAP_31BIT :-( which is too limiting.
> > What about a more generic way of indicating which parts of the address
> > space one wants? The simplest that would work for me is a single byte
> > 'nrbits' specifying the target address space as [0 .. 2^nrbits-1].
> > This could be specified on a per-mmap() basis or as a settable process attribute.

> On x86-64 an mmap extension for that would be fine, but on i386 you get
> problems because mmap64() already maxes out the argument limit and you
> cannot add more.

How about this: since the address argument is basically unused anyway
unless MAP_FIXED is set, how about a MAP_MAXADDR which interprets the
address argument as the highest permissible address (or lowest
nonpermissible address)?

        -hpa

Post by Ulrich Drepper » Sat, 10 May 2003 21:30:21




> How about this: since the address argument is basically unused anyway
> unless MAP_FIXED is set, how about a MAP_MAXADDR which interprets the
> address argument as the highest permissible address (or lowest
> nonpermissible address)?

You miss the point of my initial mail: I need a way to say "preferably
a 32-bit address, otherwise give me what you have".  MAP_32BIT already
provides a way to require 32-bit addresses.


Post by H. Peter Anvin » Sat, 10 May 2003 23:00:18





>>How about this: since the address argument is basically unused anyway
>>unless MAP_FIXED is set, how about a MAP_MAXADDR which interprets the
>>address argument as the highest permissible address (or lowest
>>nonpermissible address)?

> You miss the point of my initial mail: I need a way to say "preferably
> a 32-bit address, otherwise give me what you have".  MAP_32BIT already
> provides a way to require 32-bit addresses.

No, it requires 31-bit addresses, and there was a discussion about how
some things need 31-bit and some 32-bit addresses.  There might also be
a need for 39-bit addresses, to be compatible with Linux 2.4.

MAP_MAXADDR_ADVISORY?

        -hpa


Post by Ulrich Drepper » Sat, 10 May 2003 23:50:10




> No, it requires 31-bit addresses, and there was a discussion about how
> some things need 31-bit and some 32-bit addresses.

That's completely irrelevant to my point.  Whether MAP_32BIT actually
has a 31-bit limit or not doesn't matter; either way it is limited in
the mmap blocks it can return.

The only thing I care about is to have a hint and not a fixed
requirement for mmap().  All your proposals completely ignored this.


Post by H. Peter Anvin » Sun, 11 May 2003 00:20:05





>>No, it requires 31-bit addresses, and there was a discussion about how
>>some things need 31-bit and some 32-bit addresses.

> That's completely irrelevant to my point.  Whether MAP_32BIT actually
> has a 31 bit limit or not doesn't matter, it's limited as well in the
> possible mmap blocks it can return.

> The only thing I care about is to have a hint and not a fixed
> requirement for mmap().  All your proposals completely ignored this.

Yes, but this is irrelevant to *MY* point... this discussion spawned a
side discussion, and somehow you're upset that it's not addressing your
concern but a different one... seems a bit ridiculous!

Anyway, I already posted that if we're adding MAP_MAXADDR we could also
add MAP_MAXADDR_ADVISORY or something similar to that.  On the other
hand, how big of a performance issue is it really to call mmap() again
in the failure scenario *only*?

        -hpa


Post by Timothy Mille » Sun, 11 May 2003 00:20:09





>>No, it requires 31-bit addresses, and there was a discussion about how
>>some things need 31-bit and some 32-bit addresses.

> That's completely irrelevant to my point.  Whether MAP_32BIT actually
> has a 31 bit limit or not doesn't matter, it's limited as well in the
> possible mmap blocks it can return.

> The only thing I care about is to have a hint and not a fixed
> requirement for mmap().  All your proposals completely ignored this.

If your program is capable of handling an address with more than 32
bits, what point is there giving a hint?  Either your program can handle
64-bit pointers or it cannot.  Any program flexible enough to handle
either size dynamically would expend enough overhead checking that it
would be worse than if it just made a hard choice.


Post by Ulrich Drepper » Sun, 11 May 2003 00:30:17




> On the other
> hand, how big of a performance issue is it really to call mmap() again
> in the failure scenario *only*?

Just look at the code, it's very expensive.  At the moment the mmap code
has to look sequentially at the VMAs in question.  If it fails, it means
it walked the entire data structure without success.  Ingo's patch does
not address this; it just makes successful allocation usually fast again.
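
To illustrate why the failing case is the costly one, here is a simplified
first-fit search over a sorted list of mapped regions. It is a toy model
with made-up names (struct region, find_gap), not the kernel's
arch_get_unmapped_area(); it only shows that a failed search below a limit
ends up visiting every region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a VMA: the half-open range [start, end). */
struct region {
    uint64_t start;
    uint64_t end;
};

/* First-fit search over regions that are sorted by address and do not
 * overlap. Returns the lowest address >= base where len bytes fit below
 * limit, or UINT64_MAX if there is no such gap. The number of regions
 * examined is stored in *visited. */
uint64_t find_gap(const struct region *r, size_t n, uint64_t base,
                  uint64_t limit, uint64_t len, size_t *visited)
{
    uint64_t addr = base;
    size_t i;

    assert(len > 0 && base <= limit);

    for (i = 0; i < n; i++) {
        if (addr > limit || limit - addr < len)
            break;                  /* nothing below the limit can fit */
        if (r[i].end <= addr)
            continue;               /* region lies below the candidate */
        if (r[i].start >= addr && r[i].start - addr >= len)
            break;                  /* the gap before this region fits */
        addr = r[i].end;            /* skip past the region */
    }
    *visited = i;

    if (addr > limit || limit - addr < len)
        return UINT64_MAX;
    return addr;
}
```

In this model a successful lookup can stop early, while an unsuccessful one
has stepped through all n regions before reporting failure.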


Post by H. Peter Anvin » Sun, 11 May 2003 00:30:18



> If your program is capable of handling an address with more than 32
> bits, what point is there giving a hint?  Either your program can handle
> 64-bit pointers or it cannot.  Any program flexible enough to handle
> either size dynamically would expend enough overhead checking that it
> would be worse than if it just made a hard choice.

The purpose is that there is a slight task-switching speed advantage if
the address is in the bottom 4 GB.  Since this affects every process,
and most processes use very little TLS, this is worthwhile.

This is fundamentally due to a K8 design flaw.

        -hpa


 
 
 

1. Hammer thread fixes

It's incorrect like I told you last time. arg 4 is in r10. Linus please don't
apply.

The clone prototype is

        int clone(int flags, unsigned long newsp, void *parent_tid, void *child_tid) ;

        rax: __NR_clone
        rdi: flags
        rsi: newsp
        rdx: parent_tid
        r10: child_tid

See Appendix A of the x86-64 ABI for details. The kernel ABI is different
from the user space ABI because of the SYSCALL clobbers: the instruction
overwrites rcx, so the fourth argument is passed in r10 instead.
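
A small sketch of a raw four-argument system call on x86-64 Linux that puts
the fourth argument in r10, as listed above. The wrapper name raw_syscall4()
is hypothetical; on other architectures the sketch simply defers to the libc
syscall() function, which reports errors differently.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: issue a system call with four arguments.
 * On x86-64 the raw path returns the kernel's value directly (a negative
 * errno on failure); the fallback path returns -1 and sets errno. */
long raw_syscall4(long nr, long a1, long a2, long a3, long a4)
{
    assert(nr >= 0);
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    /* Fourth argument goes in r10, not rcx as in the user space ABI. */
    register long arg4 __asm__("r10") = a4;

    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"(nr), "D"(a1), "S"(a2), "d"(a3), "r"(arg4)
                      : "rcx", "r11", "memory"); /* SYSCALL clobbers these */
    return ret;
#else
    return syscall(nr, a1, a2, a3, a4);
#endif
}
```

For example, rt_sigprocmask takes four arguments, so reading the current
signal mask through this wrapper exercises the r10 slot with the sigset size.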

For exit_group please wait for my next merge.

-Andi

2. OPTi Chipset problem known?

3. someone hammer my server, please help!

4. disk media free space under Veritas

5. Anti-hammering with IPtables?

6. I seem to have broken KDE 1.1.1

7. traceroute port hammering

8. pin assignments on thinkpad 850

9. a person is hammering my port 23

10. FTP Server is being hammered

11. Anti-hammering with Proftpd?

12. Hammer - aka AIX v5

13. Hammer m/b