In AIX, a nonzero return from malloc does not guarantee that the
memory requested has been allocated to the process. Neither does
a successful launch of a program containing a large static buffer
guarantee that the buffer will be fully usable.
I was quite surprised to discover this in a recent conversation that
my program had with the malloc function. The dialog went something
like this:

    program: Hey malloc! I'd like a few megabytes please.
    malloc:  No problem, here you go -- it's at this address.
    program: Thanks very much. I'll just go and fill this memory, and --
    malloc:  Just kidding! I didn't really give you the memory. And by
             the way, you're going to die if you continue to use it. Ha!

It turns out that when my program went to fill in the memory, it was sent
a SIGKILL by AIX. This was clearly a bug of some kind, so I reported it.
I sent a demo program that just malloc'd as big a buffer as it could get,
and then started zeroing a byte every 4k. When I started up a couple of
them, they all died. Some got bus errors, others were sent SIGKILL.
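
A minimal sketch of a demo along those lines, assuming a 4k page size. The
names grab_buffer, touch_pages and PAGE_STRIDE are made up for illustration;
this is not the program that was actually sent to IBM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_STRIDE 4096  /* assumed page size, per the "every 4k" above */

/* Ask for `want` bytes, halving the request until malloc returns nonzero.
 * On success, *got holds the size that was actually obtained. */
char *grab_buffer(size_t want, size_t *got)
{
    while (want >= PAGE_STRIDE) {
        char *p = malloc(want);
        if (p != NULL) {
            *got = want;
            return p;
        }
        want /= 2;
    }
    *got = 0;
    return NULL;
}

/* Zero one byte in every page of the buffer so that each page has to be
 * backed by real storage. Returns the number of pages touched. On a system
 * that defers paging space allocation, this loop is where the process may
 * be killed, even though malloc already returned nonzero. */
size_t touch_pages(volatile char *buf, size_t len)
{
    size_t touched = 0;

    assert(buf != NULL);
    for (size_t off = 0; off < len; off += PAGE_STRIDE) {
        buf[off] = 0;
        touched++;
    }
    return touched;
}
```

A driver would call grab_buffer with a very large request and pass the result
to touch_pages; starting several copies at once is what provokes the failure
described above.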
IBM's response was that this is working as designed!
IBM said that they don't allocate the paging space until it's needed, in
order to accommodate programs which ask for large amounts of memory that
they never use (some nonsense about sparse arrays). I said that I need
to know if the memory is really there before I start using it. They
referred me to some sample code. If it weren't so sad, it would be
funny.
The sample code contained a malloc wrapper that returns nonzero only if
the memory is actually there. It sets up a handler for SIGDANGER, then
calls malloc. If malloc returns nonzero (a virtual certainty), the wrapper
proceeds to touch the pages of the buffer. If SIGDANGER arrives, the
handler longjmps to code which "untouches" the memory, frees the
successfully-malloc'd-but-not-really-there buffer, and causes the malloc
wrapper to return zero.
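
The shape of that wrapper can be sketched roughly as follows. This is a
reconstruction from the description above, not IBM's sample code: the names
checked_malloc, on_danger, danger_env and PAGE_STRIDE are hypothetical, the
4k page size is an assumption, and the "untouch" step is left out because the
call it used is not given here. SIGDANGER is AIX-specific, so the handler is
compiled only where the signal is defined; elsewhere the wrapper just touches
the pages.

```c
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_STRIDE 4096  /* assumed page size */

static jmp_buf danger_env;

#ifdef SIGDANGER
/* Handler: abandon the page-touching loop and jump back into the wrapper. */
static void on_danger(int sig)
{
    (void)sig;
    longjmp(danger_env, 1);
}
#endif

/* Hypothetical wrapper: returns nonzero only if every page of the buffer
 * could be touched without SIGDANGER arriving. */
void *checked_malloc(size_t len)
{
    char *volatile buf = NULL;  /* volatile: value must survive the longjmp */
#ifdef SIGDANGER
    void (*old)(int) = signal(SIGDANGER, on_danger);
#endif

    assert(len > 0);
    if (setjmp(danger_env) != 0) {
        /* Arrived here from on_danger: the memory is not really there.
         * The "untouch" step from the description is omitted (assumption). */
        free(buf);
        buf = NULL;
    } else {
        buf = malloc(len);
        if (buf != NULL) {
            /* Touch one byte per page to force the storage to be allocated. */
            for (size_t off = 0; off < len; off += PAGE_STRIDE)
                buf[off] = 0;
        }
    }

#ifdef SIGDANGER
    if (old != SIG_ERR)
        signal(SIGDANGER, old);  /* restore the previous disposition */
#endif
    return buf;
}
```

Two caveats apply to this sketch. Whether longjmp out of a handler restores
the signal mask varies by platform, so a POSIX program may prefer sigsetjmp
and siglongjmp. And because the handler is installed before malloc is called,
a SIGDANGER that arrives while malloc itself is running would jump out of the
allocator in mid-operation.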
I have questions:
1) Has anyone seen a system where static memory may not really be
there, or where a nonzero malloc doesn't guarantee the successful
usage of the memory?
2) Has anyone heard of SIGDANGER before?
3) Read the POSIX standard for malloc from a "legal" standpoint.
If IBM claims POSIX compliance, can I use this as a weapon?
4) Even if I use this malloc wrapper everywhere in my own code,
how do I deal with third-party code I purchase that calls the
unwrapped malloc?