Loading an executable from within an executable


Post by John-David Wellm » Tue, 13 Jun 2000 04:00:00



Hi,
    I am trying to port a program I wrote on an IBM AIX operating system
to a linux system, and I have a bit of a problem I have yet to find a
solution to.  In this program, I use an AIX Base Operating System subroutine
called load to load an executable program into memory.  This load subroutine
effectively invokes the linker/loader and brings the program into the
data memory space of the current process, returning a pointer to the
beginning of the program text area (more or less).  This lets one then
examine the loaded program image in memory, etc.

    The problem I am having is that I cannot find a similar facility within
linux, and I'm getting a bit stumped as to how one might be able to do this.
So, the question: is anyone familiar with a facility, etc. that allows one
executing process to load an executable program into memory, _AND_ to
automatically resolve the relocatable linkages in the file, etc. as if the
program were loaded by the loader for execution?  If not, can anyone tell
me any way to even load an executable program image into memory that would
even approximate this?

        Thanks for any pointers,

                J-D

--
J-D Wellman   IBM Research  :  wellman at watson <dot> ibm <dot> com
--
------------------------------------------------------------------------------

IBM T. J. Watson Research Center                phone: 914-945-2523
P.O. Box 218, Yorktown Hts, NY, 10598           fax:   914-945-4469

 
 
 


Post by John Reise » Tue, 13 Jun 2000 04:00:00


man dlopen
(but the usual conflict of fixed non-relocatable addresses may require
the use of .so "shared library" modules instead of a.elf "executable"
modules.)

--


 
 
 


Post by Norman Blac » Tue, 13 Jun 2000 04:00:00


> man dlopen
> (but the usual conflict of fixed non-relocatable addresses may require
> the use of .so "shared library" modules instead of a.elf "executable"
> modules.)

While executables by default are linked without relocation information, I
believe you can link an executable with relocation information, and the
loader should be able to move the image if necessary.

--
Norman Black
Stony Brook Software
the reply, fubar => ix.netcom

 
 
 


Post by John-David Wellm » Thu, 15 Jun 2000 04:00:00


  Okay, dlopen looked to be pretty helpful, until I discovered that the
symbols one can find (via dlsym calls) all have to be exported symbols.
The load subroutine call in AIX simply loads the executable and points
to the first loaded text address.  The dlopen call seems to load the
executable to somewhere, and return a cryptic handle that I cannot figure out
how to use to get back to something with physical meaning, unless I
recompile the object I want to dlopen and export some symbol I'd like
to dlsym.  I don't think this is possible as I don't have access to the
code to recompile this executable, and I have not found a run-time relinking
routine in the linux world that would allow me to do this.  

  Thus, does anyone know how to get either the address at which the executable
is loaded (from dlopen, I think I can then work around other problems) or
to get a non-exported symbol, OR to relink an executable to make something
(e.g. main) an exported symbol so dlopen and dlsym will be helpful?

                Thanks,


 
 
 


Post by John Reise » Thu, 15 Jun 2000 04:00:00


The return type of dlopen() is really Elf32_Ehdr *.
See  /usr/include/elf.h .

--

 
 
 


Post by Norman Blac » Thu, 15 Jun 2000 04:00:00


Details...

Get the address of what, specifically, and how do you expect to use it?
A shared object has init and termination code, which are just normal
procedures, and after init the shared object just sits there waiting for some
function within it to be called. An executable starts and does not stop
until it terminates itself. Given the above, what does the system loader
do when loading an executable via dlopen? I do not know. Does anyone else?
Calling something not exported is dangerous, since the author of the shared
object can do anything they want without regard to standards beyond the
public interface to the shared object.
If the symbol you want is not exported (that is, not in the .dynsym section),
then if the shared object has a symbol table the symbol will most likely be
there. This means you will have to parse the executable image file and adjust
the symbol address retrieved from the symbol table by the load address of the
image loaded by dlopen.

--
Norman Black
Stony Brook Software
the reply, fubar => ix.netcom



 
 
 


Post by Jerry Peter » Thu, 15 Jun 2000 04:00:00


This sounds like an MVS facility. Any executable can be loaded into the
address space (process). You get the load address and the entry point address,
IIRC. It's very useful for dynamically loading subroutines which may not
be needed for every execution, or for executing a system utility, like the
assembler or a compiler, on the fly. Another use is to load a table which
can be changed without rebuilding the entire application. MVS lacks
anything remotely like make, so these tend to be important uses.

I think he needs to re-design what he's doing to fit into the shared
library structure of Linux.

        Jerry



 
 
 


Post by John-David Wellm » Tue, 20 Jun 2000 04:00:00




>The return type of dlopen() is really Elf32_Ehdr *.
>See  /usr/include/elf.h .

>--


Is this really true?  I tried the following simple program:
#include <stdio.h>
#include <string.h>
#include <dlfcn.h>
#include <elf.h>
#include <errno.h>

int main(int argc, char * argv[])
{
  char * prog_name = "test";

  fprintf(stderr, "%s: Doing the dlopen call\n", argv[0]);
  {
    void * prog_dlptr = dlopen(prog_name, RTLD_NOW);
    /* print the returned handle as a hex number */
    fprintf(stderr, "%s: Did the dlopen call : 0x%08lx\n", argv[0], (unsigned long) prog_dlptr);
    /* report errno after the call */
    fprintf(stderr, "%s: the dlopen of program %s resulted in errno=%d (\"%s\")\n", argv[0], prog_name, errno, strerror(errno));
  }
  return (0);
}

And I get the following result (note there is a file called test in the
appropriate directory):
>./dltest

./dltest: Doing the dlopen call
./dltest: Did the dlopen call : 0x00000000
./dltest: the dlopen of program test resulted in errno=0 ("Success")

From this, I am getting a return value of 0x00000000 (a "NULL" value) but
the errno is set to errno 0 ("Success") so I don't see how this null value
can be a valid Elf32_Ehdr pointer.

 
 
 


Post by Nate Eldredg » Tue, 20 Jun 2000 04:00:00





> >The return type of dlopen() is really Elf32_Ehdr *.
> >See  /usr/include/elf.h .

> >--

> Is this really true?  I tried the following simple program:
[snip]
> And I get the following result (note there is a file called test in the
> appropriate directory):
> >./dltest
> ./dltest: Doing the dlopen call
> ./dltest: Did the dlopen call : 0x00000000
> ./dltest: the dlopen of program test resulted in errno=0 ("Success")

> From this, I am getting a return value of 0x00000000 (a "NULL" value) but
> the errno is set to errno 0 ("Success") so I don't see how this null value
> can be a valid Elf32_Ehdr pointer.

Mm, no, I think something is wrong.  dlopen returns NULL on failure,
so you didn't really get a handle.

errno is not an appropriate way to check for errors here, I believe.
The man page says that errors from dl* functions should be retrieved
with `dlerror', and says nothing about errno.

--

Nate Eldredge

 
 
 


Post by John-David Wellm » Wed, 28 Jun 2000 04:00:00







>> >The return type of dlopen() is really Elf32_Ehdr *.
>> >See  /usr/include/elf.h .

>> Is this really true?  I tried the following simple program:
>[snip]

>Mm, no, I think something is wrong.  dlopen returns NULL on failure,
>so you didn't really get a handle.

>errno is not an appropriate way to check for errors here, I believe.
>The man page says that errors from dl* functions should be retrieved
>with `dlerror', and says nothing about errno.

You are completely correct, I added the dlerror call, and I get an
appropriate error message.  The problem seems to still be that the
dlopen will only open a shared module, i.e. it will not open an executable
program file.  I'd really like to open/load an executable object file
(in ELF format no less) and be returned the address of the main
routine (or even _start) for that program.  I would like to be able
to do this without altering the actual executable in any way, and
without access to the executable's source code.  I would be willing to
run-time link in some additional code, but that is about the limit
of modifications I can allow.  Does anyone have further suggestions as
to how one might do this?


 
 
 


Post by John Reise » Wed, 28 Jun 2000 04:00:00


> This would appear problematic, as executable files are not relocatable,
> and the address to which they'd need to be loaded is already occupied
> by your application.  (Well, you could move your app out of the way.)
> I don't think there's any official way to do that.

The official way is explained in the documentation for Linker Scripts:
http://www.cygnus.com/pubs/gnupro/5_ut/b_Usingld/ldLinker_scripts.html
and also there are shorthands -Ttext -Tdata -Tbss for /bin/ld.
See the manual page.


> What do you actually want to achieve?  Does the code need to be
> actually executable, or do you just want to access it?  In the
> latter case, you could use the BFD library from the GNU binutils
> package, which contains various routines for accessing and
> manipulating ELF files ...

YMMV.  I found that it was easier and sooner-to-completion to use
<elf.h>.  BFD is only a partial interface to ELF, and I could not
find a good explanation of the degree of partiality and the
assumptions made by BFD.

--

 
 
 


Post by Ulrich Weiga » Thu, 29 Jun 2000 04:00:00



>You are completely correct, I added the dlerror call, and I get an
>appropriate error message.  The problem seems to still be that the
>dlopen will only open a shared module, i.e. it will not open an executable
>program file.  I'd really like to open/load an executable object file
>(in ELF format no less) and be returned the address of the main
>routine (or even _start) for that program.  I would like to be able
>to do this without altering the actual executable in any way, and
>without access to the executable's source code.  I would be willing to
>run-time link in some additional code, but that is about the limit
>of modifications I can allow.  Does anyone have further suggestions as
>to how one might do this?

This would appear problematic, as executable files are not relocatable,
and the address to which they'd need to be loaded is already occupied
by your application.  (Well, you could move your app out of the way.)
I don't think there's any official way to do that.

What do you actually want to achieve?  Does the code need to be
actually executable, or do you just want to access it?  In the
latter case, you could use the BFD library from the GNU binutils
package, which contains various routines for accessing and
manipulating ELF files ...

--
  Dr. Ulrich Weigand

 
 
 


Post by Jan Pantelt » Thu, 29 Jun 2000 04:00:00


>This would appear problematic, as executable files are not relocatable,

Code is relocatable; there is no fixed execution address at all, like, for
example, the 100H in CP/M, or whatever.

>and the address to which they'd need to be loaded is already occupied
>by your application.  (Well, you could move your app out of the way.)

What 'fixed' address are you talking about?
The code can reside anywhere in RAM (subject to page position, alignment, etc.).
Remember we are talking about virtual addresses and memory management here.

>I don't think there's any official way to do that.

I think you are wrong, Dr. in what?
Jan
 
 
 


Post by Norman Blac » Thu, 29 Jun 2000 04:00:00


Let me preface this by saying I have ported our linker to Elf/Linux for our
Modula-2 development system.

Executables are not relocatable in Elf/Linux. The linker fixes up the image
for a given address; on Linux this defaults to 0x8048000. The executable
must be loaded at this address or it cannot run. This is because absolute
references, like those to global data, refer to the fixed-up load address.
Procedure calls are relative and position independent once linked. Shared
objects are fixed up at base address 0, but they contain fixups in the
executable image so that the runtime dynamic linker can load the image at
any address. Shared objects must be based at zero because Elf only has
R_386_RELATIVE fixups, which only work if the base address is zero. This
fixup is used only in shared objects on things like data references.
Basically an R_386_32 gets partially fixed up and then output as
R_386_RELATIVE.

There is the compiler option of generating position independent code for
shared objects, but given how the Gnu/Linux dynamic linker operates this is
not necessary if the proper fixups are output into the shared object. This
allows for much more efficient code generation, since position independent
code is larger and slower than "normal" code. Programs in Linux are probably
never compiled as position independent code.

Regarding the loading of a program image into the memory space of an
executing process.

1. If you have no control over the program to be loaded you are screwed.

2. If possible, link the program with relocation information. I am not sure
if the Gnu linker will do this for programs. The loader should be able to
relocate the program with this information. However, the loader may just
assume programs cannot be relocated and give up.

3. Even if you can load the executable, you will need to have the appropriate
symbols "exported" from the image. I am not sure if the Gnu linker will do
this for programs versus shared objects. For shared objects it "exports" all
public symbols in the image. The dynamic loader uses the dynamic symbol
table in the image to get symbol addresses.

--
Norman Black
Stony Brook Software
the reply, fubar => ix.netcom



 
 
 


Post by Ulrich Weiga » Fri, 30 Jun 2000 04:00:00



>> This would appear problematic, as executable files are not relocatable,
>> and the address to which they'd need to be loaded is already occupied
>> by your application.  (Well, you could move your app out of the way.)
>> I don't think there's any official way to do that.
>The official way is explained in the documentation for Linker Scripts:
>http://www.cygnus.com/pubs/gnupro/5_ut/b_Usingld/ldLinker_scripts.html
>and also there are shorthands -Ttext -Tdata -Tbss for /bin/ld.
>See the manual page.

I see that my last statement was a little ambiguous ;-)

Of course it is simple to link your executable to occupy different
virtual addresses.  This is not what I meant.

I was trying to say that I don't believe that there is an official way
(as easy as, say, dlopen()) to dynamically load a non-relocatable
executable.  The reason for there not being such a mechanism is
probably that in general it won't be particularly useful, due to
address conflicts.

--
  Dr. Ulrich Weigand

 
 
 

1. Executable is not executable

Hello guys,

I am having a problem with an executable I made that the system refuses to, ehm,
execute. And it's not the simple 'execute bit not set'.

I build all my libraries in two versions, static (.a) and dynamic (.so).
The version of my executable that links the .so version works without a glitch.
When I build the statically linked version, the result is as follows:

pluto_hagen% ls -al ./TestProg
-rwxr-xr-x    1 hagen    users    11102630 Apr 29 09:12 ./TestProg

pluto_hagen% file ./TestProg
./TestProg: ELF 64-bit LSB executable, Alpha (unofficial), version 1,
dynamically linked (uses shared libs), not stripped

So far, so good. Looks like TestProg should work.

pluto_hagen% ./TestProg
./TestProg: Command not found.

Oops. Further examination of the executable yields:

pluto_hagen% ldd ./TestProg
/usr/bin/ldd: ./TestProg: No such file or directory

pluto_hagen% strace ./TestProg
execve("./TestProg", ["./TestProg"], [/* 60 vars */]) = 0
strace: exec: No such file or directory

pluto_hagen% gdb ./TestProg
GNU gdb Red Hat Linux (5.2-2)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "alpha-redhat-linux"...
(gdb) run
Starting program: /tmp/TestProg
~hagen/.cshrc: running!
/tmp/TestProg: Command not found.

Program exited with code 01.
You can't do that without a process to debug.
(gdb) quit

pluto_hagen% nm ./TestProg
000000012027b140 r a1
000000012027b148 r a2
0000000120272990 r aa
0000000120301e20 d aatEchoPara
00000001201be590 T abort
0000000000000000 a *ABS*
0000000000000000 a *ABS*
...
lots and lots more symbols, no problem.

So, I am stumped. nm seems to be the only tool that likes my executable. Can
anybody give me a hint how to examine this further? Or even had this problem
before, and solved it?

Any help very much appreciated.

Ulrich

2. How to log only a specific level with syslogd ?

3. Executable Binary File vs. Executable Script File

4. UNIX

5. How to get the full path of an executable program from within that program itself

6. Help with Minicom

7. How to find fullpath of executable from within code?

8. syskonnect gigabit ethernet cards

9. How to get the full path of an executable from within it

10. Library load by an executable

11. loading two sets of libraries in the same executable for regression testing ?

12. Boot problems - File loaded is not executable.

13. Extendable programs and loading executable sections