Delay-loaded DSO cannot read global data

Post by Peter S. Shenki » Sat, 22 May 1999 04:00:00



Hi,

I've been looking for the answer to this in the manuals.  I have a
mixed Fortran/C DSO that I'm trying to delay_load.  It needs to read
from a COMMON block that is initialized in the main program;  however,
it is getting values of 0 for all the COMMON data.  When I don't
delay_load it, it's fine.

The problem is exhibited by the following small pure Fortran example.  I
show main.f (a main program), bar.f (which will become the dso) and a
Makefile that makes four targets:

        nolib:  just links main.o and bar.o
        lib:    links main.o and libbar.a (an archive)
        dso:    links main.o and libbar.so (a dso)
        delay:  links main with -delay_load libbar.so

All except delay print "int =  3", as expected.  delay prints "int =  0".

Any thoughts?

Thanks.

main.f:
PROGRAM foo
  IMPLICIT NONE
  COMMON/foobar/int
  INTEGER int

  int = 3

  CALL bar
END

bar.f:
SUBROUTINE BAR
  IMPLICIT NONE
  INTEGER int
  COMMON/foobar/int

  WRITE( 6, '(A,I3)' )'int =', int
END

Makefile:
FFLAGS = -g -freeform -n32
all: nolib lib dso delay
main.o: main.f
	f90 -c $(FFLAGS) main.f
bar.o: bar.f
	f90 -c $(FFLAGS) bar.f
libbar.a: bar.o
	ar -rs libbar.a bar.o
libbar.so: libbar.a
	-rm so_locations
	ld -shared -all libbar.a -o libbar.so
nolib: main.o bar.o
	f90 -o nolib $(FFLAGS) main.o bar.o
lib: main.o libbar.a
	f90 -o lib $(FFLAGS) main.o libbar.a
dso: main.o libbar.so
	f90 -o dso $(FFLAGS) main.o libbar.so
delay: main.o libbar.so
	f90 -o delay $(FFLAGS) main.o -delay_load libbar.so
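
The comparison across the four targets can be scripted so that a mismatch stands out. What follows is a minimal sketch, not part of the original post: the helper name check_targets is hypothetical, and it assumes each program prints the expected line on standard output.

```shell
#!/bin/sh
# Sketch: run each program and report whether its output contains the
# expected text. check_targets is a hypothetical helper name.

check_targets() {
    # $1: expected output text; remaining arguments: programs to run
    expected=$1
    shift
    for prog in "$@"; do
        if "$prog" 2>/dev/null | grep -F -q -- "$expected"; then
            echo "$prog: ok"
        else
            echo "$prog: MISMATCH"
        fi
    done
}

# Example, after `make all` (assumes libbar.so is in the current directory):
#   LD_LIBRARY_PATH=.; export LD_LIBRARY_PATH
#   check_targets "int =  3" ./nolib ./lib ./dso ./delay
```

With the behavior described above, only ./delay would be reported as a mismatch.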

 
 
 

Post by David B Anders » Sat, 22 May 1999 04:00:00




>Hi,

>I've been looking for the answer to this in the manuals.  I have a
>mixed Fortran/C DSO that I'm trying to delay_load.  It needs to read
>from a COMMON block that is initialized in the main program;  however,
>it is getting values of 0 for all the COMMON data.  When I don't
>delay_load it, it's fine.

>The problem is exhibited by the following small pure Fortran example.  I
>show main.f (a main program), bar.f (which will become the dso) and a
>Makefile that makes four targets:

>        nolib:  just links main.o and bar.o
>        lib:    links main.o and libbar.a (an archive)
>        dso:    links main.o and libbar.so (a dso)
>        delay:  links main with -delay_load libbar.so

>All except delay print "int =  3", as expected.  delay prints "int =  0".

>Any thoughts?

[excellent f90 test case removed to save space]

The wrong things happen.
(I tried this with both the 7.2.1 and the 7.3 compiler/linker.)

9:01AM> elfdump -Dt -long dso delay simple *.so |grep foobar
[71]     0x4                4        STT_OBJECT  STB_GLOBAL       STO_DEFAULT         SHN_COMMON          foobar_ (0xb9)
[71]     0x4                4        STT_OBJECT  STB_GLOBAL       STO_DEFAULT         SHN_COMMON          foobar_ (0xb9)
[78]     0x10014094         4        STT_OBJECT  STB_GLOBAL       STO_DEFAULT         SHN_MIPS_ACOMMON    foobar_ (0xb9)
[28]     0x5fff40b8         4        STT_OBJECT  STB_GLOBAL       STO_DEFAULT         SHN_MIPS_ACOMMON    foobar_ (0x1)

('simple' is a main.o that does not call bar or link against the DSO).

What the above shows is that foobar_ in libbar.so gets the
SHN_MIPS_ACOMMON marking.

But that only works properly when libbar.so is loaded immediately,
not when libbar.so is delay-loaded.
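
To make the difference in markings easier to see, the address, section and symbol columns can be pulled out of the listing with awk. This is only a sketch: it works on rows pasted from the listing above rather than on live elfdump output, the function names sample_rows and summarize_sections are made up for illustration, and the column positions are assumed from this particular listing.

```shell
#!/bin/sh
# Sketch: reduce saved elfdump symbol rows to "address section symbol".
# The rows are copied from the listing above; the column positions
# (2 = value, 7 = section index, 8 = symbol name) are assumed from it.

sample_rows() {
    cat <<'EOF'
[71]     0x4                4        STT_OBJECT  STB_GLOBAL       STO_DEFAULT         SHN_COMMON          foobar_ (0xb9)
[71]     0x4                4        STT_OBJECT  STB_GLOBAL       STO_DEFAULT         SHN_COMMON          foobar_ (0xb9)
[78]     0x10014094         4        STT_OBJECT  STB_GLOBAL       STO_DEFAULT         SHN_MIPS_ACOMMON    foobar_ (0xb9)
[28]     0x5fff40b8         4        STT_OBJECT  STB_GLOBAL       STO_DEFAULT         SHN_MIPS_ACOMMON    foobar_ (0x1)
EOF
}

summarize_sections() {
    # Reads symbol rows on stdin and prints three columns per row.
    awk 'NF >= 8 { print $2, $7, $8 }'
}

sample_rows | summarize_sections
```

For the four rows above this prints two SHN_COMMON entries with value 0x4, followed by two SHN_MIPS_ACOMMON entries at 0x10014094 and 0x5fff40b8.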

What happens at run time in the erroneous case:

  The run-time linker, rld, sees the SHN_COMMON for foobar_ in
  'delay' and allocates space for it at run time.  Note that
  (for reasons I did not track down) the debugger gets a
  completely wrong address for the common variable.  This is
  easier to see if you change the test case to name the common
  variable 'intx' rather than 'int', since 'int' is a keyword to dbx
  (because of C).  I changed 'int' to 'intx' in my version of
  the test case.

  When libbar.so is delay-loaded, rld sees the SHN_MIPS_ACOMMON
  and uses the linker-allocated space for it (!), which puts it at a
  different location.

  Now you have two different locations for foobar_, and nothing
  works correctly.

You can watch the addresses being assigned by the run-time linker
by running delay with the following script:
export LD_LIBRARY_PATH="."
export _RLDN32_PATH=/usr/lib32/rld.debug
export _RLD_ARGS="-v -yfoobar_"
./delay
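
If you would rather not leave these variables exported in your shell, the same settings can be scoped to a single command. This is a sketch only: the wrapper name run_with_rld_trace is hypothetical, and the variable names and values are taken unchanged from the script above.

```shell
#!/bin/sh
# Sketch: run one command with the rld tracing settings from the script
# above, without exporting them into the calling shell.
# run_with_rld_trace is a hypothetical helper name.

run_with_rld_trace() {
    # $1: symbol to trace; remaining arguments: command to run
    sym=$1
    shift
    env LD_LIBRARY_PATH="." \
        _RLDN32_PATH=/usr/lib32/rld.debug \
        _RLD_ARGS="-v -y$sym" \
        "$@"
}

# Example:
#   run_with_rld_trace foobar_ ./delay
```

Using env keeps the settings local to the traced program, so later commands in the same shell still use the normal rld.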

It appears we have two bugs here:

a) ld does the wrong thing by marking the common in the delay-loaded
        DSO as SHN_MIPS_ACOMMON.

b) dbx gets the wrong address for 'intx' in foo.

(I will file bugs on the above issues).

It is possible that there is some documented restriction
applicable to the test case and I will enquire about that with
the relevant folks here (though I suspect the above are just
bugs).

Getting delay-load right (in the linker and rld) (for a
reasonable definition of 'right') has proven to be surprisingly
difficult.

Sorry about this.