shared libraries can be done right

Kristoffer Eriksson ske at pkmab.se
Sat Jun 1 16:41:50 AEST 1991


In article <18370001 at hpfcso.FC.HP.COM> mjs at hpfcso.FC.HP.COM (Marc Sabatella) writes:
> This assumes, as you explicitly stated, that
>the resolutions will be the same for each program.  Unfortunately this cannot
>be guaranteed.  The "malloc" example brought up by several people in response
>to Alex's claim that shared libraries should be "simple and elgant"
>demonstrates this well.  A library may make calls to malloc(), but different
>programs may provide their own definitions of malloc(), and the library's
>references would have to be resolved differently for each.  Some means must be
>provided for this.

That's fairly simple, if you add a level of indirection. "Any problem can
be solved by adding another level of indirection."

I think most shared libraries have the need for a data segment that is
instantiated for each program that links to it, to hold such data that
would otherwise have been static or global in an ordinary library, anyway.
Providing such a segment is no problem; it can be automatically allocated
when the shared library is loaded, by the system or by the application
itself, or can even be statically linked into the program binary. The
simplest way thereafter to inform the shared library of where its data
segment is, is to pass the address of the segment as an additional
parameter in every call to the library. Ideally, the compiler should do
that automatically and transparently for all calls to shared libraries,
and also automatically address all static and global variables in a
library off of that additional parameter when it knows it is compiling
a shared library. Whether or not the compiler helps out with this, it is
easy to implement. There are other possibilities too: using statically
linked wrappers for all library routines; adding a special memory page at
a fixed offset from the library's code pages in the application's address
space to hold such data, and addressing it relative to the PC or the
library's call address; or, if the library is given the same address in
all programs linking to it, simply addressing that page with fixed
addresses, the same for all.
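
As a rough illustration of the "additional parameter" variant, here is a
minimal C sketch written by hand, without any compiler support. The
library name "counter", the struct counter_lib_data and the routine
counter_next() are all made up for the example; the only point is that
what would have been a static variable now lives in a per-program segment
reached through the extra argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-program data segment for an imaginary "counter" library.
 * Each program that links to the library gets its own instance. */
struct counter_lib_data {
    int count;   /* would have been a static in an ordinary library */
};

/* Library routine. The shared code holds no data of its own; every access
 * to "static" data goes through the extra parameter. */
int counter_next(struct counter_lib_data *lib)
{
    assert(lib != NULL);
    return ++lib->count;
}
```

Two programs (or here, two segments) calling the same code keep separate
counts, since the state lives in the segment each caller passes in and not
in the library itself.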

Passing in pointers to global data in various ways has already been mentioned
previously in this discussion thread, but it has mostly centered around
giving malloc a pointer to malloc data, giving stdio a pointer to stdio
data, and so on, giving rise to a more or less pronounced need to pass
all these various pointers in to *every* library function that might
conceivably somewhere down the chain call these functions. For instance,
most stdio functions might very probably need the malloc pointer in order
to dynamically allocate I/O-buffers, in addition to its own stdio pointer,
and who knows what more? The solution is to just have one pointer and one
data segment for each shared library, which gives you access to the data
needed by *all* the functions and packages that reside in the library. This
one pointer is readily available to all callers of the library's functions,
since the pointer itself can simply be stored in an arbitrary global (to
the application) variable. This is actually how it is done on the Amiga
(which once again does things the right way). Additionally, the Amiga reserves
a special register for this pointer.
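
To make the "one pointer per library" idea concrete, here is a small C
sketch. All names in it (struct lib_data, lib_alloc, lib_setup_buffer,
LibBase) are invented for the illustration and are not the Amiga's actual
interface, and the toy bump allocator merely stands in for a real malloc
package. The thing to notice is that the stdio-like routine reaches the
allocator's state through the same single pointer it was given, so no
second pointer has to be passed down the call chain.

```c
#include <assert.h>
#include <stddef.h>

/* State of two "packages" that live in the same shared library. */
struct alloc_state { size_t used; char arena[256]; };
struct stdio_state { char *buf; size_t buf_size; };

/* The one data segment for the whole library (hypothetical layout). */
struct lib_data {
    struct alloc_state alloc;
    struct stdio_state stdio;
};

/* "malloc" package: a toy bump allocator handing out bytes from the arena.
 * It ignores alignment, which is good enough for char buffers. */
void *lib_alloc(struct lib_data *lib, size_t n)
{
    assert(lib != NULL);
    if (n > sizeof lib->alloc.arena - lib->alloc.used)
        return NULL;
    void *p = lib->alloc.arena + lib->alloc.used;
    lib->alloc.used += n;
    return p;
}

/* "stdio" package: needs the allocator, and finds the allocator's data
 * through the same library pointer. Returns 1 on success, 0 on failure. */
int lib_setup_buffer(struct lib_data *lib, size_t n)
{
    lib->stdio.buf = lib_alloc(lib, n);
    lib->stdio.buf_size = lib->stdio.buf ? n : 0;
    return lib->stdio.buf != NULL;
}

/* Application side: the segment, and one global holding its address.
 * Every call into the library passes LibBase along. */
static struct lib_data app_lib_segment;
struct lib_data *LibBase = &app_lib_segment;
```

An application call would then look like lib_setup_buffer(LibBase, 64),
with a compiler or a set of wrapper macros free to hide the first argument.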

Having this data segment, I'd say that not only should the library's data
reside there, but also all calls out of the library back to the application
or to other libraries should indirect through this page. Then you can easily
patch in the correct address to whatever malloc() function the library should
use in this particular program during the dynamic linking phase, just as you
resolve all calls *to* the library. You also have to store library data
segment pointers here for the other libraries that this library makes calls
to, since the library, unlike the application, cannot get them from the
application's global variables.
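
A sketch of that outgoing indirection in C might look as follows. The
function lib_bind() merely stands in for whatever the dynamic linking
phase would do, and every name here (struct lib_data, lib_bind,
lib_strdup, app_malloc) is hypothetical. The library code never names
malloc() directly; it calls through slots in its per-program data segment,
so two programs can have the same library code end up in different
allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Per-program data segment: besides ordinary data it holds the addresses
 * of everything the library calls outside itself. */
struct lib_data {
    void *(*malloc_fn)(size_t);
    void  (*free_fn)(void *);
};

/* Stand-in for the dynamic linking phase: patch in the program's own
 * definitions if it has any, else fall back to the C library's. */
void lib_bind(struct lib_data *lib,
              void *(*m)(size_t), void (*f)(void *))
{
    lib->malloc_fn = m ? m : malloc;
    lib->free_fn   = f ? f : free;
}

/* Library routine: allocates only through the slot in its data segment. */
char *lib_strdup(struct lib_data *lib, const char *s)
{
    assert(lib != NULL && lib->malloc_fn != NULL);
    size_t n = strlen(s) + 1;
    char *copy = lib->malloc_fn(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* One application's private malloc(); it just counts its calls. */
static int app_malloc_calls;
void *app_malloc(size_t n)
{
    app_malloc_calls++;
    return malloc(n);
}
```

A program binding app_malloc sees its counter go up when it calls
lib_strdup(), while a program bound with the defaults does not touch the
counter at all, even though both run the same library code.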

An exception might be calls from the library that are already required by
the library to go to a specific other library, rather than to be resolved
according to the application program's preferences. Those can be resolved
once and for all, if all shared libraries stay at the same addresses in all
programs that reference them. On the other hand, if you really want to
simplify everything, you might consider limiting yourself to *only* this
last kind of call out of shared libraries, and scrapping all dynamic call
resolution. You might ask yourself if it really is essential to be able to
influence where the library's own calls go. You could view the library as
a fixed package, including everything it itself calls.

> Were it not for the desire to allow this sort of
>interposition, shared libraries would be a great deal simpler than they are.
>This is also why a segmented architecture is no panacea, and why position
>independent code needs to have some indirection in it to be useful.

I don't think just a little bit of indirection makes it so much more
complicated, and it's not just position independence that causes it:
you will have it even with fixed addresses, if you want calls out of a
shared library to refer to different places depending on the program
linking to it ("reference independence"?).

-- 
Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden
Phone: +46 19-13 03 60  !  e-mail: ske at pkmab.se
Fax:   +46 19-11 51 03  !  or ...!{uunet,mcsun}!sunic.sunet.se!kullmar!pkmab!ske



More information about the Comp.unix.internals mailing list