Shared Lib Question (ISC)

Sun May 19 06:16:26 AEST 1991

In article <7611 at segue.segue.com> jim at segue.segue.com (Jim Balter) writes:
>As I've already pointed out, this requires that every library routine that ever
>might call [a routine that ever might call ...] (I meant to imply this closure,
>Dan) malloc must also be a wrapper that passes the malloc arena pointer to the
>real routine, which must in turn pass the arena pointer to any other real
>routine that ever might call [a real routine that ever might call ...] real
>malloc.  Whether this is "implemented well" is certainly a matter of opinion.
>Passing this arena pointer around violates all reasonable coupling rules.
>Better to pass a pointer to a structure containing all global data,
>in a hidden register if possible, else as the first or last (by convention)
>arg to every routine.

No, it only requires that every routine which may ever call malloc, either
directly or indirectly, use the externally visible interface.

Any routine which directly calls malloc invokes malloc via its external
interface at a well known address.  All the normal rules for binding the
executable ensure that the external interface is bound - it is the only
symbol that can satisfy the undefined reference to malloc.  If, for example,
I invoke some stdio routine that gets a buffer from malloc, I will collect
the undefined symbols from that module and attempt to resolve them.  I find
malloc to be one of those symbols, and I load the static part.  If my
routine calls a routine that calls a routine ... I eventually find malloc
as an undefined symbol, and load the static part.

By placing the address of certain routines in a jump-table that resides
at a well-known address, it is possible to find the statically bound
malloc wrapper.  Internal routines do not need to have the arena handle
because they have the external interface address, and the external
interface is the only routine that needs to know about the handle.
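
A minimal sketch of the jump-table idea in C, under stated assumptions:
the table here is an ordinary array rather than something pinned at a
fixed address, and the names jump_table, SLOT_MALLOC, static_malloc,
install_jump_table and shared_get_buffer are all hypothetical.  The point
is only that the shared routine needs a slot number, not the arena handle.

```c
#include <assert.h>
#include <stddef.h>

/* Slot numbers are fixed by convention between library and executable. */
enum { SLOT_MALLOC = 0, SLOT_COUNT };

typedef void (*slot_fn)(void);

/* Stand-in for the table at a well-known address (assumption: a real
   system would place this at a fixed location in every process). */
slot_fn jump_table[SLOT_COUNT];

/* Unshared side: the statically bound wrapper and its private data.
   A trivial bump allocator stands in for the real arena. */
static double pool[64];
static size_t pool_used;            /* counted in doubles */

void *static_malloc(size_t n)
{
    size_t need;
    void *p;

    if (n > sizeof pool)
        return NULL;
    need = (n + sizeof(double) - 1) / sizeof(double);
    if (need > 64 - pool_used)
        return NULL;
    p = &pool[pool_used];
    pool_used += need;
    return p;
}

/* Run once at startup by the unshared part of the program. */
void install_jump_table(void)
{
    jump_table[SLOT_MALLOC] = (slot_fn)static_malloc;
}

/* Shared side: an internal routine that wants a buffer.  It never sees
   the arena; it only looks up the published entry point. */
char *shared_get_buffer(size_t n)
{
    void *(*m)(size_t) = (void *(*)(size_t))jump_table[SLOT_MALLOC];

    assert(m != NULL);              /* table must be installed first */
    return (char *)m(n);
}
```

After install_jump_table() has run, any shared routine can call
shared_get_buffer() without being handed a pointer to the arena.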

>Better to have a wrapper for every function and pass one pointer
>for every function than to try to guess ahead of time which functions
>might lead to a call to malloc and which functions might lead to
>a reference to _iob.  Of course, this requires a single structure
>definition that contains the malloc arena as well as _iob and any
>other globals that might be needed, which is grossly bad coupling,
>although you could build the structure up cleanly from pieces in
>various modules, avoiding the need to couple all this disparate info.

Sure, if that were the case, it would make sense to pass a pointer
to the global shared data.  However, it isn't the case - it is
possible to determine the name of every routine that is invoked, and
to bind the required parts.

Here is a construction proof that malloc can be implemented in the
fashion I describe -

	1). It is possible to code a function which is pure
	    text and has no external data references (simple,
	    we all agree you can have a pure-code shared library
	    routine) and have it referenced by non-library
	    code.
	2). It is possible to code a function which is impure
	    and unshared.  (again pretty simple - we do it
	    all the time.)
	3). It is possible to have a shared library function
	    invoke unshared code (we publish the address of
	    the function at some well-known address, so no
	    linkage is required)
	4). Define malloc() to be an unshared library routine
	    which contains private data [ Just a definition,
	    and permitted by 2) and 3) ] and is invoked by
	    an unknown number of shared library functions.
	5). Implement malloc() so that it invokes shared
	    pure-text library functions and passes locally
	    defined static variables as arguments [ permitted
	    by 1) and 2) - that is, I can make a shared library
	    routine, and now I'm going to invoke it. ]
	6). Define the shared library function which our
	    unshared malloc() invokes to perform the
	    operations which malloc() traditionally performs,
	    using a passed structure pointer as its argument.
	    [ another definition - our pure-code malloc'()
	    can do whatever it wants with the contents of
	    the structure ].
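
Steps 4) through 6) can be sketched in C roughly as follows.  This is an
illustration, not a real allocator: struct arena, malloc_shared (playing
the role of malloc'()) and my_malloc (playing the role of the unshared
malloc(), renamed so it does not collide with the C library) are
hypothetical names, and the arena is a fixed pool with a bump pointer
and no free().

```c
#include <stddef.h>

/* The structure passed from the unshared wrapper to the shared code. */
struct arena {
    unsigned char *base;    /* start of the pool */
    size_t size;            /* total bytes in the pool */
    size_t used;            /* bytes handed out so far */
};

/* Step 6: the shared, pure-text routine.  It touches no global data;
   everything it needs arrives through the pointer argument. */
void *malloc_shared(struct arena *a, size_t n)
{
    size_t rounded = (n + 7u) & ~(size_t)7u;    /* round up to 8 bytes */
    void *p;

    if (rounded < n || rounded > a->size - a->used)
        return NULL;                            /* overflow or pool full */
    p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Steps 4 and 5: the unshared routine.  The private data lives here,
   one copy per process, and is passed down as an argument. */
static double pool[512];                        /* double for alignment */
static struct arena private_arena = { (unsigned char *)pool, sizeof pool, 0 };

void *my_malloc(size_t n)
{
    return malloc_shared(&private_arena, n);
}
```

Callers see only my_malloc(); the arena pointer is passed exactly once,
at the boundary between the unshared wrapper and the shared code.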

By induction every other library function can be implemented in the
same manner.  Indeed, malloc'() probably needs to invoke a function
which is a wrapper around the sbrk() or brk() system calls since they
traditionally keep the end of the break as a variable and one or
the other calls the system after some manipulation has been performed
on the user's argument and the current break variable value.
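
The same split applied to sbrk() might look like the sketch below.  It is
a simulation under stated assumptions: struct brk_state, sbrk_shared,
my_sbrk and fake_sys_brk are hypothetical names, and the "system call" is
a stub that accepts any break inside a small static buffer instead of
asking the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-process state handed to the shared code. */
struct brk_state {
    uintptr_t cur_break;                /* the current break variable */
    int (*sys_brk)(uintptr_t new_end);  /* raw call: 0 on success */
};

/* Shared, pure-text part: adjust the caller's increment against the
   current break, call the system, then update the break variable. */
void *sbrk_shared(struct brk_state *s, ptrdiff_t incr)
{
    uintptr_t old_end = s->cur_break;
    uintptr_t new_end = old_end + (uintptr_t)incr;  /* handles incr < 0 */

    if (s->sys_brk(new_end) != 0)
        return (void *)-1;
    s->cur_break = new_end;
    return (void *)old_end;
}

/* Unshared part: private data plus a stand-in for the system call. */
static char fake_heap[1024];

static int fake_sys_brk(uintptr_t new_end)
{
    uintptr_t lo = (uintptr_t)fake_heap;

    return (new_end >= lo && new_end - lo <= sizeof fake_heap) ? 0 : -1;
}

static struct brk_state private_brk;    /* filled in on first use */

void *my_sbrk(ptrdiff_t incr)
{
    if (private_brk.sys_brk == NULL) {
        private_brk.cur_break = (uintptr_t)fake_heap;
        private_brk.sys_brk = fake_sys_brk;
    }
    return sbrk_shared(&private_brk, incr);
}
```

As with the malloc sketch, the break variable stays in unshared data and
the shared routine sees it only through the pointer it is given.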

>>[ Hint: How does the system get errno out of the kernel and
>>  into the user space, if it is a user space global variable? ]
>
>Here's where DMR's bad CS becomes evident; system call interfaces should
>have taken a pointer through which to store the error number (or an error
>structure, for more detailed info), instead of the global errno hack.

That then is your answer.  However Ritchie did it, I'll do it too ;-)
-- 
John F. Haugh II        | Distribution to  | UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 255-8251 | GEnie PROHIBITED :-) |  Domain: jfh at rpp386.cactus.org
"If liberals interpreted the 2nd Amendment the same way they interpret the
 rest of the Constitution, gun ownership would be mandatory."