Problems with GCC and/or VAX LINK

Steve Summit scs at adam.pika.mit.edu
Thu Mar 16 16:35:43 AEST 1989


Quotes and comments have been discussed to death, but I haven't
seen any discussion of the globalref issue.  (Perhaps it was
confined to gnu.gcc or comp.os.vms, where I wouldn't have seen
it.  Apologies for any redundancy.)

In article <1680 at levels.sait.edu.au> ccdn at levels.sait.edu.au (DAVID NEWALL) writes:
>I recently tried compiling the VAX LZCMP compression program on our
>VAX.  The VAX is running VMS 5.0 and the C compiler is GNU C (don't
>know what version).  I encountered a few problems:
>
>2.  The LZCMP program includes a DCL command table, which is...
>    referenced (in the program) via a
>    "globalref" variable; the declaration looks like this:
>        globalref dcl_table;    /* this is the DCL command table */
>    I assumed that "globalref" meant the same as "extern".  This turns
>    out not to be the case.  It seems that VMS has a "global" class for
>    symbols, and that extern variables aren't "global".  It also turns
>    out that extern functions _are_ global -- what I am saying is that
>    "extern dcl_table" didn't work (dcl_table didn't point to the right
>    place), but "extern dcl_table()" did!

External linkage from C under VMS is a real can of worms.  Given
some fixed historical precedents, the globalref/globaldef stuff
(and the resulting workarounds under compilers which don't have
it) is probably necessary, albeit messy and nonstandard.

First, you need to know that VMS object files, and the VMS
linker, deal with multiple Program SECTionS, or psects.  Unix
uses two or three analogous segments ("text," "data," and "bss"),
but VMS typically deals with many.  Psects have a number of
attributes: whether they're executable, whether they're writable,
what their alignment is, and -- here's an interesting one --
whether multiple psects of the same name, contributed from
different object files, concatenate or overlay each other.
Psects that overlay each other might sound useless at best or
dangerous at worst, but it turns out they're just what you want
for, say, Fortran COMMON blocks.
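
A toy model may make the concatenate/overlay distinction concrete.
The sketch below is only an illustration, not the VMS linker's
algorithm: the names (struct contribution, psect_size) are invented
here, every attribute other than concatenate/overlay is ignored, and
all contributions to a given psect are assumed to agree on that
attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: the only psect attribute tracked is how
   same-named contributions from different object files combine. */
enum combine { CONCATENATE, OVERLAY };

struct contribution {
    const char *psect;   /* psect name */
    enum combine how;    /* concatenate or overlay */
    size_t size;         /* bytes contributed by one object file */
};

/* Size of the named psect once all contributions are combined.
   Concatenated contributions are laid end to end, so sizes add.
   Overlaid contributions share the same storage, so the largest
   contribution determines the size. */
size_t psect_size(const struct contribution *c, size_t n, const char *name)
{
    size_t total = 0;

    assert(name != NULL);
    assert(c != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(c[i].psect, name) != 0)
            continue;
        if (c[i].how == CONCATENATE)
            total += c[i].size;
        else if (c[i].size > total)
            total = c[i].size;
    }
    return total;
}
```

Two object files that each contribute a 4-byte overlaid psect "X"
yield one 4-byte "X" in this model, which is the sharing a Fortran
COMMON block relies on; two concatenated code contributions simply
add up.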

When you say

	extern int x;
or
	int x = 3;

under DEC's VMS C compiler, you don't get a conventional defined
or undefined global symbol in a data psect.  You get a new psect,
named "X", of size sizeof(int), marked with the overlay
attribute.  All global variables named "x" in all modules
therefore end up sharing the same storage, as expected.  I don't
know for certain why this somewhat unusual and unexpected
implementation of C global variables was chosen, but I suspect it
had to do either with

     1.	An attempt to maintain compatibility with one of the
	weaker models for C external linkage, suggested but not
	required by K&R, but which many existing C programs
	assume.  (The "strong" model is "exactly one defining
	instance;" i.e. all but one declaration of a global
	variable must use the word "extern."  Unfortunately,
	various "common" models are, er, common, such as
	programs that say

		int x = 3;

	in one module and

		int x;

	in another.)

     2.	An attempt to make linking of C and Fortran modules easy,
	by mapping C externs to Fortran COMMON blocks.

Given the "common psect" implementation for conventional C
externs (and, for better or worse, that is the implementation),
if what you want is a regular defined global symbol in a data
psect, you've got to use globaldef (or globalref to reference
it), for that is exactly what globaldef and globalref do.

I doubt it would be easy to add these to gcc, since they show up
in the grammar.  gcc probably had to go with the common psect
model for regular externs for compatibility with VAX11C.

The reason that

	extern dcl_table();

worked is that functions do deal with conventional defined
symbols (as opposed to named psects).  Since all you did with
dcl_table was (I presume) pass its address back to the CLI
routines, the C compiler never had to generate any code other
than to push the address, so the fact that it was (incorrectly)
declared as a function didn't matter.  This is a nice workaround,
which I hadn't seen before (Did you invent it?  Congratulations!)
and it is probably the correct thing to use.

>3.  My investigations into "globalref" highlighted a problem with either
>    the VMS linker, or with both GCC and VAX C.  Essentially, I can compile,
>    link and execute the following program:
>        extern v1;
>        int v2;
>    ...
>                printf("&v1=%d\n&v2=%d\n", &v1, &v2);
>
>    Compiling with GCC, I get &v1 == &v2.  Compiling with VAX C I get
>    &v1 + 4 == &v2.  In either case, I think it's wrong.  I think that
>    I should get a linker error complaining about an undefined external
>    variable (v1).

VAX C worked because of the way overlaid psects work -- each
declaration of (in this case) v1, whether a "defining instance"
or not, generates a reference to a psect named "V1", so even if
there never is a defining instance, the psect gets created.  This
is mildly surprising, but no more so than some of the screwball
things the Unix compilers and linkers have always let you get
away with.  (The other day I discovered that Ritchie's pdp11
compiler accepts

	extern int x = 3;

although I don't know what it means.)

gcc probably ended up with &v1==&v2 because of a misunderstanding
or bug in its implementation of the named psect nonsense.

The big problem with implementing C externs as named psects is
that the linker won't then search for undefined externals (if it
did, the "expected" error for an undefined v1 would have resulted
from the above example).  Instead, undefined externals spring
into existence, as noted, without (here is the killer) being
loaded from libraries.  (This issue would have qualified for a
"frequently asked questions" list on comp.os.vms when last I
followed it.)  That is, if you have

	extern int x;

in an explicitly-loaded object file, and an object in a library
containing only

	int x = 3;

that library member won't get loaded, and x will remain 0.  The
solutions are either to request the library member explicitly, or
to use globalref/globaldef, or to add to the library member a
definition of some other required symbol (such as a function,
which links conventionally) to force the member to be loaded.
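
The library-search behavior described above can be sketched with a
small simulation.  This is a simplified model under stated
assumptions, not the real linker: struct module, link_modules and the
symbol name x_init are hypothetical, and the model assumes that only
unresolved conventional symbols trigger a library search while psect
contributions never do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 4

/* Hypothetical description of one object module. */
struct module {
    const char *name;
    const char *defines[MAX_NAMES]; /* conventional globals defined here */
    const char *needs[MAX_NAMES];   /* conventional globals referenced */
    const char *psects[MAX_NAMES];  /* named overlay psects (C externs) */
    bool loaded;                    /* true on entry = named explicitly */
};

/* Does a NULL-terminated name list contain sym? */
static bool lists(const char *const *names, const char *sym)
{
    for (size_t i = 0; i < MAX_NAMES && names[i] != NULL; i++)
        if (strcmp(names[i], sym) == 0)
            return true;
    return false;
}

/* Is sym defined by some module that is already loaded? */
static bool defined_by_loaded(const struct module *m, size_t n,
                              const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (m[i].loaded && lists(m[i].defines, sym))
            return true;
    return false;
}

/* Load library members until no loaded module has an unresolved
   conventional symbol that an unloaded member defines.  The psects
   field is deliberately never consulted: in this model a psect
   reference cannot pull anything in.  Returns the number of members
   loaded by the search. */
size_t link_modules(struct module *m, size_t n)
{
    size_t pulled = 0;
    bool progress = true;

    assert(m != NULL || n == 0);

    while (progress) {
        progress = false;
        for (size_t i = 0; i < n; i++) {
            if (!m[i].loaded)
                continue;
            for (size_t k = 0; k < MAX_NAMES && m[i].needs[k] != NULL; k++) {
                const char *sym = m[i].needs[k];
                if (defined_by_loaded(m, n, sym))
                    continue;
                for (size_t j = 0; j < n; j++) {
                    if (!m[j].loaded && lists(m[j].defines, sym)) {
                        m[j].loaded = true;
                        pulled++;
                        progress = true;
                        break;
                    }
                }
            }
        }
    }
    return pulled;
}
```

In this model a library member that only contributes psect "X" is
never loaded, however many explicit modules mention "X".  Giving the
member a function (here the made-up x_init) that an explicit module
references is what gets it loaded, which is the third workaround
listed above.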

At one point I heard that a future version of the VMS linker
would be able to search for psects, perhaps to solve this
problem; that may have been implemented by now.

If you think globalref and globaldef are weird, have you looked
at globalvalue?  A totally unfamiliar concept to C programmers,
though useful in a VMS environment.  If you say

	globalvalue int SS$_NORMAL;
	int retval = SS$_NORMAL;

you'll end up with something like

	movl $1, _retval

rather than

	.extern _SS$_NORMAL
	movl _SS$_NORMAL, _retval	; no $, no immediate constant,
					; SS$_NORMAL is here an address

That is, the compiler will generate code not to dereference a
location whose address the linker will fill in, but to use an
absolute value (which the linker will fill in).  Under Unix,
predefined magic constants are typically implemented with
#defines in standard header files; under VMS the linker will fill
them in from the standard libraries.  It turns out that you can
simulate globalvalue with the same kind of trick as for globalref --
you could say

	extern int SS$_NORMAL();
	int x = SS$_NORMAL;

and presto (ignoring type clash warnings) x would be set to 1.
(This does not mean that globalref and globalvalue are equivalent
and therefore redundant; the globalref workaround replaced
something like

	globalref dcl_table;
	cli$xxx(..., &dcl_table, ...);

with

	extern dcl_table();
	cli$xxx(..., dcl_table, ...);

Note the ampersand; globalref and globalvalue differ by a level
of indirection.)
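
The level-of-indirection point can be modeled in portable C.  This is
a toy sketch, not VMS code: the memory array and both function names
are invented for illustration, and the "symbol value" stands for
whatever number the linker ends up binding to the name.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_WORDS 16

/* Hypothetical memory image, indexed by word address. */
static int32_t memory[MEMORY_WORDS];

/* globalref-style use: the linker-supplied value is an address, and
   the generated code fetches the contents of that location. */
int32_t use_as_globalref(uint32_t symbol_value)
{
    assert(symbol_value < MEMORY_WORDS);
    return memory[symbol_value];
}

/* globalvalue-style use: the linker-supplied value is itself the
   operand, like an immediate constant; nothing is fetched. */
int32_t use_as_globalvalue(uint32_t symbol_value)
{
    return (int32_t)symbol_value;
}
```

With the same symbol value the two functions generally give different
answers: one returns what is stored at that address, the other
returns the number itself.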

                                            Steve Summit
                                            scs at adam.pika.mit.edu


