Noalias trivia question

Larry Jones scjones at sdrc.UUCP
Sun May 1 02:44:49 AEST 1988


In article <1164 at maccs.UUCP>, gordan at maccs.UUCP (gordan) writes:
> In all the time noalias was discussed on comp.lang.c, just about no one
> spoke up in favor of it.  In the recent X3J11 meeting it was reportedly
> voted down by well over the required (2/3 ?) majority.
> 
> And yet, it must have once gotten a similar majority in favor, not so
> very long ago (January?).

December.

> So, like, who on X3J11 will admit to having voted for (or worse,
> proposed) noalias in the first place?  Why did it seem like a good idea
> at the time, and what were the reasons for dropping it (apart from
> universal opprobrium).  Enquiring minds &c.

Well, being one of those fools who rush in where angels fear to tread, I will
admit to having voted for noalias (before you start sending letter bombs, I
also voted to remove it).

How did noalias come to be?  One of the committee's goals has been to define
a language that will be broadly usefull.  This has lead to all kinds of
conflicting demands that have to be reconciled somehow.  For instance, people
using C for general programming want the compilers to optimize the heck out
of everything so their programs run as fast as possible.  People writing
system software like memory mapped device drivers want no optimization at
all.  These conflicting demands eventually lead to the "volatile" keyword
which turned out to be a double win -- the general types get the optimization
they want and the system types get the control they want plus they get all of
the optimizations for the parts of the code that aren't accessing hardware.

One of the issues relating to optimization that the committee has wrestled
with for a long time is the problem of aliasing.  When you're trying to
optimize generated code, it is, of course, necessary to make sure that the
resulting code still gets the "right answer".  To do that, you need to be
very careful to keep track of exactly how and where the value of an expression
can be changed.  This is particularly important when you have some type of
parallel processing since having multiple CPUs makes the benefits of changing
the order in which things occur much greater since many things can be going
on at the same time.

Unfortunately, keeping track of what's going on in a C program is next to
impossible.  Most of the problems are caused by pointers -- although some
pointers point are the only way to get to the object they point to, most
pointers are aliases for objects that can be accessed in other ways as well.
By doing global flow analysis of the entire program it is (theoretically)
possible to figure out all the things that a pointer can possibly point to
but this is extremely difficult.  It is impossible given that C allows
separate compilation since this means that the compiler can't look at the
whole program.  Traditionally, this has been dealt with by assuming that
anything that has had its address taken or could have had its address
taken (e.g. globals) is changed any time something is changed through a
pointer.  Similarly, changing any value through a pointer is assumed to
change the values pointed to by all pointers.  Although these pessimistic
assumptions are quite safe, they prevent most optimizations.  For instance,
most users of vector processors are surprised that the following function:

	void addvec(double a[], double b[], double c[], int n) {
	   int i;

	   for (i = 0; i < n; i++)
	      a[i] = b[i] + c[i];
	   }

does not result in a single vector addition (especially since writing the
"same" function in FORTRAN does).  Actually, "angry" is probably a better
term that "surprised" in this case.  The reason for this is, of course, that
in C there is nothing wrong with calling this function like:

	addvec(a+1, a, a+1, sizeof a / sizeof a[0] - 1);

which sums the vector into its last element.  This is invalid in FORTRAN
which is why the FORTRAN version can vectorize but the C version can't.

This turns out to be a major issue for every vendor with parallel hardware.
Lest you think that this is a fairly esoteric group, just remember that every
IBM PC with a math coprocessor is a parallel machine (although existing 
software doesn't make much use of the capability yet).  Also note that this
is not a specsmanship battle either -- users buy parallel machines to get
performance and they get very hostile when they don't get it.  Minor details
like their language of choice not being suited to vectorization are not of
interest.

This is the atmosphere that existed at the December meeting.  Many of the
people with compilers for parallel machines were saying that the lack of a
method for specifying aliasing was sufficient reason for a "no" vote on
the standard and proposed a number of possible solutions.  One was to change
the meaning of parameters declared as arrays to be pointers to things that
don't overlap each other.  This was accepted fairly well except that there
was concern that it might break existing code which uses array and pointer
notation interchangably.  A possible solution to this objection was to only
assign the new meaning in prototypes, but that introduced an incompatibility
between declarations in and out of prototypes that many people objected to.
Other interesting variations on syntax were introduced and rejected for
various reasons.  Also, it was pointed out that this only solved the problem
with function arguments, it did not solve the general problem.

None of the proposals got the necessary 2/3 vote for adoption, but a clear
majority of the committee was interested in finding a solution to the problem.
Finally, the people who were most interested in the issue agreed to form a
working group to try to concoct a solution which would be acceptable to the
majority of the committee.  On Friday, the working group presented the
solution they had arrived at -- "noalias".  What was presented was only the
concept, the exact wording still had to be worked out.  It seemed to be
everything anyone could want: the default was that everything was aliased so
no existing code would be broken, you could specify it on parameters to get
the non-overlapping parameters (which all the library functions except
memmove required - now we could specify the restriction in the prototypes
instead of having a disclaimer in English!), and you could use it on pointers
and objects just like "const" and "volatile" to solve the general problem.
What more could anyone want?

Well, there were a number of concerns, of course.  We hadn't much time to
look at it and think about the possible consequences.  We were trying to get
the draft out for the second (and last) public review.  We didn't have the
exact wording.  Interestingly, noone brought up the problem that finally
caused its downfall - it was too hard for mere mortals to get right;
particularly in large programs and during maintenance.  Guess we were all
wearing our wizard hats when we should have been wearing our manager hats.
Finally, since we had all agreed that we wanted to solve the problem and
this looked like a good solution, we agreed to add it to the draft.  If it
really was a good solution (and it seemed like it was) then it would have
been reviewed and, even if the words weren't quite right, we could just
change them editorially without having to have another public review.  Of
course, having all those other people review it rather that just us would
point out any problems we may have missed (did I just hear some ominous
music in the background?).

After that, you know what happened.  It got mentioned here and in other forums
as well.  The reactionaries immediately flamed it, the rational expressed
concern but reserved judgement pending the exact wording.  The exact wording
came and was judged incomprehensible.  The rational, having carefully
considered all the ramifications, denounced it.  Dennis Ritchie, father of C,
denounced it.  Henry Spencer put Dennins's denouncement into his signature.
Many people made quite persuasive, rational, arguments that it was not
"a good thing".  April came and the committee, burnt, blackened,
covered with soot and smelling of smoke, reconvened.  Although only a very
few of the committee members are foolish enough to post things here and
expose themselves to the direct ire of the net, many of them do read it.
The rational arguments won.  Although Dennis showed up in person for the
first time in the history of the committee, most of the members had already
made up their minds and noalias was summarily removed.

There was a brief flurry of activity to try to fall back to one of the other
proposals (most notably the array arguments may not overlap proposal), the
committee was so gun-shy that even an attempt to simply declare array args
an obsolescent feature so it could be changed in a future standard was
defeated.

And that's the way it was (at least that's the way I remember it, which may
or may not have much to do with reality).

----
Larry Jones                         UUCP: ...!sdrc!scjones
SDRC                                AT&T: (513) 576-2070
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150
"When all else fails, read the directions."



More information about the Comp.lang.c mailing list