C'mon, guys! (Really, pointer pedagogy)

Chris Torek chris at umcp-cs.UUCP
Sat Jun 21 08:35:25 AEST 1986


[Warning: this is not an article about `C', but rather an article
about `about C'.  Nothing truly technical is contained herein.]

In article <748 at eneevax.UUCP> phaedrus at eneevax.UUCP (Praveen Kumar) writes:
>I believe that a lot of the notation in C is derived from PDP assembly
>language.  I think (it has been a long time since I mucked around with
>PDPs) that the increment, "++", and the dereferencing, "*" operators are
>straight out of PDP assembly.

This is not really for me to say, for I was not in on the creation
of the C language, yet I feel I should answer this.  (If I do a
good enough job, perhaps I can even provoke DMR into a few minor
corrections. :-) )  Was the C notation derived from PDP-11 assembly?
I think the answer here is both no and yes.  Much C notation was
certainly influenced by '11 assembly; but I think `derived' is too
strong.

DEC PDP-11 assemblers use `@', not `*', but let us assume that Ken
Thompson had been using `*' with whatever assembler he was using.
(The 4BSD Vax assembler uses `*', so it is reasonable to guess that
this was handed down from an earlier era.)  First contrast

	mov	*(r4)+,-(r5)

with

	*--p = **q++;

(if I have not botched the '11 assembly; I have never used an '11).
Close?  Well, somewhat: I can see a resemblance, at any rate.

Now step back a bit and consider the notation in and of itself.
We have here three basic operations: `--p', `q++', and `*'.  From
early mathematics notation we can take `-' as `subtract' and `+'
as `add'.  `*' is an aberration; it looks more like one of the
generic binary operation symbols used in group theory than anything
else (though this may depend on your terminal's font).  As for why
there are two each of `+' and `-', I think we can put that down to
the exigencies of parsing.  Now we have `-p' and `q+'---but what
might these mean?  Well, if `-' is subtract and `+' is add, then
we have `subtracted p' and `q added'.  There is nothing explicitly
being subtracted or added, so it is perhaps reasonable to assume
one of the classical computer science numbers, namely `zero', `one',
and `many'.  Adding and subtracting zero is useless, and adding
and subtracting many is ambiguous, so we will add and subtract one.
I think it is also a small step to say that the `-' is `before'
`p', and the `+' is `after' `q', so we should do the subtraction
`before' and the addition `after'.  Before and after what?  Here
I resort to fiat and say `before and after *, which we define to
mean indirection'.
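
The reading arrived at above (subtract one `before' the indirection,
add one `after' it) can be checked with a small C sketch.  The
struct and function names here are hypothetical, invented only for
illustration, and the mapping of `q' to r4 and `p' to r5 is an
assumption based on the `mov' line quoted earlier, which the author
himself flags as possibly botched.

```c
/* Hypothetical stand-ins for the two registers in the mov example:
 * q plays the part of r4 (source), p the part of r5 (destination). */
struct regs {
    int  *p;
    int **q;
};

/* The terse form from the article, applied to the struct members. */
void step_terse(struct regs *r)
{
    *--r->p = **r->q++;
}

/* The same step with each `before' and `after' written out. */
void step_spelled_out(struct regs *r)
{
    int **src = r->q;   /* q++ : use the current value of q ...      */
    r->q = r->q + 1;    /* ... and add one afterwards                */
    r->p = r->p - 1;    /* --p : subtract one first ...              */
    *r->p = **src;      /* ... then store through the new p; the     */
                        /* source is fetched through two indirections */
}
```

Starting with p just past the end of an int array and q at the start
of a table of int pointers, one call to either function should leave
p one element lower, q one element higher, and the int that the old
*q pointed at stored in the slot p now points to.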

Of course, all this does is demonstrate that the PDP-11 assembly
notation was in some respects `reasonable', and not that the notation
appears in C for that particular reason.  In order to refute the
quoted statement above, I must find `a lot of C notation' that does
not seem to be `derived from PDP assembly'.  So let us consider
some more C notation, in particular in expressions.

1.  Arithmetic.  C arithmetic seems to be quite conventional for
    post-FORTRAN languages.  `a + b * (c - d)' does not look much
    like a series of `sub', `mul', and `add' instructions to me.

2.  Structures.  Structure member access via `.' is again very
    conventional; it looks like PL/I, among others.  Pointer
    member access is a little different.  `p->member' can indeed
    be done with a single '11 instruction in many cases, yet
    the `->' notation itself does not appear in '11 assembly.

3.  Logical operations.  `&&' and `||' have no direct counterpart
    in '11 assembly, and must be implemented with rather complex
    series of tests and branches.
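
The three points above can be sketched in C as follows.  This is
only an illustration under stated assumptions: `struct point' and
all the function names are hypothetical, and the `steps' and
`branches' versions merely mimic the shape of instruction-at-a-time
code; they are not actual PDP-11 compiler output.

```c
#include <stddef.h>

struct point { int x; int y; };   /* hypothetical example type */

/* 1. Conventional infix arithmetic ... */
int arith_infix(int a, int b, int c, int d)
{
    return a + b * (c - d);
}

/* ... and the same work done one operation at a time. */
int arith_steps(int a, int b, int c, int d)
{
    int t = c;
    t -= d;   /* sub */
    t *= b;   /* mul */
    t += a;   /* add */
    return t;
}

/* 2. p->member is shorthand for (*p).member. */
int y_arrow(const struct point *p)    { return p->y; }
int y_star_dot(const struct point *p) { return (*p).y; }

/* 3. && evaluates its right operand only if the left one is true ... */
int positive_x(const struct point *p)
{
    return p != NULL && p->x > 0;
}

/* ... which, written out by hand, is a series of tests and branches. */
int positive_x_branches(const struct point *p)
{
    if (p == NULL)
        goto no;      /* first test fails: second test is skipped */
    if (p->x <= 0)
        goto no;
    return 1;
no:
    return 0;
}
```

Each pair of functions is meant to return the same value for the
same arguments; in particular, both `positive_x' variants return 0
for a null pointer without ever dereferencing it.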

No doubt more examples can be found by those cleverer than I;
but I think this much is sufficient.  I think I will close by
saying that the notation used in C is simply a well-coordinated
set of notations borrowed from other places and languages,
including but not limited to PDP-11 assembly, and modified as
appropriate to obtain that coordination.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris at umcp-cs		ARPA:	chris at mimsy.umd.edu


