C compiler implements wrong semantics

Mon Feb 3 07:55:30 AEST 1986

     On a DEC-20 running Stanford's KCC compiler, all the
post-increments yield 11 and all the pre-increments yield 12, as
follows: 
	b = a + a++   yields  11
	b = a++ + a   yields  11
	b = a + (a++) yields  11
	b = (a++) + a yields  11
	b = a + ++a   yields  12
	b = ++a + a   yields  12
	b = a + (++a) yields  12
	b = (++a) + a yields  12

     This would seem to correspond to the VMS C compiler and the
formal definition.  I think the discrepancy is that VMS C and KCC
were written with a formal definition in mind, while Unix C was
written as a kind of Ratfor for PDP-11 assembly code.
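
     For anyone who wants to see where 11 and 12 come from, here
is a rough sketch in C that spells out the ordering by hand rather
than relying on what any particular compiler does with a + a++.
It assumes a starts at 5, which is consistent with the results
above but is not stated anywhere; the function names and the
other_sees_new_a flag are made up for illustration.

```c
/* Sketch only: models whether the other operand is read before or
   after the side effect of ++ lands.  Names are hypothetical. */

/* Models  a + a++  and  a++ + a  (operand order does not matter). */
int sum_with_post_inc(int a, int other_sees_new_a)
{
    int value = a;                              /* a++ yields the old a */
    int other = other_sees_new_a ? a + 1 : a;   /* the plain "a" operand */
    return other + value;
}

/* Models  a + ++a  and  ++a + a. */
int sum_with_pre_inc(int a, int other_sees_new_a)
{
    int value = a + 1;                          /* ++a yields the new a */
    int other = other_sees_new_a ? a + 1 : a;   /* the plain "a" operand */
    return other + value;
}
```

With a = 5, sum_with_post_inc(5, 1) gives 11 and
sum_with_pre_inc(5, 1) gives 12, matching all eight KCC results.
Passing 0 for the flag gives 10 and 11 instead, which is what a
compiler would produce if it read the plain operand before applying
the increment.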

     The basic form of the generated code in all cases was
	MOVEI 5,6		; load constant 6 into register 5
	MOVEM 5,-3(17)		; store constant in a
	ADD 5,-3(17)		; add a to register 5 (a + a)
	SUBI 5,1		; subtract 1 (post-increment only)
	MOVEM 5,-2(17)		; store resulting value into b
The -n(17) operands simply refer to variables allocated on the
stack (PDP-10 stacks grow upwards).  The only difference between
the pre-increment and post-increment cases was that the
pre-increment case didn't have the SUBI.
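
     To check the arithmetic, the instruction sequence above can
be replayed on ordinary C variables.  This is only a sketch: reg5,
a and b stand in for register 5 and the two stack slots, and the
function name and its flag are my own invention.

```c
/* Sketch: replays the generated instructions one at a time. */
int replay_generated_code(int is_post_increment)
{
    int reg5, a, b;

    reg5 = 6;               /* MOVEI 5,6       */
    a = reg5;               /* MOVEM 5,-3(17)  */
    reg5 += a;              /* ADD 5,-3(17)    */
    if (is_post_increment)
        reg5 -= 1;          /* SUBI 5,1        */
    b = reg5;               /* MOVEM 5,-2(17)  */
    return b;
}
```

replay_generated_code(1) returns 11 and replay_generated_code(0)
returns 12, the same values as in the list of results.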

     This leads me to another question.  This generated code does
the job, but certainly isn't up to what an optimizing compiler
can do, much less hand-coded assembly code.  On the PDP-10,
hand-coded assembly code could do the computations in 2
instructions if the value of a is unimportant afterwards (and if
printf can take its argument in a register).  Against the 4 or 5
instructions generated here, that makes the compiled code at least
twice as long, and the gap widens in an inner loop if the compiler
could instead recognize the pattern as a load-once constant.

     Has much been done in the technology of optimizing C
compilations?
-------