Array indexing vs. pointers...

Tue Sep 27 13:39:29 AEST 1988

In article <8809191521.AA17824 at ucbvax.Berkeley.EDU> U23405 at UICVM
(Michael J. Steiner) writes:
>First, I have a question. How and why is array indexing slower than
>pointer arithmetic? They are very similar. Also, I think that compilers
>should automatically translate array indexing into pointer arithmetic when
>dealing with arrays in C. Any comments, opinions?

I waited a long time to respond to this one, because I expected
everyone to post the obvious answer.  Yes, a[i] is equivalent to *(a+i),
*in* *C*; but what about in machine language?  Y'know, the stuff the
CPU really eats?  "The definition of 'adding 1 to a pointer,', and by
extension, all pointer arithmetic, is that the increment is scaled by
the size in storage of the object is pointed to."  [K&R 1st ed, p. 94]
This implies multiplying the pointer offset or array index by the size.
Granted, the size is known, is one for char's, and is often a power of
two; but what about structures that are seventeen bytes long?

So, let's consider the (*un*optimized) translation of the following
loops into a pseudo-C assembler language, where pointers arithmetic is
scaled by the smallest addressable cell (probably a byte):

C:
	for ( i = 0; i < N; ++i )
		a[ i ] = b[ i ];	/* a and b are of type t */
		/* or equivalently, *(a+i) = *(b+i)
machine language:
	i = 0
	begin_for:
	if ( ! ( i < N ) ) goto end_for
	tmp = i * sizeof t
	*( a + tmp ) = *( b + tmp )	; probably a single instruction
	i += 1
	goto begin_for
	end_for:
TOTAL:  N+1 integer assignments, N increments, N copies; N multiples!

C:
	for ( p = a, q = b, i = 0; i < N; ++i )
		*p++ = *q++;
machine language:
	p = a
	q = b
	i = 0
	begin_for:
	if ( ! ( i < N ) ) goto end_for
	*p = *q		; on a PDP-11, these next three lines
	p += sizeof t	; can be replaced by a single instruction
	q += sizeof t	; ditto VAX, 680x0, etc.
	i += 1
	goto begin_for
	end_for:
TOTAL:  3 integer assignments, 3*N increments, N copies; 0 multiplies!

We've traded N multiplies and N assignments for 2*N increments, which
can easily (on a PDP-11 or similar CPU) be snuck into auto-increment
modes of the copy instruction.  Yes, an optimizer would reduce the
first to the second, or better.  But one of Ritchie's goals was to
make it easy for the compiler generate good low-level code from clever
high-level code.

Moral:  A really good C programmer should know assembler language and
compiler technology to write ordinary applications efficiently.

Homework:  Discover what kind of code your compiler generates for
*p++ = *q++ when pointing to a char, int, or odd length structure.  If
p and q are auto, not register, is *p++ = *q++ the same as:
	*(++p - 1) = *(++q - 1)		/* C, not assembler */
(that is to say, very ugly code), or possibly equivalent to:
	tmp = *p = *q, ++p, ++q, tmp

Paul S. R. Chisholm, psrc at poseidon.att.com (formerly psc at lznv.att.com)
AT&T Bell Laboratories, att!poseidon!psrc, AT&T Mail !psrchisholm
I'm not speaking for the company, I'm just speaking my mind.