Efficient Coding Practices

Richard Harter g-rh at XAIT.Xerox.COM
Tue Oct 4 02:33:50 AEST 1988


In article <846 at proxftl.UUCP> francis at proxftl.UUCP (Francis H. Yu) writes:
>In article <34112 at XAIT.XEROX.COM> g-rh at XAIT.Xerox.COM (Richard Harter) writes:

>!Let's hand optimize this.  Instead of using indices (presumably less
>!efficent) we will use pointers.  Since we don't want to destroy dst
>!and src as pointers we need temporaries.  Thus we might write

>!	tmp1 = dst;
>!	tmp2 = src;
>!	for (i=0;i<n;i++) *tmp1++ = *tmp2++;

>The better code is 
>	tmp1 = dst;
>	tmp2 = src;
>	tmp3 = dst + n;
>	while (tmp1 != tmp3) {
>		*tmp1++ = *tmp2++;
>	}

	The point of this little exercise was to illustrate that using
pointers quite often means extra lines of code and temporaries.  However
your "better" code is not necessarily better.  In many machines it is more
efficient to control loops by counting a register down to zero and escaping
on zero than it is to exit on a compare.  A good compiler will do exactly
that with the sample code.  If we are hand optimizing, we write

	register int i;
	...
	tmp1 = dst;
	tmp2 = src;
	for (i=n;i;--i) *tmp1++ = *tmp++;

Your while loop, in pseudo machine language, runs something like this

	mv dst r1			# load register 1 with dst
	mv src r2			# load register 2 with src
	mv *n,r3			# move n to register 3
	ble a2				# n already done, skip loop
	add r1,r3			# add n to get termination
	br a1				# loop test at bottom, go there
a0:	mv *r2++ *r1++			# Move src to dst, incrementing
a1:	compare r1,r3			# compare term with with dst ptr
	bne a0				# not eq, go to loop start
a2:

The corresponding code for a countdown loop is

	mv dst r1			# load register 1 with dst
	mv src r2			# load register 2 with src
	mv n r3				# load register 3 with n
	ble a2				# n already done, skip loop
	br a1				# loop test at bottom, go there
a0:	mv *r2++, *r1++			# Move src to dst, incrementing
a1:	dec r3				# Decrement count down index
	bgt a0				# index not negative, do more
a2:

The loop bodies are the same except that the compare is replaced with
a decrement, which may be faster (it is on most machines) and is never
slower (as far as I know).  Moreover many machines have a decrement and
test instruction.  For example, the PDP-11 has an SOB instruction which
combines the two.

Note: It is more efficient to put the loop test at the bottom of the loop
and branch there to initiate the loop; it saves a branch instruction.

Lesson:  If efficiency is a concern, countdown loops are faster than
compare value loops.

Lesson:  If one is optimizing code one has to think about what the machine
will have to do in different implementations when comparing them.
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.



More information about the Comp.lang.c mailing list