Efficient Coding Practices

John R. Levine johnl at ima.ima.isc.com
Tue Oct 4 07:23:42 AEST 1988


In article <34196 at XAIT.Xerox.COM> g-rh at XAIT.Xerox.COM (Richard Harter) writes:
>>! [ first allegedly optimal code ]
>>!	tmp1 = dst;
>>!	tmp2 = src;
>>!	for (i=0;i<n;i++) *tmp1++ = *tmp2++;
>
>> [second allegedly optimal code]
>>	tmp1 = dst;
>>	tmp2 = src;
>>	tmp3 = dst + n;
>>	while (tmp1 != tmp3) {
>>		*tmp1++ = *tmp2++;
> [ third allegedly optimal code]
>	register int i;
>	...
>	tmp1 = dst;
>	tmp2 = src;
>	for (i=n;i;--i) *tmp1++ = *tmp2++;

On an Intel 386, assuming your compiler isn't smart enough to recognize such
loops and generate string move instructions, and assuming the two blocks
don't overlap, your best bet is probably:

	register int i;
	register int *rdst = dst, *rsrc = src;	/* int elements assumed */

	for (i = n; i--; )	/* i-- rather than --i, so element 0 is copied */
		rdst[i] = rsrc[i];

This uses the 386's scaled index modes and loop control instructions and
generates a loop two instructions long.  On non-VAX machines, after all,
*p++ does not generate particularly good code.
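For anyone who wants to try this, here is a minimal sketch of the pointer
version and the indexed count-down version as compilable functions.  The
function names, the int element type and the check_copies() helper are my
assumptions, not part of the original postings.  The count-down loop tests
i-- so that it visits n-1 down to 0 and does nothing when n is 0; n is
assumed to be non-negative and the blocks are assumed not to overlap.

```c
#include <assert.h>

/* Pointer-stepping copy, in the style of the quoted examples. */
void copy_ptr(int *dst, const int *src, int n)
{
	int *tmp1 = dst;
	const int *tmp2 = src;
	int i;

	for (i = 0; i < n; i++)
		*tmp1++ = *tmp2++;
}

/* Indexed count-down copy; assumes n >= 0 and non-overlapping blocks. */
void copy_index(int *dst, const int *src, int n)
{
	int i;

	for (i = n; i--; )	/* visits n-1, n-2, ..., 0 */
		dst[i] = src[i];
}

/* Hypothetical self-check: both variants must fill dst identically. */
void check_copies(void)
{
	int src[4] = { 10, 20, 30, 40 };
	int a[4] = { 0, 0, 0, 0 };
	int b[4] = { 0, 0, 0, 0 };
	int i;

	copy_ptr(a, src, 4);
	copy_index(b, src, 4);
	for (i = 0; i < 4; i++)
		assert(a[i] == src[i] && b[i] == src[i]);
}
```

Calling check_copies() from a test driver should pass silently.  Whether
either loop compiles to the short sequence described above depends entirely
on the compiler and target.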

The message here is that unless you have a specific performance problem in
a specific environment, such micro-optimization is a waste of time since
the "best" code depends heavily on the particular instruction set and
addressing model in use.
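
If you do have such a problem, the way to settle it is to time the
candidates on the machine in question.  What follows is only a rough sketch
of a timing harness: the names copy_fn, by_pointer, by_index, by_memcpy and
time_copy are hypothetical, the elements are assumed to be int, and clock()
is coarse, so the repetition count has to be large enough to give a
measurable time.  An optimizing compiler may also rewrite or hoist the
loops, so look at the generated code before trusting the numbers.

```c
#include <string.h>
#include <time.h>

/* Common signature for the copy routines under test (hypothetical). */
typedef void (*copy_fn)(int *dst, const int *src, int n);

void by_pointer(int *dst, const int *src, int n)
{
	int i;

	for (i = 0; i < n; i++)
		*dst++ = *src++;
}

void by_index(int *dst, const int *src, int n)
{
	int i;

	for (i = n; i--; )	/* assumes n >= 0 */
		dst[i] = src[i];
}

/* Library copy, for comparison; blocks must not overlap. */
void by_memcpy(int *dst, const int *src, int n)
{
	memcpy(dst, src, (size_t)n * sizeof *dst);
}

/* Run one routine reps times and return the CPU seconds used. */
double time_copy(copy_fn copy, int *dst, const int *src, int n, int reps)
{
	clock_t start = clock();
	int r;

	for (r = 0; r < reps; r++)
		copy(dst, src, n);
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call time_copy() once per routine with the same buffers, n and reps, and
compare the results.  If the differences are within the noise, that answers
the question for that environment.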
-- 
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ bbn | think | decvax | harvard | yale }!ima!johnl, Levine at YALE.something
Rome fell, Babylon fell, Scarsdale will have its turn.  -G. B. Shaw


