Duff's device.

Eric C. Pearce pearce at TYCHO.YERKES.UCHICAGO.EDU
Sat Sep 10 07:09:55 AEST 1988


After trying Doug Schmidt's Duff timer on our Sun 3/260, with the
Sun-supplied cc compiler, I found Duff's device considerably faster
than the non-Duff copy (almost a factor of two).  However, the test is
unfair in the sense that no attempt is made to optimize the non-Duff
copy without resorting to syntax nightmares like the Duff code.
Additionally, Schmidt's test is somewhat unfair since it leaves some
loop initialization (such as the initialization of A and B) out of the
loop.  The effect will be quite small if the copy is large, though.

Personally, I prefer this "conventional" copy fragment:

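      A = array1;
      B = array2;

      n = Count / 8;                    /* full 8-element blocks; n must be signed */
      for (i = Count % 8; i > 0; i--)   /* copy the leftover elements first */
         *A++ = *B++;
      while (--n >= 0) {                /* then copy 8 elements per pass */
         *A++ = *B++;
         *A++ = *B++;
         *A++ = *B++;
         *A++ = *B++;
         *A++ = *B++;
         *A++ = *B++;
         *A++ = *B++;
         *A++ = *B++;
      }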

From a performance standpoint, it falls only 2% behind Duff's device and
is 102% more readable and not "kludgy".
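
For anyone who wants to check the two side by side, here is a minimal
sketch, not Schmidt's timer.  The function names copy_conventional,
copy_duff and copies_agree are my own, and the element type (short) and
the 64-element buffers are assumptions.  The Duff version follows the
well-known form but increments the destination pointer, since this is a
memory-to-memory copy rather than a write to a device register.  It
only checks that both copies produce the same result; it does no
timing.

```c
#include <string.h>

/* The "conventional" fragment wrapped in a function (hypothetical name):
   leftover elements first, then full blocks of 8. */
static void copy_conventional(short *A, const short *B, int Count)
{
    int i, n;

    n = Count / 8;
    for (i = Count % 8; i > 0; i--)
        *A++ = *B++;
    while (--n >= 0) {
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
    }
}

/* Duff's device adapted to a memory-to-memory copy (hypothetical name).
   The classic form assumes count > 0, so guard against that here. */
static void copy_duff(short *to, const short *from, int count)
{
    int n;

    if (count <= 0)
        return;
    n = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}

/* Returns 1 if both routines copy the first `count` elements correctly
   and leave the rest of the destination untouched, 0 otherwise. */
static int copies_agree(int count)
{
    short src[64], a[64], b[64];
    int i;

    if (count < 0 || count > 64)
        return 0;
    for (i = 0; i < 64; i++) {
        src[i] = (short)(i * 3 + 1);
        a[i] = b[i] = 0;
    }
    copy_conventional(a, src, count);
    copy_duff(b, src, count);
    return memcmp(a, b, sizeof a) == 0
        && memcmp(a, src, count * sizeof(short)) == 0;
}
```

Calling copies_agree() with counts such as 0, 1, 8 and 29 exercises the
empty case, the remainder-only case, an exact multiple of 8, and a
mixed case.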


