Explanation, please!

Tue Aug 30 14:31:27 AEST 1988

In article <2877 at ttrdc.UUCP> levy at ttrdc.UUCP (Daniel R. Levy) writes:
>In article <653 at paris.ICS.UCI.EDU>, schmidt at bonnie.ics.uci.edu (Douglas C. Schmidt) writes:
>>    Since I posted my original question there has been a great deal of
>> abstract discussion about the relative merits of the loop unrolling
>> scheme.  The topic has piqued my curiosity, so I went ahead and
>> implemented a short test program, included below, to test Duff's
>> device against the ``ordinary for loop w/index variable'' technique.
>> See for yourself....   
>> 
>> After some quick testing I found that gcc 1.26 -O on a Sun 3 and a
>> Sequent Balance was pretty heavily in favor of the regular (non-Duff)
>> loop.  Your mileage may vary.  I realize that there may be other
>> tests, and if anyone has a better version, I'd like to see it!
>
>I modified this program to run under System V, changed the arrays to be
>dynamically allocated, and changed both the Duff and ordinary copies to
>use register pointers instead of global pointers (for the Duff copy) and
>array indexing (for the ordinary copy).  I then tried it on an SVR2 3B20,
>an SVR3 3B2, a Sun-3, and a Sun-4, both with and without -O optimization
>(using the standard pcc-type C compiler on each system).  The result?
>Duff wins by about 10%-20% on all machines tested.

I then added a piece to the program to use 'memcpy'.  The results?
Duff beats a simple loop by 10%.  'memcpy' is 9 times faster than
Duff.  So why do people spend so much time avoiding standard subroutines?
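
For anyone who has not seen the three variants side by side, here is a
rough sketch of what is being compared.  This is not the test program
from the earlier articles (that one is not reproduced here); the names
copy_loop, copy_duff, copy_memcpy and copies_agree are made up for
illustration, and the sketch only checks that the three copies agree.
It makes no timing claims.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Ordinary copy: one byte per iteration through two pointers. */
void copy_loop(char *to, const char *from, size_t count)
{
    while (count-- > 0)
        *to++ = *from++;
}

/* Duff's device, unrolled 8 times.  The switch jumps into the middle
 * of the do-while to handle the count % 8 leftover bytes first.
 * The guard is needed because the unrolled body would otherwise copy
 * 8 bytes when count is 0. */
void copy_duff(char *to, const char *from, size_t count)
{
    size_t n;

    if (count == 0)
        return;
    n = (count + 7) / 8;            /* number of passes through the loop */
    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}

/* The standard subroutine.  Buffers must not overlap. */
void copy_memcpy(char *to, const char *from, size_t count)
{
    memcpy(to, from, count);
}

/* Hypothetical helper: returns 1 if all three copies produce the same
 * destination buffer for the given count (at most 64 bytes). */
int copies_agree(size_t count)
{
    enum { MAX_TEST = 64 };
    char src[MAX_TEST], a[MAX_TEST], b[MAX_TEST], c[MAX_TEST];
    size_t i;

    assert(count <= MAX_TEST);
    for (i = 0; i < MAX_TEST; i++)
        src[i] = (char)(i * 7 + 1);
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);
    memset(c, 0, sizeof c);

    copy_loop(a, src, count);
    copy_duff(b, src, count);
    copy_memcpy(c, src, count);

    /* Compare whole buffers so a copy that writes too much also fails. */
    return memcmp(a, b, sizeof a) == 0
        && memcmp(a, c, sizeof a) == 0
        && memcmp(a, src, count) == 0;
}
```

Calling copies_agree with counts on either side of a multiple of 8
(say 7, 8 and 9) exercises the jump into the unrolled loop.  How the
three compare for speed depends on the compiler and on how the library
implements memcpy, so time them on your own machine.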

-- Chuck