Loop unfolding

Rik Littlefield rik at june.cs.washington.edu
Thu Sep 1 11:03:05 AEST 1988


All of the examples of loop unfolding recently discussed on the net have
implemented nothing more than copying.  Several authors have suggested
replacing the unfolded loop with a (pseudo-) standard routine like
'memcpy', or declaring large structures for which the compiler can
generate good block-move code.

I have seen at least three cases where loop unfolding was very productive
but neither of the above suggestions seems to apply.  All were in time-
critical production applications.

   1. For an ultrasonic inspection program, the inner loop contained a
      summation along the lines of

         s += *p++;

   2. In an image processing program, the inner loop was an indexed move
      like

         *p++ = *(*q++);

   3. A driver for a memory-mapped I/O device used multiple stores
      into a single address:

         *q = *p++;
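
To make the first case concrete, here is a minimal sketch of the
summation loop unfolded by four.  It is only an illustration, not the
code from the inspection program: the function name, the int sample
type, the long accumulator, and the unfolding factor of four are all
assumptions.

```c
#include <stddef.h>

/* Hypothetical helper: sum n samples starting at p, unfolded by four.
 * The element type and accumulator width are assumptions. */
long sum_unrolled(const int *p, size_t n)
{
    long s = 0;
    size_t blocks = n / 4;  /* number of full groups of four */
    size_t rest = n % 4;    /* leftover elements after the groups */

    /* One loop test and branch per four additions. */
    while (blocks-- > 0) {
        s += *p++;
        s += *p++;
        s += *p++;
        s += *p++;
    }

    /* Pick up the zero to three elements that did not fill a group. */
    while (rest-- > 0)
        s += *p++;

    return s;
}
```

The body still does one add per element; only the count test and
branch are amortized over four elements.  No library routine such as
'memcpy' can stand in here, because the loop computes a value rather
than moving data.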

As I said, unfolding (unrolling) the loop was a very effective way of
removing virtually all of the loop overhead in these cases.  Can anyone
suggest other solutions analogous to the alternatives mentioned above,
or for that matter, any better solution short of assembly language?
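
For the memory-mapped device case, a minimal sketch of the store loop
unfolded by four might look like the following.  The function name,
the byte-wide register, and the factor of four are assumptions made
for illustration; the volatile qualifier is also an assumption, added
so that a modern compiler does not merge the repeated stores to the
same address.

```c
#include <stddef.h>

/* Hypothetical helper: write n bytes from p to a single device
 * register at q.  The destination address never advances. */
void write_port_unrolled(volatile unsigned char *q,
                         const unsigned char *p, size_t n)
{
    size_t blocks = n / 4;  /* number of full groups of four */
    size_t rest = n % 4;    /* leftover bytes after the groups */

    /* Four stores to the same address per loop test. */
    while (blocks-- > 0) {
        *q = *p++;
        *q = *p++;
        *q = *p++;
        *q = *p++;
    }

    /* Write the zero to three bytes that did not fill a group. */
    while (rest-- > 0)
        *q = *p++;
}
```

A block-copy routine does not fit this loop, since it would advance
the destination pointer along with the source.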

--Rik



More information about the Comp.lang.c mailing list