Efficient coding considered harmful?

Paul Schmidt pjs269 at tijc02.UUCP
Tue Nov 1 06:43:52 AEST 1988


    After working for a year to optimize a DBMS I
have some comments on writing efficient code.

    "More harm has been done for the sake of
    efficiency than anything else."

    When I was optimizing I found some horrendous
"efficient coding" practices that made the code
less manageable, less efficient, or both.

    Take, for example, my favorite:

some_function(a_variable)
short a_variable;

    The coder (who was inexperienced in C) wanted
to optimize the space needed to save the parameter
passed to the function.  This may actually add
memory and time, because the argument is passed
as an int and must then be converted to a short.
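
    A minimal sketch of why the short buys nothing,
assuming a typical compiler.  The names add_one_short,
add_one_int and short_sum_size are hypothetical and
do not come from the DBMS.  Arithmetic on a short is
carried out in int, so the narrow parameter only adds
conversions:

```c
#include <stddef.h>

/* Hypothetical stand-ins for some_function. */
int add_one_short(short v)
{
    /* v is promoted to int before the addition */
    return v + 1;
}

int add_one_int(int v)
{
    /* no conversion needed: already an int */
    return v + 1;
}

/* Size of the type actually used when two shorts
   are added: the size of int, not of short. */
size_t short_sum_size(void)
{
    short a = 1, b = 2;
    return sizeof(a + b);
}
```

Both functions return the same result; the short
version just asks the compiler for extra narrowing
and widening along the way.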

    The worst violation of coding for efficiency
was done in assembler.  The person set a condition
bit inside a subroutine.  After the return the bit
was used in a conditional jump.  Of course, another
programmer saw the subroutine, couldn't understand
why there was an unneeded operation in it, and
removed it.
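
    The same trap can be sketched in C.  This is
only an analogy to the assembler case, and every
name in it (lookup_failed, lookup_hidden,
lookup_explicit) is hypothetical.  A flag set as a
side effect looks like dead code inside the routine;
a return value makes the dependency visible:

```c
/* Hidden "condition bit": nothing in the routine
   below says that a caller depends on it. */
int lookup_failed;

void lookup_hidden(int key)
{
    /* looks removable when read in isolation */
    lookup_failed = (key < 0);
}

/* The status is part of the interface, so a
   maintainer can see that callers rely on it. */
int lookup_explicit(int key)
{
    return key < 0 ? -1 : 0;
}
```

A caller of lookup_explicit tests the return value
directly, so nobody is tempted to delete the
assignment that produces it.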

    Coding takes only about 10% of the time spent
on a project, while the maintenance phase takes
around 50%.  If the coders write the most readable
code for maintenance, the entire project cost
can be reduced.

    But there is still a need for optimization.
This should be done after the code is written and
working.  Why?  Because the amount of time spent
in each code segment varies widely.  There is no
reason to optimize the initialization routines if
they are only run once and are fairly fast
already.

    Using prof(1) under UNIX I have always been
surprised at where the time is spent in a given
program.  The profile shows which routines need
to be optimized.  Using a benchmark it was easy
to see that only 10% of the routines accounted
for 90% of the run time.  Some of the results
showed obvious duplication of calculations that
were easy to eliminate.  Instead of trying to
find them by hand, we let the computer show us
where they were.

    After fixing the obvious problems, we made
many low-level optimizations.  Some calculated
certain variables once and stored them as
globals; others declared certain variables as
register.  At one point it became obvious that
the semaphore routines supplied by UNIX took
25-50% of the total time for a database
retrieve.  (This was solved by establishing
ownership of relations, which removed the need
to call the semaphore routines.)  All through
the optimization process we were aware of which
code was most important to optimize so we
could, as our boss always put it, "Get the
biggest bang for the buck."
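
    As an illustration of calculating a value once
instead of repeatedly, here is a small sketch.  It
is not code from the DBMS: count_char_slow and
count_char_fast are hypothetical names, and the
saved value is a local rather than a global.

```c
#include <stddef.h>
#include <string.h>

/* Recomputes strlen(s) on every pass of the loop. */
size_t count_char_slow(const char *s, char c)
{
    size_t i, n = 0;
    for (i = 0; i < strlen(s); i++)
        if (s[i] == c)
            n++;
    return n;
}

/* Computes the length once and reuses it. */
size_t count_char_fast(const char *s, char c)
{
    size_t i, n = 0;
    size_t len = strlen(s);
    for (i = 0; i < len; i++)
        if (s[i] == c)
            n++;
    return n;
}
```

Both return the same count.  Whether the change is
worth making is exactly what the profile should
decide.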

    For less experienced C programmers, try
running prof on a program and see which routines
are actually taking the most time.  Prof will
order the output from the routine that takes the
most time to the one that takes the least, and
give the percentage of time spent in each
routine.  I copied this prof
output from July 87, 1987, p 588, on profilers:

%time cumsecs #call ms/call name
 82.7    4.77               _sqrt
  4.5    5.03   999    0.26 _prime
  4.3    5.28  5456    0.05 _root
  2.6    5.43               _frexp
  1.4    5.51               __doprnt
  1.2    5.57               _write
  ...

This is for a program to compute prime numbers:

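#include <stdio.h>
#include <math.h>   /* declares sqrt() as double */

/* Integer square root of n, via floating point. */
root(n)
int n;
{ return (int) sqrt((float) n); }

/* Return 1 if n is prime, 0 otherwise. */
prime(n)
int n;
{   int i;
    /* root(n) is re-evaluated on every iteration */
    for (i = 2; i <= root(n); i++)
        if (n % i == 0)
            return 0;
    return 1;
}

/* Print the primes up to 1000. */
main()
{   int i, n;
    n = 1000;
    for (i = 2; i <= n; i++)
        if (prime(i))
            printf("%d\n", i);
    return 0;
}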

It is interesting to see that the square root
calculation takes this much of the time, and it
is not even needed to find primes.  It was
probably an "optimization" to make the search
for primes quicker.
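
    One way to drop the square root, offered as a
sketch and not as the fix from the article the
listing came from: stop the trial division when i
exceeds n / i, which is the same bound in integer
arithmetic.  The names is_prime and count_primes
are hypothetical.

```c
/* Return 1 if n is prime, 0 otherwise.
   i <= n / i is equivalent to i * i <= n but
   cannot overflow, and needs no call to sqrt(). */
int is_prime(int n)
{
    int i;
    if (n < 2)
        return 0;
    for (i = 2; i <= n / i; i++)
        if (n % i == 0)
            return 0;
    return 1;
}

/* Count the primes in the range 2..limit. */
int count_primes(int limit)
{
    int i, count = 0;
    for (i = 2; i <= limit; i++)
        if (is_prime(i))
            count++;
    return count;
}
```

The loop now pays for one integer division per
iteration, so it is worth running prof again to
confirm that the change actually helps.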

    In conclusion, I would like to stress that
readability for the maintenance phase should
outweigh the importance of optimizing code as
it is written.  Easy-to-read code is easier to
maintain, and easier to optimize.

    Paul Schmidt
    Texas Instruments
    PO Drawer 1255, MS 3517
    Johnson City, TN 37605-1255

    mcnc!rti!tijc02!pjs269


