Efficient coding considered harmful?

T. William Wells bill at twwells.uucp
Wed Nov 2 07:30:49 AEST 1988


In article <8775 at smoke.BRL.MIL> gwyn at brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
: In article <119 at twwells.uucp> bill at twwells.UUCP (T. William Wells) writes:
: >It's the damnedest thing, but people whose opinions I otherwise
: >respect seem to have this thing about coding efficiently.  They don't
: >like it. Worse, they discourage it.
:
: I think it would be more accurate to say that we want to discourage
: excessive concern with microefficiency to the detriment of other
: important attributes of source code.

Put that way, it is difficult to argue the other side.  Obviously,
it is not good to write "efficient" code that is not understandable.
I recall writing, early in my programming career, a few lines of
assembly code that did a real neat (and efficient) trick.
Unfortunately, this code was sufficiently tricky that when it came
time to make a minor change to it, I could no longer figure out what
it did! Of course, it got rewritten.

This excessive "efficiency" is unarguably bad. However,
anti-efficiency types (or so they appear) have taken this too far in
the other direction. The kind of thing I do is mostly attention to
detail; the algorithm remains the same, and the broad structure of the
program remains the same. What is different are the coding details.

In fact, what we are talking about is mostly a matter of "style".
However, style does not exist in a vacuum; one chooses a particular
style because it serves some purpose(s). Certainly, a style must be
clear: this implies a consistent method of expressing algorithms
coupled with an eye for laying out the code on the page.  Certainly a
style must be portable: this means leaving out things that don't go
wherever the program might be ported. And just as certainly, a style
should get the job done as efficiently as possible: this means
choosing the details of the coding so that they are as likely as
possible to be efficient.

These goals aren't entirely consistent; that implies that a coder is
going to have to make a trade-off whose details will depend on the
circumstances.

: >2) "What is more efficient on one machine may well be less efficient
: >   on another." Well, this is a good argument for knowing a lot of
: >   different machines and adapting one's coding style to the relevant
: >   machines, ..
:
: Actually the force of this argument is stronger if you already
: appreciate the value of coding for reasonable behavior on almost
: any C implementation, rather than "optimum" behavior on just one.

Actually, one of the nicenesses of C programming is that reasonable
behaviour is very easy to get: most C operations have a small cost.
However, they do not have equal costs, and sometimes it is reasonable
to make generalizations of the form: "On machines X where efficiency
is of greater concern to me, doing it by method A is usually more
efficient than doing it by B." When such generalizations are possible,
one should take advantage of them.

: Some people really may never port their code outside the first
: system it's compiled for, but many of us do and we simply don't
: have time to fine-tune the code in each environment (unless we
: will personally accrue enough benefit to justify it).  My own
: time is worth much more than any computer's.

Some people have their code ported damn near everywhere. For example,
the spelling software I was responsible for a few years ago runs on
everything from 8086's to IBM-370's.  (This is not to say that there
weren't portability problems: e.g., we had to write a program to
preprocess character strings into \nnn so that it would run on
non-ASCII machines. This was a case where clarity conflicted with
portability.)

However, it was not hard to figure out a reasonable set of
generalizations and then apply them; after all, on which machines
would it matter most how fast the program is: the micros with an
impatient user waiting in front of them, or the mainframes with lots
of horsepower? Of course, I coded it with efficiency on the micros as
the greater concern.

: >3) "Efficient coding makes obscurer programs."
:
: Certainly going overboard with microefficiency tweaks does.

Moderation in all things, especially moderation! :-)

: >     for (i = 0; i < a[j]; ++i) ...
: >into
: >     for (i = 0, ilast = a[j]; i < ilast; ++i) ...
:
: In most cases, the following can be used:
:       for (i = a[j]; --i >= 0; )
: which is more efficient and clearer.

Right. And I use the latter whenever possible; however, it does not
preserve the semantics of the original loop. (I just realized that I
probably should have added a caveat that the transform is invalid if
j is changed in the loop. This applies to both versions, though.)

: (But don't write something like
:       for (p = &a[j]; --p >= a; )
: which is nonportable.)

I'll remember! :-) Eh, Chris?

: >     for (i = 0; i < ALAST; ++i) {
: >             ... A[i];
: >     }
: >
: >     for (ap = A; ap < &A[ALAST]; ++ap) {
: >             ... *ap;
: >     }
: >
: >I write it this second way, as a matter of course, because I thought
: >about it for a while and decided that this was the appropriate way to
: >do it.  I then trained myself into the habit of doing it this way.
:
: This makes a good example of the dangers of overconcern with such
: matters.  There are quite a few compilers that, in many cases, will
: generate much better code for the first method than for the second,
: because they can recognize an opportunity to use vector instructions.
: Others will generate essentially the same code for the two methods.

Why did you say this? I thought I had made it clear that I am well
aware of the differences in compilers and the systems they run on.
I thought I had made it clear that different situations require
different choices in one's code.  However, it appears that I have
failed to communicate. How?  Until I understand where the failure of
communications is, I'll let this one drop.

: Unless the situation is naturally thought of as involving a pointer
: (for example, to a structure member of an array), it may be clearer
: to treat array elements as array elements.

"Naturally?" At least for me, a pointer into an array is neither more
nor less "natural" than an index into the array; I'll use either, as
the situation requires.

: >... dividing positive numbers by powers of two should be done with a shift.
:
: No!  Juggling the source code to use bit-oriented operations in an
: arithmetic context not only makes the code less portable and harder
: to maintain, it can also lead to future errors, when for example a
: chunk size is changed to a variable rather than a constant, or to a
: non-power-of-two.

Less portable?

	If a compiler doesn't take my n >> m (n >= 0, m < 16) and do
	the same thing as when dividing it by 2**m, the compiler is
	broken.  Now, there might be some compilers that are broken
	that way, but that makes it an individual decision as to
	whether one panders to the frailties of those compilers. For
	the record, the spelling code I earlier mentioned does this in
	a few places and I've never heard of a problem relating to it.

Harder to maintain?

	There is something there: if one has a number n, one has to
	have a second number representing log2(n); that is a cost.
	But that is a minor one.

	There is also the fact that if the number ceases to be a
	power of two, all references to the log have to be changed,
	but that is mechanical and so another minor cost.

Lead to future errors?

	Only when done wrong. When I do it, it gets done this way:

	#define BLOCK_BITS 9
	#define BLOCK_SIZE (1 << BLOCK_BITS)

	This is a clear signal (which can be reinforced by a comment)
	that BLOCK_SIZE is supposed to be a power of two.  Changing
	it to a variable shouldn't cause any problems: all one needs
	to do is introduce two variables.

	If the size becomes other than a power of two, this is easily
	enough dealt with: it is clear that the program is depending
	on the existence of BLOCK_BITS; an instant's thought will
	tell the maintainer that he has got to do something about all
	references to BLOCK_BITS; such will certainly include all the
	>> BLOCK_BITS stuff.

So, I disagree.

: >Unless, of course, the programmer remembered to COMMENT.
:
: Comments don't necessarily track the code,

Let's see: "Comments don't necessarily track the code, therefore you
shouldn't do something (coding `tricks') where that tracking is
important." Sounds more like an argument for more care in commenting,
not an argument against coding "tricks".  Failures in a system
caused by carelessness on the part of its users are arguments for
teaching the users how to better use the system, not arguments for
discouraging particular uses of it. Or even better, for improving
the system (but that is a completely different can of worms. :-)

:                                            and it is unrealistic to
: expect every use of the kind of tricks you advocate to be commented.

Can you say "straw man"?

It is not usually necessary to comment such "tricks". There is
nothing about a counted-down for loop that requires a comment.  There
is nothing about a redundant calculation that gets eliminated that
requires a comment.

There are situations where a comment is desirable; however, even
better is a method like the one I showed for replacing division by
shifting: when the code is changed so that the trick doesn't work,
one is forced to fix the places where the trick is used.

---
Bill
{uunet|novavax}!proxftl!twwells!bill


