Efficient coding considered harmful?

Wed Nov 2 10:53:41 AEST 1988

One of the things about Usenet that is very unfortunate is that mere
agreement isn't worth commenting on (and wastes bandwidth).  So,
while some of my replies are a bit harsh, do keep in mind that there
is also a lot which I agree with in that posting.

In article <7700 at bloom-beacon.MIT.EDU> scs at adam.pika.mit.edu (Steve Summit) writes:
: In article <119 at twwells.uucp> bill at twwells.UUCP (T. William Wells) writes:
: >It's the damnedest thing, but people whose opinions I otherwise
: >respect seem to have this thing about coding efficiently.  They don't
: >like it. Worse, they discourage it.
:
: Some time ago I came to the conclusion that the style vs.
: efficiency debate is another religious war, which I usually try
: to stay out of.  Good style and, I'll admit it, anti-efficiency
: is a personal crusade, though, so I can't help but respond.

But I think you mostly missed my point. One could approximate it
with: "Efficient code IS a matter of style." Most of what I advocate
is choosing a consistent style that is also reasonably efficient.
While you seem to agree with this, you also directed much of your
discussion to code-tweaking, a thing which I agree should be done
after one has discovered a need for greater efficiency.

These are two separate issues, and I wish you had addressed the
former and not the latter.

: None of this should be taken as a personal attack against Mr.
: Wells, or anyone else, for that matter.  His points are well
: taken; I simply have a different point of view, going back to
: first principles, as all good religious wars do.

Perhaps we should discuss those "first principles". I had been
composing an article on them, but I decided that I didn't really
have the time for it. Anyway, here are mine:

    1) Programming is a *utilitarian* art. If the program doesn't do
       the intended job, it is a bad program. This is the overriding
       standard by which one judges a program.

       Portability ranks up here, as a program is portable because it
       can do the job when moved to another machine.  But because
       portability ranks here, there is also a limit on meaningful
       portability: if the program isn't going to be ported then who
       cares if it can be ported?

    2) Programs have to be read and maintained by humans.  This
       implies modularity and a clean and consistent coding style,
       and for procedural programs, a structured program.  This is a
       secondary standard. This is important because it makes it
       easier to fix bugs and to add to the jobs the program can do.

    3) Programs should consume as little resource as possible.
       Programs that consume excess resources waste resources of the
       user, and may also impact other users of a multi-user system.
       In the extreme case, an inefficient program can be unusable.
       This too is a secondary standard.

    4) Programming is a utilitarian *art*.  Be that as it may, and
       while this may contribute to improving the program according
       to the other standards, this is less important than all other
       considerations.  Esthetics is just not that important compared
       to efficiency or maintainability, unless it contributes to one
       or the other.

Any real disagreement?

: >I advocate training oneself to be one who codes efficient programs. I
: >do not mean writing code and then making it more efficient, I mean
: >writing it that way in the first place.
:
: Here I agree with at least some of the words, but not the
: implications.  It happens that I usually write nice, efficient
: programs, but not by resorting to microefficiency tweaks or by
: sacrificing readability in any way.  It is important to employ
: lots of common sense (and, of course, "common sense isn't"); to
: shy away from grotesquely inefficient algorithms; and to partition
: the code so that key sections can be cleanly replaced later if
: efficiency does prove to be an issue.  It isn't important to
: write every last expression and statement in the most
: theoretically blistering way possible.

But who said anything about "ideal"? Not me!  What I want is enough
attention to detail so that the code is reasonably efficient to start
with. A side benefit of this attention to detail is that one often
discovers bugs in the code, thus saving later pain.

: >3) "Efficient coding makes obscurer programs." Well, since most
: >   efficient coding involves minor changes from the inefficient way
: >   of doing things, changes which are mostly a matter of style rather
: >   than of organization...
:
: I don't know what is meant by "most efficient coding."  The
: interpretation "the most efficient coding involves the most minor
: changes" is probably not the one that was intended, although I
: like it, because taken to its extreme it says not to make any
: changes at all.

Oh dear, a semantic problem.  Part of it is in the use of the verb
"code" which, to me, refers to the last phase of writing a program:
the transcription of what you want to do into something the computer
understands. ("Writing a program" is the phase of getting the
damn thing out just before "making it work".) So, to me, coding
doesn't have much latitude: it presumes that the overall structure
of the program has already been decided, that the major data
structures are mostly decided, etc.

The other part is "changes". This is intended as a comparative, not
as implying that one should actually change one's code in conformance
with my idea of efficient coding. To repeat, one should write it that
way in the first place.

:                  I cannot agree with the more likely
: interpretation.  The minor changes, the replacements of
: multiplies and divides by shifts and the like, are mostly noise
: (both with respect to efficiency and style).

The evidence is against you there: 10-25% improvements are not
"noise".

:                                               The really big
: efficiency hacks, the ones people spend weeks and months on, do
: involve massive organizational changes and wholesale concessions
: to readability and maintainability.

I am not talking about such things, nor do I advocate doing such,
except when the program isn't fast enough.

: >   ...this argument should really be read: "*I*
: >   don't like to or find it hard to read that kind of program, so it
: >   is unclear."
:
: The attitude of most C experts is "*I* don't have any problem
: reading this code; anyone else who considers himself a Real C
: Programmer shouldn't, either."  This attitude is patronizing and
: wrong.

Cut the BS. While I have beliefs that could be approximated by your
nasty remark, they are not the same. You apparently think that it is
proper for a programmer to constrain his style to one that poorly
trained programmers will have no trouble with. This kind of attitude
is not useful. Rather, when we encounter a programmer who can't
understand a good coding style, we should make sure that he gets
taught. That is what we do where I work and so that is how I will
code when working here. As for my personal stuff, I don't care that
the untrained of the world can't read it.

Call that patronizing if you will. I call it having high standards.

:         To borrow a previous argument, "We have to code in the
: real world, not the world of make-believe."  There are plenty of
: people out there who are still learning C,

So teach them right!

:                                            and since C has become
: as popular as it has and C programmers are in demand, many of
: them are working for real companies writing (and maintaining)
: real programs.

Hmmmm. Most programmers come into the real world completely unable to
program. All they have is a little knowledge, and very little
experience. Since they can't write programs well, we should be very
careful that they don't see examples of good programs, since they
might have problems reading them.  That way they won't have to learn
how to write such programs. :-) sort of.

: >Avoid excess function calls. A lot of people have been brainwashed by
: >some modularity gurus into believing that every identifiable "whole"
: >should have its own function. What I have found is that this makes
: >the program LESS maintainable than otherwise.
:
: Here's another religious debate: the structured programming one.

I think you missed the point. Structured programming is essential to
good procedural programming. I am saying that excessive partitioning
not only makes the code less efficient but is actually contrary to the
purposes for which structured programming exists.  So avoiding it is
a win all around.

: There are a few applications (and, some believe, a few
: architectures) for which function call overhead is significant,
: but they are infrequent, and in general there should be
: absolutely no stigma attached to a function call.

My particular comment was directed at the following kind of
programming: (This was a real program. Only the names have been
changed....)

string_diddling_function(in, out)
char    *in;
char    *out;
{
	char    temp[MAXSTRING];

	/* comment 1 */

	diddle_1(in, temp);

	/* comment 2 */

	diddle_2(temp, out);

	/* comment 3 */

	diddle_3(out, temp);

	/* comment 4 */

	diddle_4(temp, out);
}

/* Each diddle had about this degree of complexity: */

diddle_1(in, out)
char    *in;
char    *out;
{
	while (*in) {
		if (*in == '`') {
			++in;
		} else {
			*out++ = *in++;
		}
	}
	*out = 0;
}

The whole thing was unreadable!

Not only that, but 40% of one program (patgen) was spent in just this
set of routines.  Ugh!
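
As a rough sketch of the alternative: if each pass does nothing more
than drop one kind of character, the four passes and the temporary
buffer can collapse into one loop. The posting does not say what
diddle_2 through diddle_4 did, so the character set DROP_CHARS and the
name strip_chars are assumptions made up for illustration.

```c
#include <string.h>

/* Hypothetical: assume the four passes each dropped one character
   from this set (backquote, quote, double quote, backslash). */
#define DROP_CHARS "`'\"\\"

/* Copy in to out in a single pass, skipping any character found in
   DROP_CHARS. out must have room for strlen(in) + 1 bytes. */
void
strip_chars(const char *in, char *out)
{
	for (; *in; ++in) {
		if (!strchr(DROP_CHARS, *in)) {
			*out++ = *in;
		}
	}
	*out = 0;
}
```

One loop, one function call, and no MAXSTRING temporary; whether this
is also easier to read than four tiny functions is the point under
debate.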

: >:     o   Avoid bit fields, most especially signed ones.
: >Try: don't use them at all, they tend to be nonportable.
:
: This is a side question: what's so unportable about bitfields?

Signedness of the bit fields, for one thing. As I understand it,
compiler writers have chosen to implement them as either signed or
unsigned according to their own whim.  Also, it used to be the case
that a number of compilers either didn't implement them or did them
incorrectly. It may still be.

That, plus the frequent relative inefficiency of these compared to
do-it-yourself bit fields, makes them undesirable.
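
A minimal sketch of what "do-it-yourself bit fields" might look like:
masks and shifts on an unsigned word, so that signedness never comes
into it. The layout (a one-bit flag plus a three-bit field) and all
of the names are hypothetical.

```c
/* Hypothetical layout of a flags word: bit 0 is a "dirty" flag and
   bits 1-3 hold a 3-bit "kind". */
#define F_DIRTY      0x01u
#define F_KIND_SHIFT 1
#define F_KIND_MASK  (0x07u << F_KIND_SHIFT)

/* Return the 3-bit kind stored in flags. */
unsigned
get_kind(unsigned flags)
{
	return (flags & F_KIND_MASK) >> F_KIND_SHIFT;
}

/* Return flags with the kind replaced by the low three bits of kind;
   all other bits are left alone. */
unsigned
set_kind(unsigned flags, unsigned kind)
{
	return (flags & ~F_KIND_MASK)
	    | ((kind << F_KIND_SHIFT) & F_KIND_MASK);
}
```

Because everything is unsigned, the extracted value is never
sign-extended, whatever the compiler does with real bit fields.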

: >:     o   Use realloc instead of malloc/free pairs. Use calloc instead of
: >:         malloc followed by zeroing each member.
: >Efficient memory allocation is *much* more difficult than this, what
: >you really need to do is to cut down on the number of calls to malloc
: >and free. Of course, that usually means writing some kind of
: >allocation stuff yourself, never an easy task.
:
: Please don't avoid malloc; the alternative is generally
: fixed-size arrays and "lines longer than 512 characters are
: silently truncated."  Please don't roll your own allocation
: routines, again unless you have to; this is the kind of low-level
: dirty work, hard enough to do well, that you should let the lower
: levels (i.e. the standard implementations of malloc) give it
: their best shot.

My original understanding was that someone was advocating something
like changing:

	thing1 = malloc(size1);
	...finish using thing1
	free(thing1);
	thing2 = malloc(size2);
into:
	thing1 = malloc(size1);
	...finish using thing1
	thing2 = realloc(thing1, size2);

I now suspect that I am the only person who read it that way, so most
of what I said is irrelevant. However, what I said about rolling your
own memory allocation still stands, but let me clarify this.

I don't mean writing a malloc replacement, but rather an interface to
malloc. One should always write such an interface, in order to handle
memory allocation failures, unless one wants to check the return
value of each malloc call. Here is an (untested) example of what I
mean:

#include <stdio.h>
#include <stdlib.h>

typedef struct MYSTRUCT {
	struct MYSTRUCT *my_next; /* a general link */
	char    *my_field1;
	float   *my_field2;
} MYSTRUCT;

MYSTRUCT *Free_mystruct;        /* A list of currently unused MYSTRUCT's */

/* Everyone gets to malloc through here. It tries to malloc, but if
   that fails, frees whatever is on the free list(s) and then tries
   again. Repeated failure causes the program to exit. */

void *
mymalloc(size)
size_t  size;
{
	void    *ptr;

	while (!(ptr = malloc(size))) {
		if (!Free_mystruct) {
			fprintf(stderr, "You lose: out of memory!\n");
			exit(1);
		}
		while (ptr = Free_mystruct) {
			Free_mystruct = ((MYSTRUCT *)ptr)->my_next;
			free(ptr);
		}
	}
	return (ptr);
}

/* Call here whenever you want a fresh MYSTRUCT. */

MYSTRUCT *
alloc_mystruct()
{
	register MYSTRUCT *ptr;

	if (ptr = Free_mystruct) {
		Free_mystruct = ptr->my_next;
	} else {
		ptr = mymalloc(sizeof(MYSTRUCT));
	}
	ptr->my_next = 0;
	ptr->my_field1 = 0;
	ptr->my_field2 = 0;
	return (ptr);
}
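
The free list only cuts down on malloc and free calls if finished
structures are put back on it. The example above does not show that
half, so the following is a guess at its shape; the name
release_mystruct is hypothetical, and the structure and list are
repeated only to keep the sketch self-contained.

```c
typedef struct MYSTRUCT {
	struct MYSTRUCT *my_next; /* a general link */
	char    *my_field1;
	float   *my_field2;
} MYSTRUCT;

MYSTRUCT *Free_mystruct;        /* the same free list as above */

/* Hypothetical counterpart to alloc_mystruct(): instead of calling
   free(), push the structure onto the free list for later reuse.
   A null pointer is ignored. */
void
release_mystruct(MYSTRUCT *ptr)
{
	if (ptr) {
		ptr->my_next = Free_mystruct;
		Free_mystruct = ptr;
	}
}
```

Anything the fields point to would have to be dealt with by the
caller before the structure is released.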

Now, I can already hear the screams of "Unclean, unclean!!!!" from
those who don't like my style of coding. Let's save bandwidth and not
flame, OK?

:                      Why does calloc exist?  Of what use is it?
: Why has ANSI legitimized it?  In principle, it is equally useless
: for clearing structures that contain floating-point fields.

Oh yes. I forgot about floating point. And I suppose the reason that
ANSI left it around is the number of existing programs that use it.
Me, I never have and never will.
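
For what it's worth, a sketch of the portable alternative: calloc
fills memory with all-bits-zero, which C does not promise is a valid
0.0 or a null pointer, so each member is assigned explicitly. The
structure and the name new_sample are made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical structure with a floating-point and a pointer member. */
struct sample {
	double  weight;
	char    *label;
};

/* Allocate one struct sample and initialize each member by
   assignment, rather than relying on an all-bits-zero fill.
   Returns NULL if malloc fails. */
struct sample *
new_sample(void)
{
	struct sample *s = malloc(sizeof *s);

	if (s) {
		s->weight = 0.0;
		s->label = NULL;
	}
	return s;
}
```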

: >Unless, of course, the programmer remembered to COMMENT.
:
: If the code reads
:
:       a &= 3;         /* really a %= 4 */
: or
:       a &= 3;         /* really a %= HASHSIZE */
:
: and I do a global replace of 4, or re#define HASHSIZE, the
: comment may not help.

Yes, but writing an explicit constant is bad to start off with. It
should be:

#define HASHSIZE 4              /* A power of two, or else! */

	a &= HASHSIZE - 1;
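
A slightly fuller sketch, assuming one wants the compiler rather than
a comment to enforce the "or else". The typedef trick and the name
hash_index are additions for illustration, not anything from the
original discussion.

```c
#define HASHSIZE 4              /* A power of two, or else! */

/* Hypothetical compile-time check: the array size becomes -1, and
   compilation fails, unless HASHSIZE is a nonzero power of two. */
typedef char hashsize_is_power_of_two[
    (HASHSIZE > 0 && (HASHSIZE & (HASHSIZE - 1)) == 0) ? 1 : -1];

/* Reduce a hash value to a table index. For unsigned h this gives
   the same result as h % HASHSIZE. */
unsigned
hash_index(unsigned h)
{
	return h & (HASHSIZE - 1);
}
```

The value is unsigned on purpose: for a negative signed value the
mask and the remainder need not agree.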

:              find code size far more frequently relevant in
: practice (among other things, because of the aforementioned
: pdp11).  Remember that the classic tradeoff is between time and
: space, so the fancy efficiency hacks often make the code larger.

I believe that the "classic trade off" has almost nothing to do with
coding. Except when using data to replace code, faster code is
usually smaller and smaller code is usually faster.  This is good
news for coders!
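
The exception, using data to replace code, is where the classic
trade-off does show up. A small sketch, with hypothetical names: a
16-entry table costs a few bytes of data and lets a bit count proceed
four bits per step instead of one.

```c
/* Number of 1 bits in each 4-bit value 0..15. */
static const unsigned char nibble_bits[16] = {
	0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

/* Count the 1 bits in x, four bits at a time, by table lookup. */
unsigned
count_bits(unsigned x)
{
	unsigned n = 0;

	while (x) {
		n += nibble_bits[x & 0x0Fu];
		x >>= 4;
	}
	return n;
}
```

A 256-entry table would halve the number of steps again at the price
of more space, which is the trade-off in miniature.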

: I might also point out that the marketplace generally rewards the
: product that comes out first, so if you can compress your
: development schedule by using clean, straightforward design and
: eschewing time-consuming and bug-prone efficiency tweaking,

There you go again, attacking the wrong issue. Grrrrrr.

: While these considerations are real, they should be viewed as
: unfortunate and temporary limitations which will be resolved,
: hopefully soon enough, at the root level, and not something which
: we resign ourselves to working around forever, and which we get
: so used to working around that the workarounds become part of the
: language, persisting well beyond their need.  These are what Fred
: Brooks calls "accidental problems," and every programmer minute
: and line of code spent catering to them is not being spent on
: real problems.

Keep dreaming. It is still the case that our customers don't like our
products when they consume over 64K, even though many (most?) machines
have *lots* more memory. It will always be the case that there is
never enough to go around.

: I'm not going to try to second-guess all of the replies I will
: undoubtedly get from the efficiency aficionados out there, but
: I will mention two things.

Click, whoosh!!!!! :-)

: I keep saying "don't do <something> unless you have to," by which
: I mean after actual experiments on prototype code demonstrate
: that there is a problem.  The attitude of "don't worry about
: efficiency until later" is frequently denigrated as leading to
: unsalvageably inefficient code, because trying to patch in some
: efficiency later is tantamount to a rewrite.  Although this can
: be true, the solution is to teach people good, responsible
: programming practices early, avoiding gratuitous and unnecessary
: inefficiency, without teaching them to "optimize too early."

What in *hell* do you think I am advocating?  It is very frustrating
to see myself being misunderstood (ignored?) so thoroughly.

: Responsible programming practices are just what the articles I am
: reacting to are trying to formulate, and all I'm trying to do is
: to draw the line a little more finely between what's reasonable
: and what's overboard.

But I think you rather failed. You fairly consistently attacked
efficiency tweaking, and with arguments that are mostly relevant to
that discussion. But, as you said, I am advocating what you called
"responsible programming", and I'd much rather have seen discussion
on that subject.

Instead, I get what seems to be a disagreement with my position, but
what is really a disagreement with a straw man.  How can I answer
that? What can we learn from that?

:                        My second point, and the reason I'm taking
: all of this time and space replying, is that people who are
: learning programming (and we're all still learning programming)
: are much more impressionable than you might think.  There's
: nothing wrong with this; it's actually refreshing given that it
: is often supposed that people stop learning at around age 20.
: But it does mean that if you unilaterally tell a reasonably green
: programmer to use pointers instead of arrays, he'll spend months
: having lots of frustrating problems with pointers, and likely end
: up hating them and the language in general.

Did I anywhere do anything like that? I believe that I prefaced my
remarks with an awful lot of "but consider your particular
circumstances first".

Sigh. I *hate* being misunderstood.

---
Bill
{uunet|novavax}!proxftl!twwells!bill