Efficient coding considered harmful?

Every system needs one terry at wsccs.UUCP
Sat Nov 5 19:55:52 AEST 1988


In article <7700 at bloom-beacon.MIT.EDU>, scs at athena.mit.edu (Steve Summit) writes:
> In article <119 at twwells.uucp> bill at twwells.UUCP (T. William Wells) writes:
> >It's the damnedest thing, but people whose opinions I otherwise
> >respect seem to have this thing about coding efficiently.  They don't
> >like it. Worse, they discourage it.

Hear, hear.

> Some time ago I came to the conclusion that the style vs.
> efficiency debate is another religious war, which I usually try
> to stay out of.  Good style and, I'll admit it, anti-efficiency
> is a personal crusade, though, so I can't help but respond.

Why is it that people assume good style and efficiency are mutually exclusive?

This is a PC mentality.  Good style is appropriate commenting, small program
flow jumps (less than 2 pages, preferably very much shorter than that),
and portability.

Style is often linked to which editor you use.  For a UNIX environment, a
good function declaration would be:

type
function()
parameter declarations;
{
	code
	code
	code
}

This allows searching for function names in a program with grep ^function *.c;
it allows use of the [[ and ]] vi commands to jump from function to function;
and the type declarations before the function allow cross-referencing with a
simple sgrep/awk combination.
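
As a concrete sketch of the layout (the function name and body are made
up for illustration, not taken from any particular program):

int
count_blanks(s)
register char *s;
{
	register int n;

	for (n = 0; *s != '\0'; s++)
		if (*s == ' ')
			n++;
	return n;
}

With the name and the opening brace both in column 0,
grep '^count_blanks' *.c turns up the definition immediately.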

Unions should be avoided, as well as non-pre-aligned structures.  If you must
use a structure for other than internal data representation, it should be
aligned on longword boundaries.

if statements should look like:

	if( expression operator expression)
rather than
	if (expressionoperatorexpression)

which can break old compilers with either the space after the if or
the running together of tokens such as

	=& and = &, =* and = *, etc.
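
As a sketch of the sort of ambiguity involved (the variables are
invented): on a compiler old enough to still accept the archaic =-, =*,
and =& assignment operators, the spacing decides what a statement means:

	x=-1;		/* may be parsed as x =- 1, that is, x -= 1	*/
	x = -1;		/* unambiguous: assign -1 to x			*/

	y=*p;		/* may be parsed as y =* p, that is, y *= p	*/
	y = *p;		/* unambiguous: assign what p points to		*/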

Expressions where you are unsure of precedence should be parenthesized; if
you are unsure of precedence, relearn your C.

> I simply have a different point of view, going back to
> first principles, as all good religious wars do.  When caught
> between a rock and a hard place, when something's got to give,
> some people shrug their shoulders and sacrifice style,
> modularity, or portability.  I am prepared to sacrifice
> efficiency.  (Most of the time I can have my cake and eat it too,
> and I have any number of general techniques for doing so, which I
> will probably defer to a separate posting so that those who are
> sensibly ignoring these diatribes can still see them.)

Usually, it is an oblique "style standard" that offends the portability
gods, not efficiency.

Being prepared to sacrifice efficiency on the altar of style is no way to
maintain a market share.  If it takes gross code to do something fast (code
is usually called gross if it can't be understood by the reader, which is more
a failing of the reader than a failing of the coder), you write gross code.
The only thing to sacrifice efficiency to is the God of portability, and
that's only if the efficient code isn't generally portable; if it can be made
efficient and portable by two methods, ...well, that's why God invented #ifdef.
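
A minimal sketch of that #ifdef approach (the vax test and the
fast_swap() routine are hypothetical stand-ins; only the portable arm is
real code):

#ifdef vax
/* hand-tuned, machine-specific version */
#define swap16(w)	fast_swap(w)
#else
/* portable 16-bit byte swap */
#define swap16(w)	((((w) >> 8) & 0xff) | (((w) & 0xff) << 8))
#endif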

> >I advocate training oneself to be one who codes efficient programs. I
> >do not mean writing code and then making it more efficient, I mean
> >writing it that way in the first place.
> 
> Here I agree with at least some of the words, but not the
> implications.  It happens that I usually write nice, efficient
> programs, but not by resorting to microefficiency tweaks or by
> sacrificing readability in any way.  It is important to employ
> lots of common sense (and, of course, "common sense isn't"); to
> shy away from grotesquely inefficient algorithms; and to partition
> the code so that key sections can be cleanly replaced later if
> efficiency does prove to be an issue.  It isn't important to
> write every last expression and statement in the most
> theoretically blistering way possible.

Unless you are competing for a customer.  Run cu for a terminal session
or run uucp for a transfer, and then run our stuff.  Blisteringly fast
means savings in real dollars to a customer.  Someone at Sun voiced
this as a pro tempore justification for writing the SPARC compiler so it
compiles benchmarks instead of code.

> >I have heard three main counterarguments to this:
> >1) "The compiler ought to handle that, so you shouldn't bother with
> >   it." What nonsense! We have to code in the real world, not the
> >   world of make-believe and ought-to.
> 
> Many of the more egregious examples of source-level
> microefficiency tweaking are, in fact, routinely handled by
> today's (and even yesterday's) compilers.  Consider replacing
> power-of-two multiplies by left shifts, a textbook example.
> I just booted up my pdp11 (I am not making this up), and its
> optimizer knows how to shift left when I multiply by two.  (To be
> sure, not all newer compilers have been better.)

It is sheer stupidity to depend on the supposed contents of a black box; for
instance, a compiler.  This generates non-portable and lazy coding practices:

"aw... the compiler'll make it fast..."

> >3) "Efficient coding makes obscurer programs." Well, since most
> >   efficient coding involves minor changes from the inefficient way
> >   of doing things, changes which are mostly a matter of style rather
> >   than of organization...

I disagree here; proper style is not necessarily efficiency.  A program
generator can do the work of 5 ordinary programmers, but no code generator can
replace one gifted programmer; that's why assembly language is still in
use.

> The really big
> efficiency hacks, the ones people spend weeks and months on, do
> involve massive organizational changes and wholesale concessions
> to readability and maintainability.

Readability, yes; maintainability, not necessarily.  But I submit that this
loss of readability lies either in a lack of documentation (primarily in
the form of appropriate/timely comments) or in a lack of skill on the part of
the reader.

The ability to understand others' code is the difference between a programmer
and a person who can program.  Writing code for idiots is only good if you are
an idiot and can do no better, or if you are willing to hire idiots.  This
type of "communism of coding", where the tradeoff is always made for the
less gifted programmer, is what is currently threatening to move a lot of
the more exciting/profitable/innovative coding offshore.  A person who can
program cannot always read a programmer's code.  Tough.  Hire a programmer
instead of a code mechanic.  This is why good programmers are worth 80K+
per year (if they are willing to work in the rigidly structured environment
that entails; some intellectual freedom is worth at least 20K a year).

> The attitude of most C experts is "*I* don't have any problem
> reading this code; anyone else who considers himself a Real C
> Programmer shouldn't, either."  This attitude is patronizing and
> wrong.  To borrow a previous argument, "We have to code in the
> real world, not the world of make-believe."  There are plenty of
> people out there who are still learning C, and since C has become
> as popular as it has and C programmers are in demand, many of
> them are working for real companies writing (and maintaining)
> real programs.

A beginning C programmer (say, < 3 years, if he isn't the type of person who
stays up 3 days straight coding) cannot expect to understand, let alone
maintain, 65,000 lines of code, perhaps not even 8,000.  To expect him or
her to understand the UNIX kernel by writing it pretty is ignorance.  To
degrade your code (don't kid yourself; that's what it is) so that such a
programmer (actually, at that point, they are "a person who can program")
can understand it is marketing and technological suicide.  It is the
equivalent of redoing IQ tests so that 100 is average again.  Saying
someone is more intelligent doesn't make them so.  Writing code so that
someone at a lower knowledge (notice I did NOT say educational) level
can understand it does not make the reader a better programmer.

> A few painless concessions (like multiplying
> instead of shifting,

Thhhhhtp!  Painless my arse.  Look at the assembly with optimization
turned off.  An old compiler is like a new compiler without the optimization
turned on.
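
An easy way to check what your own compiler does (cc -S is the usual way
to get the assembly; the file and function names are invented):

int
mul8(n)
int n;
{
	return n * 8;	/* "cc -S mul8.c", then read mul8.s	*/
}

int
shl3(n)
int n;
{
	return n << 3;	/* n * 8 spelled as a shift		*/
}

An optimizer turns the first body into the same shift as the second; an
old or non-optimizing compiler emits a genuine multiply instruction for
it.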

> (I maintain that every unnecessary
> translation from an encoding back to an intended meaning, no
> matter how intuitive to the experienced programmer, incrementally
> increases comprehension time, effort required, and probability of
> error.)

There is some truth to this, but the key word is "unnecessary".  The
translation is also unnecessary for a computer programmer, whose first abstract
concept should have been bits.  If the person is a C programmer, the second and
third concepts should have been Hexadecimal and Octal.  Assuming that these
operations haven't become automatic after experience is silly.  It is equally
silly to think of a programmer doing bit operations with multiplies instead of
shifts!

> If you leave the explicit n*n everywhere, this
> error is impossible.

So is speed.  You have any idea how long a multiply takes on most architectures?
I'll take the risk of having it blown out.  If it gets blown out, I'll do 
another thing programmers should know how to do to deserve the name -- debug.

> >Use register variables. And take a litle pain to get them right.  A
> >technique that works enough of the time is this:...

Yes, register variables are a good idea, even if you have a fascistically
optimizing compiler.
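
A minimal sketch of the habit (the function is invented; it is roughly
what the library's strchr() does):

char *
find_char(s, c)
register char *s;	/* the pointer walked in the inner loop	*/
register int c;		/* the character it is compared against	*/
{
	while (*s != '\0') {
		if (*s == c)
			return s;
		s++;
	}
	return (char *) 0;
}

The busiest variables -- the loop pointer and the thing it is tested
against -- are the ones worth a register declaration, whether or not an
optimizer would have figured that out on its own.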

> I'm sorry, I don't have spare time or spare brain cells for
> following or utilizing a technique like this.

Not to be rude, but it would do you good to deallocate some of your non-spare
brain cells and realloc them to utilizing such things as register variables.

> When I was a beginning C programmer, I decided not to use
> register variables at all in the first pass of coding, because I
> figured that, when I eventually got the program working, somebody
> was going to say that it wasn't fast enough, and if I had already
> blown all of my ammunition, how was I going to speed it up?

This is at odds with your previous statement (which I agree with) of "code it
right the first time".  A properly written program cannot be sped up without
sacrificing an unacceptable degree of portability.

> Lately I throw in "register" all the time, but in some ways I
> respect my original attitude more.  The point is, if it's the
> sort of program that people notice how fast it is, they would
> have complained even if I had used register variables since day
> one.

Bull pucky.

> Here's another religious debate: the structured programming one.
> Nevertheless, good modularity is the only way to keep a large
> program maintainable.

A correct statement.

> Of particular pertinence to the theme of
> this article is that efficiency-critical modules, if properly
> isolated, can be implemented quickly and (possibly) inefficiently
> first, while getting the program to work correctly at all, and
> then transparently replaced later with a more efficient version,
> if and only if it is found that the quick, inefficient
> implementation is really too inefficient after all.

Another way of looking at this is to rephrase it:

"will my slacking off be discovered?  If so, can I spend the time I
 should have in the first place to fix the problems I generated in
 slacking off?"

This is slipshod and a very bad attitude.

> If people have a bad impression of highly modular code, it is
> because they (or, more likely, their predecessors) have used what
> I call a "Ginsu knife" approach to modular decomposition, whereas
> the correct instrument to use is a scalpel.  If you went to a
> lecture once on modularity but the only guideline you remember is
> "50 source lines," you're going to use split -50 on your
> monolithic source files, and not garner any of the benefits of
> good modularity.  If, on the other hand, you learn how to
> modularize well, you just won't believe how easy your life (with
> respect to software development and maintenance, anyway) will
> become.

Lo!  Another correct statement!

> There are a few applications (and, some believe, a few
> architectures) for which function call overhead is significant,
> but they are infrequent, and in general there should be
> absolutely no stigma attached to a function call.  It's usually
> easy to back a function call out later, if you have to.

Look at the clock-tick cost of a procedure call and its push/pops on the 8086.
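
One common way of backing such a call out on those machines is to turn
the small, hot function into a macro (a sketch; the names are invented,
and the character test assumes ASCII):

/* the function form pays for the call, the return, and the push/pops */
int
is_upper(c)
int c;
{
	return c >= 'A' && c <= 'Z';
}

/* the backed-out form: the same expression, with no call overhead */
#define IS_UPPER(c)	((c) >= 'A' && (c) <= 'Z')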

> >:     o   Avoid bit fields, most especially signed ones.
> >Try: don't use them at all, they tend to be nonportable.
> 
> This is a side question: what's so unportable about bitfields?
> Sure, the layout (left to right or right to left) isn't defined,
> but that's only a problem if you're trying to conform to external
> layouts, which is a "problem" with structures in general.  (The
> solution is neither "don't use bitfields" nor "don't use
> structures" but "don't try to portably conform to external
> layouts.")  The ordering could also be a problem if the code
> internally depended on it in some way, but this is no more or
> less a problem than programs which depend on byte ordering
> within words.

Why use bitfields at all if they are simply internal representations?
The insertion/extraction overhead far outweighs any memory advantages.
If it does not, you should re-examine your basic data structures more closely.
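
A sketch of the alternative being argued for here, assuming the point is
to spend a little memory on internal data rather than pack it (the
struct and its fields are invented):

/* instead of packing internal state into bitfields, such as
 *	struct flags { unsigned mode : 3, dirty : 1; };
 * spend the memory and skip the insertion/extraction code entirely:
 */
struct flags {
	int	f_mode;		/* a whole int: load and store directly	*/
	int	f_dirty;	/* no masking or shifting on access	*/
};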

> Are there substantial numbers of real (not toy) C compilers that
> either don't implement bitfields, or that don't implement them
> correctly?  ("Correctly" as defined by K&R and ANSI, not by some
> program that is trying to use them nonportably.)

Yes.

> >:     o   Use realloc instead of malloc/free pairs. Use calloc instead of
> >:         malloc followed by zeroing each member.
> >Efficient memory allocation is *much* more difficult than this, what
> >you really need to do is to cut down on the number of calls to malloc
> >and free. Of course, that usually means writing some kind of
> >allocation stuff yourself, never an easy task.

Agreed.  I would add, however, that if you don't really need a calloc, using
a malloc with a multiply is more efficient in that it does not have to
generate a loop (or some other code) to clear the newly allocated area.
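
In code, the difference being described (struct item and new_items() are
stand-ins; nothing else here is beyond the standard allocator):

struct item {
	int	i_key;
	char	*i_name;
};

extern char *malloc();		/* pre-ANSI declaration */

struct item *
new_items(n)
unsigned n;
{
	/*
	 * calloc(n, sizeof(struct item)) would also clear every byte of
	 * the new block; when the caller fills in every member anyway,
	 * that clearing is wasted work, so multiply and use plain malloc.
	 */
	return (struct item *) malloc(n * sizeof(struct item));
}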

> Please don't avoid malloc; the alternative is generally
> fixed-size arrays and "lines longer than 512 characters are
> silently truncated."  Please don't roll your own allocation
> routines, again unless you have to; this is the kind of low-level
> dirty work, hard enough to do well, that you should let the lower
> levels (i.e. the standard implementations of malloc) give it
> their best shot.

I can name 5 "real compilers" off the top of my head whose in-line expansion
of memory allocation or whose standard library routines are broken.

> >: Micro-efficiency gets regularly denounced. Here are some counterarguments:
> >: o       If your commercial product is slower than the competition, you
> >:         won't get a chance to take advantage of the maintainability/
> >:         portability advantages because you'll be out of business.
> >Or if it is larger. This discussion has focused on making programs
> >faster, but making them smaller is also relevant.
> 
> I agree, and find code size far more frequently relevant in
> practice (among other things, because of the aforementioned
> pdp11).

Not to mention the Altos/Harris/SCO/MicroSoft medium-is-my-largest-model
8086 and 80186 Xenix's (Xenixi?).

> I might also point out that the marketplace generally rewards the
> product that comes out first, so if you can compress your
> development schedule by using clean, straightforward design and
> eschewing time-consuming and bug-prone efficiency tweaking,
> you may come out ahead (and sew up your market share and your
> pocketbook by releasing a speeded up version 2 six months later).

I point at Lotus and snicker at the above paragraph.

[discussion of memory model limitations]

> While these considerations are real, they should be viewed as
> unfortunate and temporary limitations which will be resolved,
> hopefully soon enough, at the root level, and not something which
> we resign ourselves to working around forever, and which we get
> so used to working around that the workarounds become part of the
> language, persisting well beyond their need.  These are what Fred
> Brooks calls "accidental problems," and every programmer minute
> and line of code spend catering to them is not being spent on
> real problems.

Yes, but there are too many machines with these problems (unless you know
something I don't about UCB coming out with new UNIX 7 and UNIX 7 compiler
revisions :-).

> I'll third the motion, but without the tweaks.  My proudest code
> is that which is not only small and fast and elegant and portable
> and modular and extensible but also patently obvious in function
> to the casual observer

A casual observer has no place in a programming shop.

> But it does mean that if you unilaterally tell a reasonably green
> programmer to use pointers instead of arrays, he'll spend months
> having lots of frustrating problems with pointers, and likely end
> up hating them and the language in general.

Or he or she will learn how to use them.  I would rather have a programmer
unwilling to learn something they don't know (although they'd damn well better
know pointers; they're basic!) quit or be fired.


Sorry to seem so hard on Steve, but the ideology embodied in his statements,
and in statements like "the optimizer will take care of it", is so antithetical
to programmerness that I had to say something.


| Terry Lambert           UUCP: ...{ decvax, uunet } ...utah-cs!century!terry |
| @ Century Software        OR: ...utah-cs!uplherc!sp7040!obie!wsccs!terry    |
| SLC, Utah                                                                   |
|                   These opinions are not my companies, but if you find them |
|                   useful, send a $20.00 donation to Brisbane Australia...   |
|                   'I have an eight user poetic liscence' - me               |


