Safe coding practices (was Re: Bug in users command)

Wed Jan 30 02:31:42 AEST 1991

In article <22921 at well.sf.ca.us> Jef Poskanzer <jef at well.sf.ca.us> writes:
>In the referenced message, Bob Manson <manson at cis.ohio-state.edu> wrote:
>}You think 1000 users is a large number in a users program? Suppose I
>}decide to start recording all users over a large network in my utmp
>}file? (Wouldn't that be nice...how I hate rwho.)
>
>Yes, that might be nice... but if you did that, why would you want to
>run "users"?

Well, I probably wouldn't want to _look_ at it, per se. But...

>Unix programs dumping core on such input (not even a "recompile me"
>message, how rude), then maybe I'll consider it worthwhile to add the
>malloc gunk.

The argument of "everything else is busted, so I'll leave my program
broken too" isn't a very good one, but I do see a point there.
Hmmm... Let's see what dies on a 13K input line. (Sun SLC+ running
SunOS 4.1.) Well, tr was happy to convert the spaces into newlines,
and I don't see much reason to go further, as the output of that could
be postprocessed as I wished. (Yes, most unix utilities puke badly on
input lines > 2K. On this Sun, grep and egrep deal with it OK,
producing correct output, but sed silently truncates the output to
4001 bytes. The behavior of grep & egrep is atypical, but I'll bet tr
will work in any case.)

>In general, sure, handling arbitrary input is great.  In specific cases
>where you can make a confident estimate of the maximum input size, I have
>no problem at all with using checked fixed size arrays of ten times

I've had to deal with one too many utilities where someone made a
"confident estimate of the maximum input size" only to find that it
was too small. Assuming that no one would ever have more than 2048
password entries, for example. OK, I strongly doubt that most unix
sites have 2500 entries in their password files. Ours did (when I
worked for the CIS dept. here), and I didn't have source to the
programs. I was hosed.

Seeing messages from programs like "recompile program with larger
NENTS" is useless in these cases, as all I can do is call {insert your
workstation maker here} and say "I need program X recompiled with a
larger NENTS" and they laugh. And not everyone who does sysadmin is
even capable of recompiling programs; the people I'm currently working
with couldn't if their lives depended on it.
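
To show what I mean by not needing an NENTS at all, here is a minimal
sketch of a table that grows as entries arrive. The names (name_table,
table_add, table_free) are made up for illustration and don't come from
any vendor's utility; this is one way to do it under those assumptions,
not the only way.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable table of names; there is no compiled-in NENTS. */
struct name_table {
    char **names;   /* malloc'd array of malloc'd strings */
    size_t count;   /* entries in use */
    size_t cap;     /* entries allocated */
};

/* Append a copy of name, doubling the array when it is full.
 * Returns 0 on success, -1 if memory runs out (table left intact). */
int table_add(struct name_table *t, const char *name)
{
    size_t n = strlen(name) + 1;
    char *copy;

    if (t->count == t->cap) {
        size_t ncap = t->cap ? t->cap * 2 : 64;
        char **tmp = realloc(t->names, ncap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        t->names = tmp;
        t->cap = ncap;
    }
    copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, n);
    t->names[t->count++] = copy;
    return 0;
}

/* Release everything and reset the table to empty. */
void table_free(struct name_table *t)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        free(t->names[i]);
    free(t->names);
    t->names = NULL;
    t->count = t->cap = 0;
}
```

Start with a zeroed struct (struct name_table t = {0};) and call
table_add once per entry. The only limit left is available memory, and
running out of that can be reported as what it is instead of as
"recompile me".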

>What is the precise meaning of "far too small"?  At least one system
>where 1000 is too small?  We probably have that already.  But if you
>mean that such systems will be common, sure, I'll take that bet.  How
>much?

What does "common" have to do with anything? If your utility won't work at
my site, what good is it?

>I give source.  In fact, one reason I like code which prints messages
>like "change XYZ and recompile me please" is to discourage bozos from
>doing any god damned binary-only distributions of *my* source.

Hasn't stopped HP or AT&T from distributing code with similar limits.
Won't stop anyone else either. 

I know what you're trying to say: it's a waste of time to write extra
code to make a program limit-independent when we can make a good
estimate of the maximum numbers & provide source for recompilation.
My argument is, it really doesn't cost that much more to design the
program properly to function without limits. The cost of making
utilities with fixed limits in them is unhappy customers & time
spent rewriting programs, since I seriously doubt source policies will
change anytime soon. Your point about utilities dying on too long
input lines is an excellent example; really, there is no such thing as
a "too long input line". Whoever wrote sed decided that lines would
never be longer than 4000 characters, and they were quite wrong...
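
For what it's worth, the "malloc gunk" for input lines is not much
code. Below is a rough sketch of a line reader with no built-in maximum
length. read_line is a name I made up for this post; it is not how sed
or any other real utility does it, just an illustration of the idea.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from fp, growing the
 * buffer as needed. Returns a malloc'd, NUL-terminated string without
 * the trailing newline, or NULL at end of file or if memory runs out.
 * The caller frees the result. */
char *read_line(FILE *fp)
{
    size_t cap = 128, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {           /* keep room for the NUL */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {         /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

A caller just loops on read_line(stdin) until it returns NULL and frees
each line when done with it. Doubling the buffer keeps the copying cost
proportional to the line length, so a 13K line costs a handful of
reallocs rather than a truncated result.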

>  Jef Poskanzer  jef at well.sf.ca.us  {apple, ucbvax, hplabs}!well!jef

						Bob
manson at cis.ohio-state.edu