name=value or -n value?

Wed Mar 21 17:39:51 AEST 1984

I am impressed with the suggestions for standards which concentrate on
the name=value pair system.  While this is contrary to UNIX tradition,
it seems more consistent and allows multi-character names for options.
My own support for such a standard has changed over time, and I will explain why.
At first I rebelled against the traditional "convention" in favor of the
clearer name=value convention, but after being persuaded by an analysis
of usage and by some results of implementation, I have come to respect both.
I think the attempts at replacing the current "convention" with another
are an over-reaction.  What is needed is a consistent standard that gets
followed; the lack of any (or software supporting it) has resulted in poorly
implemented programs that give people grief they don't deserve.  I think
that the recently proposed system (mostly consistent with tradition)
is as good as any, and that poor and inconsistent implementations are to
blame for the problems so many have observed.

One of the problems with UNIX commands is that the use of command line options
is inconsistent.  Some of the `rules' to which there are numerous exceptions are:
	Options are preceded by a dash (-)
	Options can be bundled
	Options are single characters
	Option letters must be immediately followed by a value
	Flagged options must precede others
	Commands without arguments print their options (dangerous)
The exceptions are everywhere:
	ps (some versions) does not allow a -
	nroff does not bundle flags
	stty takes multi character options
	cc option values must be preceded by a space (cc -o file)
	nroff values must follow option flags (nroff -man is really the "an" macros!)
	dd uses name=value format
	find has brain-damaged syntax
	cc doesn't care about where options to the loader go

Looking over these, one has to say: "If there were ONE way to do it,
things would be better,"  and "What way is the best way?"  Anyone
can see that UNIX command line arguments are a mess, but what
to do about it is more difficult to determine.  At the 1984 Winter USENIX
conference, Hemenway and Armitage presented a proposed standard for options.
It looked a lot like traditional UNIX command line handling, except that
standards were introduced:
	options are preceded by a dash
	options are single characters
	option values must be preceded by a space
	boolean options can be bundled
There are 13 rules, which are pretty easy to follow.
At first glance, I said "Ugh!"  I wanted to see name=value pairs
to allow multi-character option names.  I almost fully supported Brad
Templeton's posting of the Waterloo system (I preferred to have ALL args
as name=value pairs so that alphanumeric option names for logical flags
could be specified as name=t or name=false, and + and - would be synonymous
with true and false, respectively, so that name+ and name- would be parsable
and fall out of the same name=value convention).  This is a standard on
some systems inside Bell Labs where a common computing environment for
several systems is needed.  I further supported name=value pairs because
I too had written software for parsing them in command lines and files.
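
The convention in the parenthesis above can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions, not the
Waterloo or Bell Labs code: the function name parse_pairs is hypothetical,
and the spellings "true" and "f" are guesses added alongside the t, false,
+ and - forms mentioned in the text.

```python
# Hypothetical sketch of the name=value convention; not the Waterloo code.
# "t", "false", "+" and "-" come from the text; "true" and "f" are assumed.
TRUE_WORDS = {"t", "true", "+"}
FALSE_WORDS = {"f", "false", "-"}


def parse_pairs(args):
    """Turn ['long+', 'width=80'] into {'long': True, 'width': '80'}."""
    options = {}
    for arg in args:
        if "=" in arg:
            # Split on the first = only, so a value may itself contain =.
            name, value = arg.split("=", 1)
        elif len(arg) > 1 and arg.endswith(("+", "-")):
            # name+ and name- fall out as shorthand for name=+ and name=-.
            name, value = arg[:-1], arg[-1]
        else:
            raise ValueError("not a name=value pair: %r" % arg)
        if not name:
            raise ValueError("missing option name in %r" % arg)
        lowered = value.lower()
        if lowered in TRUE_WORDS:
            options[name] = True
        elif lowered in FALSE_WORDS:
            options[name] = False
        else:
            options[name] = value  # any other value is kept as a string
    return options
```

Under these assumptions, parse_pairs(["long+", "width=80"]) treats long as a
true logical flag and keeps width as the string "80"; a bare word such as
"file" is rejected because it is not a pair.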

But then I learned about the work of Hemenway and Armitage.  They looked
at hundreds of commands on UNIX and categorized the full range of usage
exemplified by the exceptions above.  Their goal was to come up with
recommendations about what to do about UNIX and its command line problems.
Their conclusion was that consistency was highly desirable, but so was
compatibility with existing programs.  Their analysis, which I consider to
be the most sane I have seen, points out that backward compatibility is
not only with command usage by people, but also with usage by shell scripts.  People
at Bell Labs use the shell more than people anywhere else, especially since ksh,
the new shell by Dave Korn (see Summer USENIX, 1983), is so much faster.
I found it hard to believe, but there are systems here with thousands of
lines of shell.  To admit total defeat and say "Scrap the -options for
name=value pairs" would mean that a lot of systems depending on shell scripts
would not work any more.  This would not fare well with management concerned
with giving support to a naive user community; you can't say "Everything is
changed (for the better)" without alienating a lot of people (I cite BSD 4.2).

Hemenway and Armitage were not naive.  They realized that people would not
react well to old commands being changed to fit ANY standard, no matter which.
They have stated that standards would apply to new commands; old ones might
be reworked to make them more robust, but not different.  To maintain good
terms with programmers, they surveyed to find out what people liked about
the - system.  They found that people liked to bundle logical flags:
	ls -ltr
	ps aux
Imagine typing:
	ls +long +temporal +reverse (abbreviated ls +l +t +r)
The amount of typing, using abbreviations, is not much greater,
though any mnemonic enhancements of multi-character options are lost.

Another problem pointed out about name=value pairs is that the shell's
filename expansion might not work correctly (ls file=* passes the * to ls unexpanded)
though the name=value convention could be programmed in the shell so that
anything after = would be expanded.  This would work except that the shell
already has options to take name=value environment variables anywhere on
the command line; little known, but there all along.
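
The expansion problem can be imitated with Python's fnmatch module.  This is
a rough model only: the function name expand_word is hypothetical, the
directory listing is passed in as a plain list, and the sketch assumes the
Bourne-shell habit of passing an unmatched pattern through unchanged (other
shells may complain instead).

```python
import fnmatch


def expand_word(word, names):
    """Imitate shell filename expansion of one word against a file list.

    The whole word is the pattern, so 'file=*' only matches names that
    begin with 'file='.  An unmatched pattern is returned as is, which is
    the Bourne-shell behaviour assumed here.
    """
    # As in the shell, names starting with '.' need an explicit leading dot.
    hidden_ok = word.startswith(".")
    matches = sorted(
        n for n in names
        if fnmatch.fnmatchcase(n, word) and (hidden_ok or not n.startswith("."))
    )
    return matches if matches else [word]
```

With a directory holding only a.c, b.c and notes, the word *.c expands to the
two C files, but file=* matches nothing and arrives at the command as the
literal word file=*, star and all.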

At this point, you probably feel as I did.  There seems to be no complete
solution.  For every suggestion, there is an opposite as well founded.
I worked on an enhanced version of getopt (the little known standard
parser for UNIX command lines; little known because it was not released
to the public [To make a standard fail, keep it a secret]).  The main
point of the new parser was to enforce the simple standards of Hemenway
and Armitage.  As an exercise, I rewrote some of my own commands.
I found that my programs had exceptions to every rule, and that few commands
were consistent with others.  After doing the retrofitting, I found that
I could use my own commands more easily than before.  I was impressed
because it demonstrated to me that CONSISTENCY was so important, even with my
own programs.  I was also impressed that the use of "-X value" instead of X=value
no longer seemed to be much of an issue.  When the parser allowed on-line
help phrases for options, the advantage of multi-character option names
seemed less of an issue.
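
To make the idea of an enforcing parser concrete, here is a minimal Python
sketch of the four rules quoted earlier (dash prefix, single-character
options, values given as a separate argument, bundling for boolean flags
only).  It is not the enhanced getopt described above: the function name
parse_options and its arguments are hypothetical, only four of the 13 rules
are covered, and stopping at the first non-option argument is an assumption.

```python
def parse_options(args, flags, valued):
    """Parse args strictly; return (options, operands).

    flags  -- letters that are boolean options, e.g. "ltr"
    valued -- letters that take a value, e.g. "o"
    """
    options = {}
    i = 0
    while i < len(args):
        arg = args[i]
        # Assumed: the first argument that is not -something ends the options.
        if len(arg) < 2 or not arg.startswith("-"):
            break
        letters = arg[1:]
        if len(letters) == 1 and letters in valued:
            # The value must be the next argument: "-o file", never "-ofile".
            if i + 1 >= len(args):
                raise ValueError("option -%s needs a value" % letters)
            options[letters] = args[i + 1]
            i += 2
            continue
        for letter in letters:
            if letter in valued:
                raise ValueError(
                    "-%s takes a value, so it cannot be bundled "
                    "or have its value attached" % letter)
            if letter not in flags:
                raise ValueError("unknown option -%s" % letter)
            options[letter] = True  # bundled boolean flags: -ltr
        i += 1
    return options, args[i:]
```

In this sketch, ["-ltr", "-o", "out", "file1"] with flags "ltr" and valued "o"
sets l, t and r, records "out" as the value of o, and leaves file1 as the only
operand, while "-oout" is rejected instead of being quietly accepted.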

My conclusion is that the syntax does not matter much, so long as there
really is a syntax.  To call the current usage of flag options in UNIX
a "standard" is dangerously mistaken; thus far on UNIX, the exception is the rule.
Given good implementations of a simple rule, whether the rule is -X value
or X=value, or maybe even if the rule were value#X, users should fare about
equally well.  I conclude that conventions like the Waterloo system are
successful because the rule is simple and well implemented, not because there
is any advantage to their particular rule.  This is a hypothesis that is
relatively easy to test with data, and I would like to see some or maybe
even collect some myself.  Until then, it seems that the Hemenway and Armitage
analysis and proposal are the most practical because they incorporate simple
standards with the highest degree of backward compatibility.
	Gary Perlman	BTL MH 5D-105	(201) 582-3624	ulysses!gsp