#! (was Re: file attributes)

Fri Jun 28 03:07:23 AEST 1991

This is largely a religious debate (and not even a particularly
important one at that) which I've been through many times, so I won't
say anything more after this unless someone says something new.

Guy Harris <guy at auspex.auspex.com> writes:
|>Actually, if I were about to change the semantics of a prominent UNIX
|>call, I would probably have given it a new name,
|
|The whole *point* of the change was to make it *transparent* to existing
|programs!

You left out the part where I said that *now* I would indeed make it
transparent.  At the time it was done, I don't think the existing
software base was so large that changing those cases that wanted to be
able to run shell scripts would have been unreasonable, and it would
have given programmers a choice of whether they wanted to exec objects
or run programs (see Statement of Religion, below).  This is unrelated
to whether or not it belongs in the kernel.

|>You could also get rid of the ugly hard-coded limits that
|>are in kern_exec.c;
|
|E.g., the 1MB hard-coded limit on number of characters passed as
|arguments to a program? :-)

I didn't say you could fix *all* the ugly hard-coded limits, :-) just
the 29 bytes (32 - 3 for #! and \n) for shell + 1 arg (and the
just-one-arg restriction, if you wanted to).  This is also largely
unrelated to whether or not it belongs in the kernel, although the more
complex the implementation, the less likely you are to want it in the
kernel.

Elsewhere, in <19407 at rpp386.cactus.org>, John F Haugh II
<jfh at rpp386.cactus.org> writes:

|I'm no fan of bloat either, and I rail against it at every oppurtunity.

Well, obviously not *every* opportunity. :-)

|The "#!" "hack" is not "bloat".  As Guy (that was Guy Harris, right?)
|pointed out, the change is really very minimal.

I don't think it's big, I just think it's in the wrong place, whether
it's one line or one thousand.

[ Flameproof suit on ]

I wasn't going to say anything on this, but since everyone keeps
quoting the word ``hack'':

I called it a ``hack'' because I felt that it was a feature motivated
by wanting new functionality with minimal code change (i.e. work), not
thought out very clearly (were the security problems with setuid
scripts considered?), and as such the code is certainly not something
of which I would be proud, and is not a complete solution.  I'm
obviously guessing at the motivation and depth of design, but all I
have to judge is the end result (security problems, incomplete
solution, and ugly code).

Why do I not consider it a complete solution?  It only works with
interpreters that ignore the magic line themselves (most do, so it's
convenient), it requires an explicit path be provided in the script
for the interpreter (makes it easy for a 60 line kernel
implementation), and it does not allow flexibility in how the
interpreter is invoked (again makes for easy implementation).
Wouldn't it be nice to have been able to say something like:

/* #! myrexx -v -f !0 -lmylib !*
 */

where /* and */ are the REXX comment delimiters, #! still signifies an
``exec'' like string, and the ! substitutions are csh-like?  I
haven't thought this out completely myself, but it seems possible and
not that difficult.  This is getting _way_ off topic.
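
A rough sketch of how such a line might be expanded, under assumptions
that go beyond what is stated above: !0 is taken to mean the script's
path and !* the caller's remaining arguments.  The name
expand_template() and the MAXARGS limit are invented for illustration,
and pulling the string out of the comment delimiters is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXARGS 32  /* arbitrary limit for this sketch */

/*
 * Split `tmpl` in place on blanks and build an argv-style vector in
 * out[].  "!0" becomes the script path and "!*" becomes the caller's
 * arguments (assumed meanings).  Returns the number of entries written,
 * not counting the terminating NULL, or -1 if out[] is too small.
 */
int expand_template(char *tmpl, const char *script,
                    char *const args[], int nargs,
                    const char *out[], int max_out)
{
    int n = 0;
    char *tok;

    assert(tmpl != NULL && script != NULL && out != NULL);
    if (max_out < 1)
        return -1;

    for (tok = strtok(tmpl, " \t\n"); tok != NULL;
         tok = strtok(NULL, " \t\n")) {
        if (strcmp(tok, "!0") == 0) {
            if (n >= max_out - 1)
                return -1;
            out[n++] = script;
        } else if (strcmp(tok, "!*") == 0) {
            for (int i = 0; i < nargs; i++) {
                if (n >= max_out - 1)
                    return -1;
                out[n++] = args[i];
            }
        } else {
            if (n >= max_out - 1)
                return -1;
            out[n++] = tok;     /* literal word, passed through */
        }
    }
    out[n] = NULL;
    return n;
}
```

Given the template "myrexx -v -f !0 -lmylib !*", a script invoked as
./script.rexx with arguments a and b would come out as
myrexx -v -f ./script.rexx -lmylib a b.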

As to why I think the code is ugly, consider the 32 char buffer which
limits you to 29 characters for shell plus one arg, unless the byte
following the ex_shell[] array happens to be '\0', in which case you
get 30 characters, but maybe only part of your shell name or argument
with no error.  Besides, it uses a goto. :-) I know there's lots of
ugly code in everybody's kernel but ``everybody's doing it'' is rarely
if ever a good reason for anything.
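
For contrast, here is a sketch of a bounded parse that reports an
over-long line instead of silently truncating it.  This is not the code
in kern_exec.c; parse_shebang() and its return values are made up for
illustration, and SHSIZE is simply set to the 32 bytes discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHSIZE 32  /* header bytes examined; the size discussed above */

/*
 * Parse "#!shell [arg]\n" from the first `len` bytes of a file.
 * `shell` and `arg` must each have room for SHSIZE bytes.
 * Returns 0 on success, -1 if this is not a usable #! line, and -2 if
 * the line does not end within SHSIZE bytes (reported, not truncated).
 */
int parse_shebang(const char *hdr, size_t len, char *shell, char *arg)
{
    const char *p, *start, *end;

    assert(hdr != NULL && shell != NULL && arg != NULL);
    if (len > SHSIZE)
        len = SHSIZE;
    if (len < 3 || hdr[0] != '#' || hdr[1] != '!')
        return -1;
    end = memchr(hdr + 2, '\n', len - 2);
    if (end == NULL)
        return -2;              /* no newline in the first SHSIZE bytes */

    p = hdr + 2;
    while (p < end && (*p == ' ' || *p == '\t'))
        p++;
    start = p;
    while (p < end && *p != ' ' && *p != '\t')
        p++;
    if (p == start)
        return -1;              /* no interpreter named */
    memcpy(shell, start, (size_t)(p - start));
    shell[p - start] = '\0';

    while (p < end && (*p == ' ' || *p == '\t'))
        p++;
    /* Whatever remains, trailing blanks trimmed, is the single argument. */
    start = p;
    p = end;
    while (p > start && (p[-1] == ' ' || p[-1] == '\t'))
        p--;
    memcpy(arg, start, (size_t)(p - start));
    arg[p - start] = '\0';
    return 0;
}
```

With "#!" and the newline taking 3 of the 32 bytes, the longest shell
plus argument this accepts is 29 characters; one more and the caller
gets -2 rather than a shortened name.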

There are valuable hacks, useful hacks, ugly hacks, even brilliant
hacks, but they're all hacks.  I'm not against ``good'' hacks, but I do
like to recognize them for what they are.

[ Flameproof suit off ]

|It gets made in exactly
|one place (the kernel) instead of many others (every command that might
|include a library module which executes another command) and brings with
|it certain (dubious) advantages (like set-UID scripts ...)

I also think it should be made in exactly one place (the library).
Since, as both of you have noted, it is quite small, it does not
``bloat'' every command that might want to execute another command in
pre-shared library systems, and with shared libraries, takes exactly
as much space as it does in the kernel.
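
As a sketch of what the library version might look like: try the exec,
and only if the kernel answers ENOEXEC read the first line and exec the
interpreter it names.  The names lib_execve() and build_script_argv()
and both limits are invented here, and splitting the line into several
arguments is a choice made for the sketch, not a description of any
existing library.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define INTERP_LINE_MAX 1024  /* arbitrary; no need to stop at 32 bytes */
#define INTERP_ARGS_MAX 64    /* arbitrary */

/*
 * Split "#! interp args..." in place and build
 *     interp args... path argv[1] argv[2] ...
 * in out[].  Returns the entry count, or -1 if the line is not a #!
 * line, names no interpreter, or does not fit in out[].
 */
int build_script_argv(char *line, const char *path, char *const argv[],
                      char *out[], int max_out)
{
    int n = 0;
    char *tok;

    assert(line != NULL && path != NULL && argv != NULL && out != NULL);
    if (line[0] != '#' || line[1] != '!')
        return -1;
    line[strcspn(line, "\n")] = '\0';   /* keep only the first line */

    for (tok = strtok(line + 2, " \t"); tok != NULL;
         tok = strtok(NULL, " \t")) {
        if (n >= max_out - 1)
            return -1;
        out[n++] = tok;
    }
    if (n == 0 || n >= max_out - 1)
        return -1;
    out[n++] = (char *)path;            /* the script itself; never modified */
    for (int i = 1; argv[0] != NULL && argv[i] != NULL; i++) {
        if (n >= max_out - 1)
            return -1;
        out[n++] = argv[i];
    }
    out[n] = NULL;
    return n;
}

/* Like execve(), but falls back to the #! line on ENOEXEC. */
int lib_execve(const char *path, char *const argv[], char *const envp[])
{
    char line[INTERP_LINE_MAX];
    char *new_argv[INTERP_ARGS_MAX];
    ssize_t got;
    int fd;

    execve(path, argv, envp);           /* returns only on failure */
    if (errno != ENOEXEC)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    got = read(fd, line, sizeof line - 1);
    close(fd);
    if (got < 2) {
        errno = ENOEXEC;
        return -1;
    }
    line[got] = '\0';

    if (build_script_argv(line, path, argv, new_argv, INTERP_ARGS_MAX) < 0) {
        errno = ENOEXEC;
        return -1;
    }
    return execve(new_argv[0], new_argv, envp);
}
```

Note that the open() here needs read permission on the script, and
nothing in this path honors setuid bits; both cases would have to go
through a separate helper.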

Stdio adds much more bulk, including code to format floating point
numbers, into every program that uses printf(3).  I don't think anyone
would suggest that it belongs in the kernel to avoid ``bloating''
applications.

|> [ getting around setuid scripts with an auxiliary program ]
|
|This points directly to why it should be handled in the kernel.  We know
|exactly how to execute shell scripts, it isn't that hard, and we can
|do it right in the kernel with 60 lousey little lines of code.  [ Plus a
|few to close the set-UID holes if you really insist on set-UID scripts ]

Above you noted that this was a ``dubious'' advantage.  Also, to my
knowledge, the holes exist, and there's nothing a sysadmin can do
about them.  That is, not only is it dubious, it's unavoidable.  With an
auxiliary program to run setuid scripts, the situation is under
control of the sysadmin.  Statement of Religion: New features,
particularly those of dubious merit and/or with security concerns,
should be optional.

Actually, I'm not convinced a few lines would close the security
holes.  They would, I presume, fix any problems in the kernel (I've
never really looked at what these might be), but some of the problems
are related to the fact that scripts are difficult to control.  Not
only must you have faith in the interpreter, but in every program it
invokes (this is less of an issue in largely self-contained
interpreters such as awk and perl than in ones such as sh and csh).
This, however, is getting off topic (again :-) ).

I'll give you some more ammunition, though:  with the kernel
implementation you can exec a script that you don't have read
permission on.  This is, presumably, a fairly rare case, since in order
for the *interpreter* to open and read the script, it would need
appropriate permissions.  If you really wanted to handle this case in
the library, you could presumably use the same auxiliary program that
handles setuid scripts.

To reiterate, I know of no *clear* reason why #! should be in the
kernel.  It needn't be there for transparency (put it in the library),
or for size considerations (it's not big), or even for the
far-from-clear need for setuid scripts.  I'd be happy to hear other new
reasons, or carry on ranting by email if anyone is interested.