Bourne-Shell and fork() (was Re: K-shell variables & Do-loops)

Thu Feb 22 01:35:14 AEST 1990

In article <720013 at hpcljws.HP.COM> jws at hpcljws.HP.COM (John Stafford) writes:
>And beware, if you redirect the input or output of the loop as a whole, you
>won't be able to get the variables out at all (as the loop will be executed
>by a child shell).

Note that this is the regular behaviour of the Bourne Shell too
(not only "ksh"). As a general strategy, the Bourne Shell seems to
avoid forking as long as "it doesn't get too complicated without a
fork". Forking on redirection is found on some "older" constructs
(like {}-command grouping and loops) but not on some "newer" ones
(like I/O-redirections on internally executed commands and shell
functions).

This can be confusing sometimes. Consider the following situations:

( cd somewhere; morestuff ) # sh forks, working directory doesn't change
morecommands                # for process executing morecommands

( cd somewhere; morestuff ) > whatever # same as above, working directory
morecommands                # doesn't change .....

{ cd somewhere; morestuff;} # sh doesn't fork, working directory *changes*
morecommands                # for process executing morecommands

{ cd somewhere; morestuff;} > whatever # sh now forks!! working directory
morecommands                # doesn't change for process executing morecommands
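To check what a particular shell does with these four cases, a small probe along the following lines may help. It is only a sketch: the directories /tmp and /, the helper name "check" and the r_* variables are arbitrary choices for illustration, and the answer for the last case depends on the shell, so later shells may not match the behaviour described above.

```shell
# Start from a known directory so a change is easy to detect.
cd /tmp || exit 1
start=`pwd`

# check LABEL: report whether the calling shell's directory moved,
# then go back to the starting directory for the next probe.
check() {
    if [ "`pwd`" = "$start" ]; then
        result=unchanged
    else
        result=changed
    fi
    cd "$start"
    echo "$1: working directory $result"
}

( cd /; : )
check "( ... )";          r_paren=$result

( cd /; : ) > /dev/null
check "( ... ) > file";   r_paren_redir=$result

{ cd /; :; }
check "{ ... }";          r_brace=$result

{ cd /; :; } > /dev/null
check "{ ... } > file";   r_brace_redir=$result
```

"unchanged" means the group ran in a child process; "changed" means it ran in the calling shell itself.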

So be careful! Adding redirection to some shell constructs can change
the semantics of these constructs, because the shell then forks for
something it would do without a fork if you omit the I/O-redirection.
There are some other pitfalls with forking vs. non-forking of a new
process by the shell. Consider the following:

v="initial value"
v="new value" cmd # v is set to "new value" only in the environment of cmd
nextcommand $v	  # and $v expands to "initial value" here

This is well-known behaviour. But what if cmd is executed internally?
I think you guessed it -- now $v expands to "new value" when executing
nextcommand.  The following fragment of a script makes use of this:

x=external
x=internal pwd >/dev/null
echo "pwd is an $x command in this version of the shell"

Finally let's come to shell functions (which are, of course,
executed internally). What do you think will happen in case of:

f() { somestuff $v; }
v="initial"
v="new" f

You may choose among these possibilities for the value of $v
when/after executing f: (1) initial/initial, (2) initial/new,
(3) new/initial and (4) new/new. What is your guess?
If "f" were a true internal command the answer would be (2).
If "f" were an external script the answer would be (3).
IMHO also (4) would make some sense, because functions are
executed internally and "v" may be set to the new value prior
to the execution of "f".
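One way to find out which case a given shell picks is to record the value seen inside the function and compare it with the value afterwards. This is just a sketch; the variable names "inside" and "after" are made up for it, and no particular result is assumed, since the answer differs between shells.

```shell
# The function body only records what it sees in $v.
f() { inside=$v; }

v="initial"
v="new" f        # assignment prefixed to a function call
after=$v         # value visible to the caller once f has returned

echo "when executing f: $inside, after executing f: $after"
```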

The implementors of shell functions have chosen (1) - don't ask
me why. Of course, problems only occur in rare situations, but I
would have appreciated a more 'predictable' approach. Interestingly
enough, the shell doesn't fork when it executes a function - even
if you redirect I/O for the function. So you can put a loop in a
shell function and redirect I/O for the function call if you need
to get variables out of the loop body.
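A minimal sketch of that workaround: the loop lives inside a function, and the redirection is applied to the function call rather than to the loop itself. The function name "count_lines", the variable "n" and the scratch file are invented for the example, and the $(( )) arithmetic assumes a POSIX-style shell (an older Bourne shell would need expr instead).

```shell
# Scratch input file with three lines (name chosen for this example only).
tmp=/tmp/lines.$$
printf 'one\ntwo\nthree\n' > "$tmp"

# The loop reads standard input and counts lines in the variable n.
count_lines() {
    n=0
    while read line; do
        n=$((n + 1))
    done
}

# Redirect the function call, not the loop, so n is set in this shell.
count_lines < "$tmp"
rm -f "$tmp"

echo "lines read: $n"
```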

BTW: I would not be surprised if the behaviour of {}-command grouping
and loops with redirection changes in the future ... so don't
depend on it.
-- 
Martin Weitzel, email: martin at mwtech.UUCP, voice: 49-(0)6151-6 56 83