file descriptor vs. file pointer closing

Lee Carver lee at ssc-vax.UUCP
Sat Aug 13 09:49:43 AEST 1988


Why should closing a file descriptor necessarily close the file
pointer?  Especially when there is more than one file descriptor
associated with the file pointer.  The following is ~180 lines of
discussion.

--- The plan

I was trying to build a nice, prompting, validating input reader for
adding to shell scripts.  The idea was to run this program (we'll call
it readtkn), and have it read and validate the user's input, then
write the result to stdout.  I managed to build something more useful
than not.

Typical usage might be:
   
   # this file is 'demo'
   set src=`readtkn 'Enter source file > ' opt opt opt`
   set dst=`readtkn 'Enter destination > ' opt opt opt`
   cp $src $dst

Obviously, expand this to your heart's content.  Presumably, the
options in the above example constrain the user's input to valid file
names, etc.

--- The problem

Unfortunately, we ran into serious problems during testing, and any
other time that a script using readtkn is sent a file of responses
instead of reading the user's terminal.  We test our package by
running the scripts with known answers, and verifying the expected
behavior, along the lines of:

   # this file is 'test'
   demo << EOF
   model
   /tmp/model
   EOF
   diff model /tmp/model

The second execution of readtkn finds an end of file.

It seems that when the first execution of readtkn terminates, it
closes the file descriptor (exit semantics).  Thus, when the second
readtkn runs, it is handed the file descriptor of a closed file, and
reads EOF.

My understanding of the UNIX process and file structure tells me that
each file descriptor is associated with a file pointer.  When readtkn
is run (by fork), it creates a new file descriptor to the original
file pointer.  Thus, if either the child (readtkn) or the parent
(shell/demo) changes the file pointer (seek, read, etc.), the other one
is affected.
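
A small experiment can check this picture.  The following is only a
sketch, assuming POSIX-style fork, read, and waitpid; the function
name parent_byte_after_child_read is made up for illustration.  The
child reads one byte through the inherited descriptor and exits, then
the parent reads one byte through its own descriptor.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: the child reads one byte from fd and exits
 * (which closes the child's copy of fd); the parent then reads one
 * byte from the same fd.  Returns the parent's byte, or -1 on error
 * or end of file.
 */
int parent_byte_after_child_read(int fd)
{
    char c;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: consume one byte through the inherited descriptor. */
        ssize_t n = read(fd, &c, 1);
        _exit(n == 1 ? 0 : 1);   /* exiting closes the child's descriptors */
    }

    /* Parent: wait, so the child's read and exit are finished first. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* Parent: read through its own descriptor. */
    if (read(fd, &c, 1) != 1)
        return -1;
    return (unsigned char)c;
}
```

Called on a descriptor positioned at the start of a file containing
AB, a return value of 'B' would mean that the child's read moved the
shared file pointer and that the parent's descriptor was still usable
after the child exited.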

The problem is the close, which is called automatically for every open
file descriptor on exit.  The close eliminates the file pointer, even
though there are two file descriptors associated with it.

It seems to me that this should not be so.  The file pointer that the
first readtkn closes is shared by a file descriptor in the shell.  Now
it must close its descriptor/connection, but why should it cause the
file pointer to be closed as well?

In fact, one might be inclined to argue that the file pointer is
"owned" by the parent, not the child.  So what is the child doing
closing the file pointer that it does not own?

Why does this work at all if stdin is /dev/tty?  Apparently, the shell
reopens stdin if it is closed by a child process, but I'm not sure.

--- The proposals

Clearly, there are programs that rely on these semantics.  So we
cannot change the semantics of close, at least in the normal case.
That eliminates the proposal that only the "owner" of the file pointer
can close the file pointer.

The next alternative is a new fcntl option to mark a file descriptor as
"don't close on exit".  This is somewhat similar to the F_SETFD option
to "close on exec".  Personally, I don't like this, since I'm not sure
I could manage it.
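
For reference, the existing close-on-exec flag is set roughly like
this.  It is a sketch assuming the POSIX fcntl interface with
FD_CLOEXEC; set_close_on_exec is a name invented here.  The proposed
"don't close on exit" flag does not exist, so it is not shown.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: mark fd so that it is closed automatically
 * when the process calls one of the exec functions.
 * Returns 0 on success, -1 on error.
 */
int set_close_on_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);   /* current descriptor flags */

    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0)
        return -1;
    return 0;
}
```

Note that this flag belongs to the file descriptor, not to the file
pointer, so setting it in one process does not affect another
process's descriptor for the same file.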

My feeling is that a "new" close call should be provided
(disconnect?).  The semantics of disconnect would be to close the file
pointer only when the last file descriptor is disconnected.  An
equivalent variation would be an fcntl option that activates these
semantics on close.
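
For comparison, here is a sketch that uses dup to get two descriptors
for one file pointer inside a single process, closes one of them, and
then reads through the other.  It assumes POSIX-style dup, read, and
close; byte_after_closing_dup is a hypothetical name.  It only shows
what a given system does with an ordinary close, and says nothing
about how a disconnect call would be implemented.

```c
#include <unistd.h>

/*
 * Hypothetical helper: read one byte through a duplicate of fd,
 * close the duplicate, then read one byte through fd itself.
 * Returns the second byte, or -1 on error or end of file.
 */
int byte_after_closing_dup(int fd)
{
    char c;
    int copy = dup(fd);            /* second descriptor, same file pointer */

    if (copy < 0)
        return -1;

    if (read(copy, &c, 1) != 1) {  /* moves the shared file pointer */
        close(copy);
        return -1;
    }
    if (close(copy) < 0)           /* drop one of the two descriptors */
        return -1;

    if (read(fd, &c, 1) != 1)      /* try the remaining descriptor */
        return -1;
    return (unsigned char)c;
}
```

If this returns the second byte of the input, then close on that
system already waits for the last descriptor before giving up the
file pointer, at least for descriptors made with dup.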

A problem with these proposals is the interaction with stdio.  If the
input actually read by readtkn is less than the amount pre-fetched by
stdio, we need to clean up.  The un-read but pre-fetched bytes need to
be restored to the file.  Perhaps an lseek to the last delivered byte
is all that is needed.
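
The lseek idea might look like the following.  This is a sketch, not
a tested fix.  It assumes that ftello reports the position of the
last byte delivered to the caller (not the last byte pre-fetched),
and that the input is a seekable file rather than a pipe or a
terminal.  The name restore_unread is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for ftello and fileno */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: move the underlying file pointer back to the
 * last byte that stdio actually delivered, so bytes that were
 * pre-fetched but never read are left for the next reader.
 * Returns 0 on success, -1 if the stream is not seekable.
 */
int restore_unread(FILE *fp)
{
    off_t delivered = ftello(fp);   /* logical position of the stream */

    if (delivered == (off_t)-1)
        return -1;
    if (lseek(fileno(fp), delivered, SEEK_SET) == (off_t)-1)
        return -1;
    return 0;
}
```

readtkn would call restore_unread(stdin) just before exit.  On a pipe
or a terminal the call fails and returns -1, so it does not help
there.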

Hopefully I'm missing something obvious.  If not, or you have a better
idea, send me mail.

--- The workaround

On our system at least (see disclosure), we were able to get things to
work by wrapping a shell script around readtkn in the following style:

     # this file is 'readtkn', a wrapper for the program readtkn.exe
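     if test -n "$AUTOMATED"; then
        # scripted run: hand readtkn.exe exactly one line of input
        read scrap
        echo "$scrap" | readtkn.exe "$@"
     else
        # interactive run: let readtkn.exe read the terminal directly
        readtkn.exe "$@"
     fi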

This does work, but seems quite inelegant.  Also, it limits you to
full lines in readtkn (no single character reads).  Also, you have to
have explicit knowledge of nesting because of the AUTOMATED variable.
Without that, interactive users don't get their prompts until after
they supply the correct answer (sigh).

--- The disclosure

This was actually done in full flower on an Apollo system with their
proprietary Aegis operating system, and proprietary "/com/sh" shell.
After complaining to them about this weirdness, I discovered that it
also happens on un-adulterated UNIX (well, BSD 4.3).  So, if the
problem statement isn't exactly UNIX-ese, I'm sorry.

These sample programs have been run, with the indicated results, on
ssc-bee, a BSD 4.3 VAX-11/785.

My plan is to tell Apollo how I'd like to see it fixed.  Since it
seems broken on UNIX too, maybe we can all benefit.

--- The details

So, now we conclude with the actual file sources.  The file names
should be clear.  The body of each file is indented three spaces, and
each file is terminated with the line ' *** EOF ***'.  

test driver script:
   sample << EOF
   AB
   EOF
   *** EOF ***
 
sample, the script called from above:
   readtkn
   readtkn
   *** EOF ***

results of running the test driver:
   A
   --- END OF FILE ---
   *** EOF ***
 
desired results, if stdin stayed open:
   A
   B
   *** EOF ***
 
readtkn.c, the source of readtkn:
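   #include <stdio.h>
   #include <stdlib.h>   /* exit */
   
   /* Read one character from stdin and echo it on a line by itself. */
   int main ( int argc, char **argv )
   {  int ch;
   
      ch = getchar ();   /* stdio may pre-fetch more than one byte here */
   
      if ( ch == EOF )
         puts ( "--- END OF FILE ---" );
      else {
         putchar ( ch );
         putchar ( '\n' );
         }
   
      exit (0);
      }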
   *** EOF ***

--- The signature

Please mail me your comments and suggestions.  I'll summarize what
comes in.  Thanks.

Lee Carver
Boeing Aerospace

csnet:  lcarver at boeing.com
uucp:   {...}!uw-beaver!ssc-vax!ssc-bee!lee



More information about the Comp.unix.wizards mailing list