Binary I/O on stdin/stdout?

Rahul Dhesi dhesi at bsu-cs.UUCP
Thu Mar 31 00:35:57 AEST 1988


In article <3221 at haddock.ISC.COM> karl at haddock.ima.isc.com (Karl Heuer) writes:
[re mode change]
>(Where do MSDOS and VMS fall?)

Since MS-DOS was designed by obtaining a weighted harmonic mean of CP/M
and UNIX, it behaves as follows.

In UNIX tradition, the operating system itself does not distinguish
between text and binary files.  In the tradition of CP/M, however, each
line in a text file is terminated by a CR LF sequence, but this
convention is enforced by programs that use text files (including
utilities supplied with MS-DOS) and not by the file system.

When a C program reads a text file, the runtime library routines strip
out any CR characters, thus reading the same sequence of characters
that would be read on a UNIX system.  When writing a text file, a CR
character is added before each LF is written.  (Note:  The correct way
to perform the newline conversion would be to use a state machine.
Then isolated CR characters, which occur in files that include
overprinting controls for line printers, would not be stripped out.  It
is not clear if any runtime library actually goes to this trouble;
seeks would disturb any state information maintained anyway.)
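
A minimal sketch of the state machine mentioned above, for the read
side only.  The names crlf_state, crlf_decode and crlf_finish are made
up for illustration and do not come from any actual runtime library.
A CR is held back until the next character shows whether it was half
of a CR LF pair or an isolated CR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decoder state; not taken from any real C library. */
typedef struct {
    int pending_cr;   /* nonzero: a CR was read but not yet emitted */
} crlf_state;

/*
 * Translate n bytes from in[] to out[], turning CR LF into LF and
 * leaving isolated CR characters alone.  out[] must have room for
 * n + 1 bytes (one extra for a CR held over from an earlier call).
 * Returns the number of bytes written.
 */
size_t crlf_decode(crlf_state *st, const char *in, size_t n, char *out)
{
    size_t i, w = 0;

    assert(st != NULL && in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        char c = in[i];

        if (st->pending_cr) {
            st->pending_cr = 0;
            if (c != '\n')
                out[w++] = '\r';   /* the held CR was isolated: keep it */
            /* otherwise the CR is dropped and the LF is emitted below */
        }
        if (c == '\r')
            st->pending_cr = 1;    /* decide when the next byte arrives */
        else
            out[w++] = c;
    }
    return w;
}

/* Call at end of file: emits a CR that was still being held, if any. */
size_t crlf_finish(crlf_state *st, char *out)
{
    assert(st != NULL && out != NULL);
    if (st->pending_cr) {
        st->pending_cr = 0;
        out[0] = '\r';
        return 1;
    }
    return 0;
}
```

Because the pending CR lives in the state structure, a CR LF pair that
straddles two buffers is still handled.  After a seek the caller would
have to reset pending_cr to zero, which is exactly the problem noted
above.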

Thus the mode of a file (whether text or binary) exists only as a mode
bit in the runtime library of a C program.  The file mode is therefore
relevant only to a C program and not, for example, to an assembly
language program or a BASIC program.  Common C libraries supply
a macro or a function that will change between text and binary
modes for open files.
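
As a rough illustration of such a mode change, here is a sketch
assuming a library that provides setmode() in <io.h> and O_BINARY in
<fcntl.h>, as several MS-DOS compilers do.  The exact names and
headers vary between compilers, and the wrapper make_binary() and the
preprocessor tests used here are my own assumptions.

```c
#include <stdio.h>

#if defined(__MSDOS__) || defined(MSDOS)
#include <fcntl.h>   /* O_BINARY (assumed to be provided here) */
#include <io.h>      /* setmode() (assumed to be provided here) */
#endif

/*
 * Hypothetical wrapper: switch an already open stream to binary mode.
 * Returns 0 on success and -1 on failure.
 */
int make_binary(FILE *fp)
{
#if defined(__MSDOS__) || defined(MSDOS)
    /* The mode bit lives in the C runtime, not in the file system. */
    return setmode(fileno(fp), O_BINARY) == -1 ? -1 : 0;
#else
    (void)fp;        /* UNIX-style systems: nothing to change */
    return 0;
#endif
}
```

A filter that copies arbitrary bytes would call make_binary(stdin) and
make_binary(stdout) before doing any I/O on them.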

Oh, by the way, many programs that were originally written for CP/M and
ported to MS-DOS use a control Z character in text files to mark
end-of-file.  MS-DOS utilities will recognize the control Z and do
weird things with it, but they do not themselves write a control Z to
mark end of file, since MS-DOS, like UNIX, takes the simple (yet
unusual) point of view that a file ends wherever it ends.  The MS-DOS
console driver, if operating in cooked mode, freaks out when a control
Z is written to the console, and refuses to write any more until the
EOF condition is cleared.  (Can you imagine, a console driver with an
EOF condition occurring during *output* to a video terminal?)  This is
a bug, though it was probably originally meant to be a feature that
would prevent trailing garbage from appearing on the screen when an old
format text file was typed.
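
A program that has to read such old-format files can find the logical
end of the text itself.  The following is only a sketch; the helper
name text_length() is invented for this example.

```c
#include <assert.h>
#include <stddef.h>

#define CTRL_Z 0x1A   /* the CP/M end-of-text marker */

/*
 * Hypothetical helper: return the number of bytes in buf[0..n-1] that
 * precede the first control Z, or n if there is no control Z at all.
 */
size_t text_length(const char *buf, size_t n)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < n; i++)
        if (buf[i] == CTRL_Z)
            return i;   /* anything from here on is padding */
    return n;           /* the file simply ends where it ends */
}
```

The caller reads the file in binary mode, then treats only the first
text_length(buf, n) bytes as text.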

VMS C does not normally need any mode change for open files.
The default type of a file opened from a C program is stream-LF, which
uses records terminated by linefeeds, and does not distinguish between
text and binary formats at all, acting like UNIX and POSIX files.
Unfortunately, when a C program is executed, stdout is by default not a
stream-LF file, and it's not clear that this can be changed by the
user.  Since I/O redirection under VMS is rather painful, it will
seldom be the case that a program will want to do a seek on its
standard output anyway.  Even when the standard output of a C program
is redirected to a file, that file is opened for output in some weird
mode which is not stream-LF.  (I consider this a bug.)  The type of
this file cannot be changed while it is open.  This is why problems
occur, and presumably why running my "Hello world" program under batch
caused strange behavior.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi


