getchar and EOF (was: One more point regarding = and == (more flamage))

Steve Summit scs at adam.mit.edu
Sun Apr 7 16:40:03 AEST 1991


In article <1991Apr4.215605.2801 at syssoft.com> tom at ssi.UUCP (Rodentia) writes:
>In article <3465 at litchi.bbn.com> rsalz at bbn.com (Rich Salz) writes:
>>In <3555 at inews.intel.com> bhoughto at hopi.intel.com (Blair P. Houghton) writes:
>>>toupper.c:    while ( (int) (c = getchar()) != EOF )
>>The cast implies that c is char.  If so, this line is buggy.
>Does this mean that there if c is char, there is no way to assign the
>getchar and test it for EOF without having it cast down to char?

Yes.

Proper use of getchar is (as it ought to be) simple.
Any recent confusion is an unfortunate but inevitable result
of a particularly absurd line of discussion.

getchar can return any char value, plus the single, "out of band"
value EOF [note 1].  A variable of type char cannot hold
any-char-value-or-EOF, so getchar() is specified to return an
int, and any variable used to hold its return value must be
declared as an int as well.

If getchar's return value is assigned to a variable of type char,
or otherwise cast to char, EOF becomes indistinguishable from
some valid char value, usually '\377'.  Mapping two values onto
one is an information-losing transformation, so no amount of
casting back to int, after the fact, can restore the lost
information (i.e. distinguish EOF from that other char value).
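
The collapse can be seen directly.  What follows is only a
sketch: collides_with_eof is a made-up helper name, and the
behavior described after it assumes the usual case of 8-bit
chars, EOF defined as -1, and a compiler that wraps out-of-range
values when converting to a signed char.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns nonzero if the value v, as getchar
 * would return it, becomes equal to EOF once both have been
 * squeezed into a plain char.
 */
int collides_with_eof(int v)
{
    char from_v   = (char) v;      /* the information-losing step */
    char from_eof = (char) EOF;

    /* Casting back to int cannot separate them again. */
    return (int) from_v == (int) from_eof;
}
```

As ints, 0377 and EOF are different values, yet under the
assumptions above collides_with_eof(0377) is nonzero, while
collides_with_eof('a') is zero.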

The simple rule is, always use int variables to hold getchar's
return value.  By doing so, you almost never have to worry (or
even think) about this issue.
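
For concreteness, here is roughly what the rule looks like in a
toupper-style filter.  This is a sketch, not anyone's actual
toupper.c: copy_upper is a hypothetical name, and it takes FILE *
arguments and calls getc (getchar() is equivalent to getc(stdin))
only so that it is not tied to standard input.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Copy in to out, upper-casing along the way.  Returns the number
 * of characters copied.
 */
long copy_upper(FILE *in, FILE *out)
{
    int c;          /* int, not char: holds any char value or EOF */
    long n = 0;

    while ((c = getc(in)) != EOF) {
        putc(toupper(c), out);
        n++;
    }
    return n;
}
```

Because c is an int, no cast is needed in the loop test, and a
'\377' character in the input is copied like any other instead
of being mistaken for end-of-file.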

                                            Steve Summit
                                            scs at adam.mit.edu

[note 1]  As a return value from getchar, EOF is guaranteed to
be distinct from all char values.  This is because getchar
returns normal characters (as the ANSI C standard explicitly
requires; see sec. 4.9.7.1) as unsigned char values converted to
int (i.e. as nonnegative values, even on a machine on which chars
are usually signed), while EOF is always a negative value.

I was going to say that "EOF is guaranteed not to compare equal
to any char value," but this is not really true.  If you have

	signed char c = '\377';

and EOF is -1, then c == EOF will succeed.  ("signed" is a new
ANSI C keyword; the test also succeeds if c is a "plain" char, on
machines for which plain chars are signed.)
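
The two views of the same character can be put side by side.
Again this is just a sketch with invented function names, and the
last result assumes 8-bit chars, EOF equal to -1, and a compiler
that turns 0377 into -1 when converting to signed char (the
conversion is implementation-defined).

```c
#include <stdio.h>

/* The character as getchar delivers it: unsigned char widened to int. */
int as_getchar_returns(unsigned char byte)
{
    return byte;            /* always in the range 0..UCHAR_MAX */
}

/* The same character after being stored in a signed char. */
int as_signed_char(unsigned char byte)
{
    signed char c = (signed char) byte;   /* 0377 typically becomes -1 */
    return c;
}
```

as_getchar_returns(0377) is 255 and can never equal EOF, whereas
under the assumptions above as_signed_char(0377) == EOF holds.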


