ANSI draft interpretation questions

Doug Gwyn gwyn at smoke.BRL.MIL
Tue Jan 9 06:51:34 AEST 1990


In article <21690 at mimsy.umd.edu> chris at mimsy.umd.edu (Chris Torek) writes:
>[me: 	(%n is a conversion, but is not an assignment; ...]
>In article <11897 at smoke.BRL.MIL> gwyn at smoke.BRL.MIL (Doug Gwyn) writes:
>>No, %n involves both a conversion and an assignment, but no input action.
>Except that it is not counted in the return value.  (Neither are
>suppressed assignments, but those are `suppressed assignments', not
>`assignments'.)  %n, then, is a conversion and an assignment, but cannot
>be called an assignment because it is not counted as an assignment.
>This is the sort of thing that causes confusion as to whether `%*n'
>suppresses the (already not counted as an assignment) assignment.

The Standard says, "... the fscanf function returns the number of input
items assigned ...".  %n does not correspond to an input item.  That %n
is not included in the returned assignment count is also stated
explicitly under the description of the n conversion specifier, just to
make sure that there is no question about this.  (Generally we avoided
redundancy in the specifications, but some was added to reduce apparent
ambiguity as evidenced by comments received during the public review.)

>Anyway, if the implementation of *scanf() uses lookahead to handle
>scanning, it needs at least three bytes of lookahead.  If it uses
>pushback (which is legal but not required), the implementation must
>provide at least its own three bytes of pushback plus one more.

Yup.  Some of us would have been happy to eliminate the non-string
forms of *scanf() as well as ungetc(), to avoid having stdio deal with
pushback.  Since only a limited amount of pushback is guaranteed, and
then is subject to rigid constraints, it really isn't very useful for
real-world tokenizers, macro processors, etc. which must implement
their own scheme anyway.  However, there was a strong minority that
insisted that ungetc() was important to them, and at the time we thought
it highly desirable to obtain unanimous approval for sending the draft
out for (the first) public review, so the notion of pushback remained
in the draft stdio specs.  Later there was little sentiment for
revisiting this issue.

>Incidentally, it is not clear to me whether the standard requires
>the following to work.  [example program omitted for brevity]

The Standard does guarantee that you can push back one character with
ungetc() ('h' in the example program).  As you have noticed, the old UNIX
implementation of stdio pushback does not conform to the Standard.
Perhaps you can get Dave Prosser or some other AT&T implementor to
explain how they dealt with this in SVR4, which is advertised as
Standard-conformant.

>At this point, there is probably no room in the input buffer to push
>back the `h'.

Definitely some additional "slop space" must be provided, somehow.
I've heard it said that two bytes of slop are required if fscanf()
uses getc()/ungetc(), although four seem to me to be necessary
(without thinking very hard about it).

>Then again, the description for `ungetc' does not indicate that any
>`getc' must be done in advance.  It says that one character of pushback
>is guaranteed.  Perhaps this is meant to imply that
>	FILE *foo = fopen("foo", "r");
>	if (foo == NULL) die();
>	(void) ungetc('a', foo);
>is guaranteed to push back an `a', so that the first getc(foo) returns
>'a'.

It is true that ungetc() need not be preceded by getc() (or any other
stdio input function).  Also, as shown, it is permissible to push back a
character even at the beginning of the stream (the file position
indicator, "fpi", becomes sort of indeterminate, as explained in the
Standard).  A POSIX-conforming implementation must treat text and binary
streams indistinguishably, which means that after reading back the
pushed-back character the fpi must again have the value 0.


