ANSI draft interpretation questions

Chris Torek chris at mimsy.umd.edu
Mon Jan 8 19:12:26 AEST 1990


[me: 	(%n is a conversion, but is not an assignment; ...]

In article <11897 at smoke.BRL.MIL> gwyn at smoke.BRL.MIL (Doug Gwyn) writes:
>No, %n involves both a conversion and an assignment, but no input action.

Except that it is not counted in the return value.  (Neither are
suppressed assignments, but those are `suppressed assignments', not
`assignments'.)  %n, then, is a conversion and an assignment, but cannot
be called an assignment because it is not counted as an assignment.
This is the sort of thing that causes confusion as to whether `%*n'
suppresses the (already not counted as an assignment) assignment.
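
To make the counting point concrete, here is a minimal sketch using sscanf().  The helper names scan_with_n and scan_with_suppression are made up for illustration.  It assumes an implementation that, as described above, leaves both `%n' and suppressed conversions out of the return value.

```c
#include <stdio.h>

/*
 * Hypothetical helper: run "%d%n" over the input, hand back the offset
 * stored by %n through n_out, and return whatever sscanf() returns.
 */
int scan_with_n(const char *input, int *n_out)
{
	int value;

	*n_out = -1;	/* stays -1 if the %n directive is never reached */
	return sscanf(input, "%d%n", &value, n_out);
}

/*
 * Hypothetical helper: the first integer is converted but suppressed,
 * so only the second one is stored (and counted).
 */
int scan_with_suppression(const char *input, int *second)
{
	return sscanf(input, "%*d %d", second);
}
```

With the input "42 abc" the first helper stores 2 through n_out yet returns 1, not 2: the store happened, but it was not counted.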

>>	If the input has the form
>>		<opt-sign>0x
>>		<opt-sign>0X
>>	and the conversion is either `i' or `x', the sign (if any) and
>>	the zero are consumed; the `x' or `X' remains unconsumed.

>Right (assuming that there is no hex digit immediately following the x).

Oops, I meant to say `<opt-sign>0<hex-indicator><non-hex-digit>'.

Anyway, if the implementation of *scanf() uses lookahead to handle
scanning, it needs at least three bytes of lookahead.  If it uses
pushback (which is legal but not required), the implementation must
provide at least its own three bytes of pushback plus one more for
the caller's guaranteed ungetc().
Mine uses a combination of lookahead and pushback: it looks at the
first remaining character in the buffer, and consumes it if it
appears to be valid.  If it later discovers that it was not, there
might be one character of lookahead around, and two characters
consumed that need to be pushed back; in this case, both are pushed
back, and the implementation further guarantees at least one more
pushback.
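
The three-character window can be sketched over an in-memory string, where lookahead is free.  This is only an illustration of the reading discussed above (sign and zero consumed, `x' left alone unless a hex digit follows); prefix_len is a hypothetical helper, not part of any library, and a real *scanf() working on a stream would need the lookahead or pushback described above to get the same effect.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: given the text at the start of an `i' or `x'
 * conversion, return how many characters make up the optional sign
 * and the hex prefix under the reading sketched above.
 */
size_t prefix_len(const char *s)
{
	size_t i = 0;

	if (s[i] == '+' || s[i] == '-')
		i++;			/* optional sign */
	if (s[i] != '0')
		return i;		/* no leading zero: nothing to decide */
	/* lookahead window: s[i] is '0', s[i+1] is x or X, s[i+2] is a hex digit? */
	if ((s[i + 1] == 'x' || s[i + 1] == 'X') &&
	    isxdigit((unsigned char)s[i + 2]))
		return i + 2;		/* `0x' is a real prefix: take both */
	return i + 1;			/* take the zero only; the `x' stays */
}
```

For "-0xg" this yields 2 (the sign and the zero), and for "+0x1f" it yields 3.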

Incidentally, it is not clear to me whether the standard requires
the following to work.  (The important line is marked with -> on the
left.)

	#include <stdio.h>
	#include "h_defs.h" /* for the HV_* return codes */

	/*
	 * Assume `stream' is open for reading and that the next few
	 * input characters are either `h<optional space><integer>' or
	 * perhaps `hello'.  If the input has the form `h<integer>',
	 * store the value through the given h_value pointer and return
	 * HV_WITHVALUE.  Leave *h_value unchanged otherwise.
	 *
	 * If there is an h, but it is not followed by a space or a digit,
	 * leave the h and what follows it unconsumed.
	 */
	int find_h_value(FILE *stream, int *h_value) {
		int c, v, n, r;

		c = getc(stream);
		if (c != 'h') {
			/* nb: ungetc(EOF) fails; this is desired */
			(void) ungetc(c, stream);
			return (HV_NO_H);	/* no `h' */
		}
		if ((r = fscanf(stream, " %n%d", &n, &v)) == EOF) {
			/* must have been an input failure: conk out */
			return (HV_H_WITH_EOF);
		}
		if (r == 1) {
			*h_value = v;
			return (HV_WITHVALUE);	/* got an h value */
		}
		/* r must be 0 */
		if (n == 0) {
			/* there was no white space: put back the `h' */
->			(void) ungetc('h', stream);
			return (HV_UNCHANGED);	/* input stream unchanged */
		}
		/* there were spaces, so we may not be able to put back
		   the `h'; return a code saying `keyword h found, followed
		   by something not an integer' */
		return (HV_H_WITH_UNKNOWN_TEXT);
	}


If n is zero, we know the fscanf() did not consume any characters.  The
implementation may therefore be required to allow the `h' to be pushed
back.  I am not sure.  Consider an implementation similar to the old Unix
one, however, in which a buffer is filled whenever a `getc' (or
equivalent) is done on an empty buffer.  Here we might have the following:

	A. buffer is nearly exhausted: it has one `h' left
	B. program does a `getc', which returns 'h': buffer now empty

(At this point, ungetc('h', stream) will work.)

	C. program calls fscanf() which calls __vfscanf(), which starts
	   the ` ' directive, needs to skip spaces, and therefore refills
	   the buffer
	D. __vfscanf() finds an `e' (from `hello', perhaps) and stops
	   skipping spaces
	E. __vfscanf() executes `%n' directive, which stores 0 in n
	F. __vfscanf() tries to execute `%d', finds an `e', and stops
	   with a matching failure (returns 0)

At this point, there is probably no room in the input buffer to push
back the `h'.
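
Steps A through F can be imitated with a toy buffered reader.  Everything here (struct toy_stream, the toy_* functions, the four-byte buffer) is invented for illustration and is not the old Unix stdio or any real library; the only assumption carried over from the scenario is that pushback succeeds only when there is room in front of the read pointer.

```c
#include <stddef.h>
#include <string.h>

#define TOY_BUFSIZ 4
#define TOY_EOF (-1)

struct toy_stream {
	const char *src;	/* source data not yet loaded into buf */
	size_t srclen;		/* bytes remaining at src */
	char buf[TOY_BUFSIZ];
	char *ptr;		/* next unread character in buf */
	size_t cnt;		/* unread characters left in buf */
};

void toy_open(struct toy_stream *t, const char *data)
{
	t->src = data;
	t->srclen = strlen(data);
	t->ptr = t->buf;
	t->cnt = 0;
}

/* Refill from the start of buf, discarding whatever was there. */
int toy_fill(struct toy_stream *t)
{
	size_t n = t->srclen < TOY_BUFSIZ ? t->srclen : TOY_BUFSIZ;

	if (n == 0)
		return TOY_EOF;
	memcpy(t->buf, t->src, n);
	t->src += n;
	t->srclen -= n;
	t->ptr = t->buf;
	t->cnt = n;
	return 0;
}

int toy_getc(struct toy_stream *t)
{
	if (t->cnt == 0 && toy_fill(t) == TOY_EOF)
		return TOY_EOF;
	t->cnt--;
	return (unsigned char)*t->ptr++;
}

/* Look at the next character without consuming it (may refill). */
int toy_peek(struct toy_stream *t)
{
	if (t->cnt == 0 && toy_fill(t) == TOY_EOF)
		return TOY_EOF;
	return (unsigned char)*t->ptr;
}

/* Pushback works only if there is room in front of the read pointer. */
int toy_ungetc(int c, struct toy_stream *t)
{
	if (c == TOY_EOF || t->ptr == t->buf)
		return TOY_EOF;
	*--t->ptr = (char)c;
	t->cnt++;
	return c;
}
```

Open the toy stream on "abchello" and read four characters: the buffer is then empty and toy_ungetc('h') succeeds, as in step B.  Read the `h' again, call toy_peek() to force the refill of step C, and the same toy_ungetc('h') now fails, because the read pointer sits at the very start of the freshly filled buffer.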

Then again, the description for `ungetc' does not indicate that any
`getc' must be done in advance.  It says that one character of pushback
is guaranteed.  Perhaps this is meant to imply that

	FILE *foo = fopen("foo", "r");
	if (foo == NULL) die();
	(void) ungetc('a', foo);

is guaranteed to push back an `a', so that the first getc(foo) returns
'a'.
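
One way to see what a given implementation does is a small probe along these lines.  The function name try_initial_ungetc is made up, and the result says only how one library behaves, not what the standard requires.

```c
#include <stdio.h>

/*
 * Hypothetical probe.  Returns 1 if an ungetc() done before any read
 * is honoured by the next getc(), 0 if ungetc() itself refuses, and
 * -1 if ungetc() claims success but getc() returns something else.
 */
int try_initial_ungetc(FILE *fp)
{
	if (ungetc('a', fp) == EOF)
		return 0;
	return getc(fp) == 'a' ? 1 : -1;
}
```

Call it on a freshly opened stream, before any other operation on that stream.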
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at cs.umd.edu	Path:	uunet!mimsy!chris


