modification of strings

Guy Harris guy at auspex.UUCP
Sat Feb 18 04:58:34 AEST 1989


>Rumour has it that sscanf modifies strings passed as a first argument
>on at least some machines (e.g. some suns?).

"Some" Suns?  Yeesh, "_doscan" isn't one of the machine-dependent
modules; the same source is used on *all* Suns.

In fact, the same source is used on a bunch of non-Sun machines as well;
the SunOS 3.2-3.5 version is based on the S5R2 version, the SunOS 4.0
version is based on the S5R3 version, and the version in SunOS releases
prior to 3.2 is based on the 4.2BSD version, which is probably based on
the V7 version.  The bug exists in S5 releases from AT&T, as well as
4.xBSD.

The problem is that "*scanf" - or, to be precise, "_doscan" and the
routines it calls, which are the "guts" of the "scanf" routines in many
implementations - uses "ungetc".  All very well and good when you're
doing I/O to a file; "ungetc" stuffs the ungotten character back into
the I/O buffer.  However, the way "sprintf" and "sscanf" work in many
(most?) UNIX C implementations is that they turn the string in question
into a "funny" I/O buffer; most "ungetc" implementations don't
understand this, and try to stuff the character back into the "buffer"
anyway, which means they try to modify the string.
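
To make the mechanism concrete, here is a minimal sketch of a string
dressed up as an I/O buffer, with a pushback routine that doesn't know the
difference.  The struct and all the "toy_" names are made up for
illustration; this is not the "_doscan" or stdio source from any release,
just a model of the behaviour described above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a stdio FILE; every name here is hypothetical. */
struct toy_file {
    char *base;    /* start of the "buffer" (really the caller's string) */
    char *ptr;     /* next character to hand out */
    int   cnt;     /* characters remaining */
    int   stores;  /* number of times we wrote into the buffer */
};

/* What an sscanf-style routine does: aim the buffer at the string. */
void toy_open_string(struct toy_file *f, char *s, int len)
{
    assert(f != NULL && s != NULL);
    f->base = f->ptr = s;
    f->cnt = len;
    f->stores = 0;
}

/* Read one character, or -1 at the end of the string. */
int toy_getc(struct toy_file *f)
{
    if (f->cnt <= 0)
        return -1;
    f->cnt--;
    return (unsigned char)*f->ptr++;
}

/* Naive pushback: always stores the character, even when it is the
   same one that was just read from that position. */
int toy_ungetc_naive(int c, struct toy_file *f)
{
    if (c == -1 || f->ptr == f->base)
        return -1;
    *--f->ptr = (char)c;   /* this store is the write into the caller's string */
    f->cnt++;
    f->stores++;
    return c;
}
```

With a writable array the store is harmless, since it writes back the byte
that was already there; if the string lives in read-only memory, the same
store could fault.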

>Well, it doesn't actually modify the contents,

Which, in this particular case, is, I think, true; the character being
stuffed back is a character that's just been "read" from the string.

>but the compiler doesn't know that.

It's not the compiler that has to know that; it's "ungetc".  In
"comp.bugs.4bsd" this very "sscanf" bug is being discussed; one
suggested fix is to have "ungetc" check whether the character it's
stuffing back is the same as the one already sitting at that position in
the buffer and, if so, just back up the buffer pointer and adjust the
count, without storing anything.
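
A fix along those lines might look roughly like the following sketch.  The
"str_stream" names are hypothetical, and refusing a mismatched character
(where a real "ungetc" on a file would go ahead and store it) is an
assumption made here so that the string can be declared "const".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical read-only string stream; not any real stdio structure. */
struct str_stream {
    const char *base;  /* start of the string */
    const char *ptr;   /* next character to hand out */
    int         cnt;   /* characters remaining */
};

void str_stream_open(struct str_stream *f, const char *s, int len)
{
    assert(f != NULL && s != NULL);
    f->base = f->ptr = s;
    f->cnt = len;
}

/* Read one character, or -1 at the end of the string. */
int str_stream_getc(struct str_stream *f)
{
    if (f->cnt <= 0)
        return -1;
    f->cnt--;
    return (unsigned char)*f->ptr++;
}

/* Pushback that never writes: if the character matches the one already
   in the buffer, just back up the pointer and bump the count. */
int str_stream_ungetc(int c, struct str_stream *f)
{
    if (c == -1 || f->ptr == f->base)
        return -1;
    if ((unsigned char)f->ptr[-1] != (unsigned char)c)
        return -1;     /* assumption: refuse rather than modify the string */
    f->ptr--;
    f->cnt++;
    return c;
}
```

Because nothing is ever stored, this version can be pointed at a string
literal without trouble.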
