C source character set

Martin Minow minow at mountn.dec.com
Tue Oct 3 06:29:54 AEST 1989


In article <1302 at gmdzi.UUCP> wittig at gmdzi.UUCP (Georg Wittig) writes:
>[1] There exist editors that allow you to enter any ASCII character. Consider
>    the following program fragment:
>
>		/* in the following lines let @ be the character '\0' */
>		int x;
>		x = 1 +	/* foo @ bar */
>		    2	/* */
>		    ;
This is probably a "quality of implementation" issue (because of NUL's
specific use in C to terminate strings).  A good implementation ought to
sweep out such characters (my opinion).  More interesting is whether the
'@' can stand for one of the national letters in the ISO Latin-1 alphabet
(these have values from 0xA0 to 0xFF).  Again, "good" implementations will
allow characters that aren't in the C source alphabet in comments, 'char'
constants, and "string" constants.
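
To make "sweep out" concrete, here is a minimal sketch of one way a
front end might drop NUL bytes from a source buffer before scanning it.
The helper name strip_nuls is hypothetical, and nothing here is required
by the standard; it is only one possible reading of the suggestion above.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from any real compiler): copy n bytes from
 * src to dst, dropping every NUL byte.  Returns the number of bytes kept.
 * dst may be the same buffer as src, since the write index never runs
 * ahead of the read index.
 */
size_t strip_nuls(char *dst, const char *src, size_t n)
{
    size_t i;
    size_t kept = 0;

    for (i = 0; i < n; i++) {
        if (src[i] != '\0')
            dst[kept++] = src[i];
    }
    return kept;
}
```

Working on a (pointer, length) pair instead of a C string is the point:
once a NUL can appear in the input, strlen() and friends no longer tell
you where the source text ends.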

>
>[2] Furthermore, there are (non-UNIX) operating systems that encode the end of
>    a source line by the number of bytes of that line instead of inserting a
>    newline character

fgets() should encode these lines as "string\n" -- how it would treat an
embedded \n is a quality of implementation issue.  I would suggest that
there should be no difference between an explicit \n and one generated
to signal an end-of-record.
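
A small sketch of what this looks like from the reader's side, assuming
the C library presents each record as text ending in '\n'.  The wrapper
name read_line is mine, for illustration only.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative wrapper around fgets().
 * Returns  1 if buf holds a complete line ending in '\n',
 *          0 if buf holds a partial line (buffer full, or the last
 *            line of the file had no newline),
 *         -1 on end-of-file or read error with nothing read.
 */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return -1;
    return strchr(buf, '\n') != NULL;
}
```

Note that the caller sees only the '\n'.  It cannot tell whether the
newline was stored in the file or generated by the library at a record
boundary, which is why I would treat the two cases the same.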

> I can think of at least 5
>    different ways to process such a crazy macro.

>[3] Line continuation by `\'

It may occur anywhere (ignoring trigraphs), even inside a token.  Thus "terribly_lon\
g_identifier" is a legal identifier.
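
As a sketch of the split identifier in context (the variable and the
function read_identifier are made up for the example): the backslash and
the newline that follows it are deleted before the text is split into
tokens, so the two halves are seen as a single name.

```c
/*
 * The declaration below is written across two physical lines.
 * After line splicing it reads:
 *     static int terribly_long_identifier = 42;
 * The continued line must start in column 1, or the leading
 * white space would end the identifier early.
 */
static int terribly_lon\
g_identifier = 42;

/* Refers to the same object by its unsplit name. */
int read_identifier(void)
{
    return terribly_long_identifier;
}
```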

Martin Minow
minow at thundr.dec.com


