C source character set

Georg Wittig wittig at gmdzi.UUCP
Tue Oct 3 01:31:58 AEST 1989


Maybe the following are RTFM questions, but I don't have the ANSI C papers,
and Harbison & Steele II doesn't seem to cover them ...

My questions are about the legal characters in a C source programme:

[1] There exist editors that allow you to enter any ASCII character. Consider
    the following program fragment:

		/* in the following lines let @ be the character '\0' */
		int x;
		x = 1 +	/* foo @ bar */
		    2	/* */
		    ;

    Is this program fragment equivalent to

	[a] ``int x; x = 1 + 2;''
	    In this case C compilers cannot use ``fgets'' to read the source
	    lines.

    or  [b] ``int x; x = 1 +   ;''
	    This will result in a syntax error message in later compiler
	    phases.

    What about a '\0' outside a C comment? Does it terminate the current line
    or must it be kept so that a syntax error message will be the result?

    What about a '\0' in a string constant?
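
    To make the concern under [a] concrete, here is a minimal sketch. It
    assumes a compiler front end that reads each source line with ``fgets''
    and then treats the buffer as an ordinary C string. The helper name
    fgets_visible_length and the use of tmpfile() are my own illustration
    and are not taken from any real compiler.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): write `len` raw bytes to a
 * temporary file, read the first line back with fgets, and return what
 * strlen() reports for the buffer.  Returns (size_t)-1 on I/O failure.
 */
size_t fgets_visible_length(const char *bytes, size_t len)
{
    char buf[128];
    size_t seen;
    FILE *fp;

    assert(bytes != NULL);

    fp = tmpfile();                 /* opened in binary update mode */
    if (fp == NULL)
        return (size_t)-1;

    if (fwrite(bytes, 1, len, fp) != len) {
        fclose(fp);
        return (size_t)-1;
    }
    rewind(fp);

    /* fgets stops at the newline, not at an embedded '\0' ... */
    if (fgets(buf, sizeof buf, fp) == NULL) {
        fclose(fp);
        return (size_t)-1;
    }
    fclose(fp);

    /* ... but strlen stops at the first '\0' it finds in the buffer. */
    seen = strlen(buf);
    return seen;
}
```

    For the six bytes ``ab\0cd\n'' I would expect this helper to report 2:
    the rest of the line is in the buffer, but a reader that relies on the
    terminating '\0' never sees it.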

[2] Furthermore, there are (non-UNIX) operating systems that encode the end of
    a source line by the number of bytes of that line instead of inserting a
    newline character (\x0a or \x0d in ASCII, \x15 in EBCDIC) at the end of
    that line.
    As an example, the line ``abc'' could be encoded as ``\3abc'', and not as
    ``abc\x0d''. In those environments ``[f]getc'' must generate an artificial
    '\n' character at the end of the line. Or am I mistaken?

    What if exactly this artificial '\n' is also a character of the line?
    What is a ``line'' in this context?

    Consider a (perverse looking) macro like the following:

			/* in the following line let @ be the character '\n' */
		#define X(a,b)	foo@#define X(a,b) ((a)+(b))
		i = X(27,38);

    Is this required to pass the preprocessor phase without an error message,
    and if so what is the output of that phase? I can think of at least 5
    different ways to process such a crazy macro.
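
    The situation I have in mind can be sketched as follows. This is only
    a guess at how a ``getc''-like routine might behave on such a system:
    it assumes a one-byte length prefix per record, as in the ``\3abc''
    example, and reads from memory instead of a real file. The names
    rec_reader, rec_open, rec_getc and REC_EOF are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define REC_EOF (-1)

/* Reader state for a buffer of length-prefixed records (assumed layout). */
struct rec_reader {
    const unsigned char *data;  /* all records, back to back        */
    size_t size;                /* total number of bytes in data    */
    size_t pos;                 /* index of the next unread byte    */
    size_t left;                /* payload bytes left in the record */
    int in_record;              /* nonzero while inside a record    */
};

void rec_open(struct rec_reader *r, const unsigned char *data, size_t size)
{
    assert(r != NULL && data != NULL);
    r->data = data;
    r->size = size;
    r->pos = 0;
    r->left = 0;
    r->in_record = 0;
}

/* Return the next character, inserting an artificial '\n' after each record. */
int rec_getc(struct rec_reader *r)
{
    assert(r != NULL);

    if (!r->in_record) {
        if (r->pos >= r->size)
            return REC_EOF;
        r->left = r->data[r->pos++];    /* one-byte length prefix */
        r->in_record = 1;
    }
    if (r->left == 0) {
        r->in_record = 0;
        return '\n';                    /* artificial end of line */
    }
    if (r->pos >= r->size)
        return REC_EOF;                 /* truncated record */
    r->left--;
    return r->data[r->pos++];
}
```

    For the records ``\3abc'' and ``\2de'' this yields ``abc\nde\n''. For a
    single record whose three payload bytes are 'a', '\n', 'b' it yields
    ``a\nb\n'', and the caller cannot tell the real '\n' from the artificial
    one.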

[3] Line continuation by `\': Does it only apply to #define contexts and string
    constant contexts, or is it a general rule? Example:

		int terrible_long_identifier;
		terrible_lon\
		g_identifier = 1;

    Does the assignment statement alter the value of that terribly long
    variable, or is it a syntax error (``terrible_lon'' and ``g_identifier''
    undeclared)?
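
    One possible reading is the ``general rule'': every `\' that is
    immediately followed by a newline is removed, together with that
    newline, before the text is split into tokens. A minimal sketch of that
    reading follows; the function name splice_lines is hypothetical, and I
    am not claiming that any particular compiler works this way.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src to dst, dropping every backslash that is immediately followed
 * by a newline, along with that newline.  dst must have room for
 * strlen(src) + 1 bytes.  Returns the number of characters written, not
 * counting the terminating '\0'.
 */
size_t splice_lines(char *dst, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);

    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;           /* skip the pair, joining the two lines */
        } else {
            dst[n++] = *src++;  /* ordinary character: copy it through  */
        }
    }
    dst[n] = '\0';
    return n;
}
```

    Under this reading the two lines of the example become the single line
    ``terrible_long_identifier = 1;'', so the assignment refers to the
    declared variable.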

Thanks in advance,
-- 
Georg Wittig   GMD-Z1.BI   P.O. Box 1240   D-5205 St. Augustin 1 (West Germany)
email: wittig at gmdzi.uucp   phone: (+49 2241) 14-2294
-------------------------------------------------------------------------------
"Freedom's just another word for nothing left to lose" (Kris Kristofferson)


