Trigraphs: Is sed sufficient? (was: Re: __STDC__ defined as zero a problem)

Marshall Cline cline at suntan.ece.clarkson.edu
Sat Jul 1 05:39:30 AEST 1989


In article <1989Jun27.164758.1379 at utzoo.uucp> henry at utzoo.uucp (Henry Spencer) writes:

>In article <2029 at dataio.Data-IO.COM> bright at dataio.Data-IO.COM (Walter Bright) writes:
>>	1. Trigraph support significantly slows down the scanner, which is
>>	   the most time-consuming part of a compiler. Trigraphs are useless,
>>	   and so are left out of the Useful C mode.

>It's not necessary for trigraphs to be in the scanner at all, provided the
>implementation supports them *somehow* (a sed script is what I'd use) for
>official conformance.

Ah TriGraphs.  Henry's comment about using "sed" is interesting.  But is it
true that trigraphs change the contents of string literals??.  Example:
"Is this a trigraph --> ??."  What is printed by:
		printf("Foo ??. bar ??; baz ??? barf ??$");
(I don't even know if the ".;?$" are valid endings for trigraphs, but you
get the idea...)
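
For what it's worth, ANSI C defines exactly nine trigraphs, and they are
replaced in the first translation phase, before any tokens (string literals
included) are recognized.  Below is a minimal sketch of that replacement in
C, under those assumptions.  The name replace_trigraphs is made up for this
illustration; it is not part of any library or compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Replace the nine ANSI C trigraphs in src, writing the result to dst.
 * dst must have room for strlen(src) + 1 bytes (the output never grows).
 * No attention is paid to quotes or comments: the replacement is purely
 * textual, as in translation phase 1.
 * replace_trigraphs is a hypothetical name used only for this sketch.
 */
void replace_trigraphs(const char *src, char *dst)
{
    /* Third character of each trigraph and the character it stands for. */
    static const char third[] = "=(/)'<!>-";
    static const char repl[]  = "#[\\]^{|}~";

    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        const char *hit = NULL;

        /* Only look up the third character if two '?' precede it. */
        if (src[0] == '?' && src[1] == '?' && src[2] != '\0')
            hit = strchr(third, src[2]);

        if (hit != NULL) {
            *dst++ = repl[hit - third];   /* emit the single replacement */
            src += 3;                     /* skip the whole trigraph */
        } else {
            *dst++ = *src++;              /* ordinary character, copy it */
        }
    }
    *dst = '\0';
}
```

Run over the printf argument above, this sketch returns the string
unchanged, since none of '.', ';', '?' or '$' is a valid third character.
A literal ending in ??! would, on the other hand, come out ending in |.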

If these nasty little fellers are gonna chomp down on my existing C code and
munge my string literals, I'd like to know about it!

But the real point of my posting is: "sed" is _only_ appropriate if trigraphs
are expanded _WHEREVER_ they appear (including inside strings, in char
literals, etc., etc.).  Otherwise the script would need to know what construct
each trigraph sits in, and the regular expression support in sed isn't
powerful enough to parse a Context Free Grammar such as the BNF _syntax_
for ANSI C.  Recall that parsing a CFG requires a Push-Down Automaton, which
is strictly more powerful than any Finite Automaton.  (_Semantic_ aspects such
as whether variable names are declared and/or are of compatible types are
issues which can't even be resolved by a PDA; they require at least a Context
Sensitive Grammar, and probably a full Turing Machine.)
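
To make the PDA point concrete, here is a minimal sketch in C of the
classic nesting problem: checking that brackets are properly matched.  The
explicit stack is the part a finite automaton has no equivalent for.  The
name brackets_balanced and the MAX_DEPTH limit are assumptions made for
this illustration only; a true push-down automaton has no such bound.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 256  /* arbitrary limit chosen for this sketch */

/*
 * Return 1 if every opening bracket in text is closed by the matching
 * bracket in the right order, 0 otherwise.
 * brackets_balanced is a hypothetical name used only for this sketch.
 */
int brackets_balanced(const char *text)
{
    char stack[MAX_DEPTH];
    size_t depth = 0;

    assert(text != NULL);

    for (; *text != '\0'; text++) {
        char c = *text;

        if (c == '(' || c == '[' || c == '{') {
            if (depth == MAX_DEPTH)
                return 0;          /* nesting too deep for this sketch */
            stack[depth++] = c;    /* push the opener */
        } else if (c == ')' || c == ']' || c == '}') {
            char open;

            if (depth == 0)
                return 0;          /* closer with nothing open */
            open = stack[--depth]; /* pop the most recent opener */
            if ((c == ')' && open != '(') ||
                (c == ']' && open != '[') ||
                (c == '}' && open != '{'))
                return 0;          /* mismatched pair */
        }
    }
    return depth == 0;             /* everything opened was closed */
}
```

For instance, brackets_balanced("f(a[i], {1, 2})") returns 1, while
brackets_balanced("f(a[i)]") returns 0 because the pairs interleave.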

Marshall
--
	________________________________________________________________
	Marshall P. Cline	ARPA:	cline at sun.soe.clarkson.edu
	ECE Department		UseNet:	uunet!sun.soe.clarkson.edu!cline
	Clarkson University	BitNet:	BH0W at CLUTX
	Potsdam, NY  13676	AT&T:	315-268-6591


