Bug in Lex

John Pierce jwp at sdcsvax.UUCP
Thu Jan 16 09:33:07 AEST 1986


In some instances lex improperly handles expressions involving trailing
context.  The problem is demonstrated by the rule "ab?/[\nb]":

	%{
	%}
	%%
	ab?/[\nb]	{printf("test: yytext=%s\n",yytext);}
	.		{printf("dot: yytext=%s\n",yytext);}
	\n		{printf("newline\n");}
	%%

Given the input "abc", this produces:

	test: yytext=ab
	dot: yytext=c

This is incorrect.  "ab" matches "ab?", but the "c" that follows does not
match "/[\nb]"; thus the rule cannot be matched that way.  The only valid
match is "a" for "ab?" with "b" for "/[\nb]".  Input that matches the trailing
context part of a rule is not supposed to be part of yytext for that rule, so
yytext should be "a" and the "b" should be left in the input to be scanned
again.  Thus, what should be produced is:

	test: yytext=a
	dot: yytext=b
	dot: yytext=c
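
To make the expected behavior concrete, here is a small hand-written C sketch
of how the three rules above ought to behave.  It is not lex's algorithm and
not taken from ncform; the names is_trailing(), match_test_rule() and scan()
are hypothetical, and the output is collected in a buffer rather than printed
so that it can be checked.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: may c serve as the trailing context [\nb]? */
static int is_trailing(char c)
{
    return c == '\n' || c == 'b';
}

/*
 * Length yytext should have if "ab?/[\nb]" matches at s, or 0 if the rule
 * does not match there.  The longer candidate ("ab") is tried first, then
 * the shorter one ("a"); the trailing context is never counted in yytext.
 */
static size_t match_test_rule(const char *s)
{
    if (s[0] != 'a')
        return 0;
    if (s[1] == 'b' && is_trailing(s[2]))
        return 2;               /* "ab" followed by '\n' or 'b' */
    if (is_trailing(s[1]))
        return 1;               /* "a" followed by '\n' or 'b' */
    return 0;
}

/* Scan in with the three rules, appending each action's output to out. */
static void scan(const char *in, char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return;
    out[0] = '\0';
    while (*in != '\0') {
        size_t n = match_test_rule(in);
        int w;

        if (n > 0) {
            /* The trailing-context rule wins: its total match is longest. */
            w = snprintf(out + used, outsz - used,
                         "test: yytext=%.*s\n", (int)n, in);
        } else if (*in == '\n') {
            w = snprintf(out + used, outsz - used, "newline\n");
            n = 1;
        } else {
            w = snprintf(out + used, outsz - used, "dot: yytext=%c\n", *in);
            n = 1;
        }
        if (w < 0 || (size_t)w >= outsz - used)
            break;              /* output buffer full: stop */
        used += (size_t)w;
        in += n;                /* trailing context stays in the input */
    }
}
```

Calling scan("abc", buf, sizeof buf) with a char buf[256] leaves the three
lines shown above in buf.  The point of the sketch is only that the scanner
must fall back from "ab" to "a" when the character after "ab" is not
acceptable trailing context.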

This problem is known to exist for 4.{1,2,3beta}BSD VAXen, Sun 2.0, Pyramids,
Celerities (4.2), and V.2 3B20s.  No Version 7 or earlier systems were tested.

I do not have a fix for this.  The problem *looks* as though the code in
/usr/lib/lex/ncform is at fault somewhere around the loops

	while (lsp-- > yylstate) {
		...
		...
		while (yyback((*lsp)->yystops, -*yyfnd) != 1 && lsp > yylstate)
		...

but I am not certain that is the case.  Using dbx, I was able to obtain the
correct result by forcing yyback() to cause an extra iteration of the inner
loop.  I believe I have found other [not quite analogous] cases where the '?'
operator coupled with trailing context causes incorrect results (I have not
yet thoroughly tested them).  This leads me to suspect the construction of the
state tables in such cases, rather than the ncform code.

			John Pierce, Chemistry, UC San Diego
			jwp at sdcsvax.arpa
			ucbvax!sdcsvax!jwp


