possible operator precedence bug?

Thu Oct 13 02:34:00 AEST 1988

karl at haddock.ima.isc.com writes:

>>>	M[Z]=Z[<hairy expression> ? <expression>, <expression>, "_." : " |"];
>>
>>Operator precedence only comes into play when there is ambiguity.  Here
>>there is no ambiguity - the above can only be legal C when parsed one
>>way, so there is no need to turn to operator precedence.

>This turns out not to be the case.  For example, "a + b=0" and "a+++++b" are
>both illegal, even though they could have been legal if parsed/scanned as
>"a + (b=0)" and "a++ + ++b".

Come on, Karl, you can do better than that.  You have to distinguish
constructs that are incorrect *syntactically* (i.e., there is
no sequence of productions in the C grammar that can generate the
string) from those that are incorrect *semantically* (type
incompatibilities, expressions without lvalues on the left of an equal
sign, and so on).  You also have to look at the tokenized version of
the strings -- what the parser sees after the lexer is done with them.

For the first case you mention, 
	a + b = 0 ,
there is an ambiguous parse: one can interpret the statement as
	(a + b) = 0 ,
or as
	a + (b = 0) .
The precedence rules *do* apply, and resolve the ambiguity to the
interpretation,
	(a + b) = 0 , 
since + takes precedence over =.  Then we find that the statement,
while correct syntactically, is incorrect semantically, as (a + b)
does not have an lvalue.
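
To make the first case concrete, here is a minimal sketch in C.  The
function name sum_with_nested_assign is made up for illustration.  It
shows that the other grouping is perfectly legal once the parentheses
are written out, while the unparenthesized form is rejected.

```c
/*
 * Hypothetical helper, for illustration only.  The parentheses force
 * the grouping  a + (b = 0) : store 0 in *b, then add that value to a.
 */
int sum_with_nested_assign(int a, int *b)
{
    return a + (*b = 0);
}

/*
 * Dropping the parentheses groups the other way, as (a + *b) = 0,
 * and a C compiler rejects it because (a + *b) is not an lvalue:
 *
 *     a + *b = 0;
 */
```

Calling sum_with_nested_assign(5, &b) with b == 7 should return 5 and
leave b set to 0.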

In the second case, `a+++++b,' we note that lexical analysis is
defined to match the longest token scanning from left to right; the
tokenized version of the string is therefore
	 a ++ ++ + b .
There is only one parse for this string of tokens, so precedence
doesn't come into play:
	((a ++) ++) + b .
Once again, the resulting interpretation is syntactically correct, but
has a semantic error; the inner ++ operator postincrements a, but
(a++) has no lvalue, so the outer ++ operator gets a semantic error.
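
The longest-match rule is easy to sketch.  What follows is a toy
scanner, not a real C lexer: the name tokenize and its interface are
assumptions for illustration, and it only knows identifiers, "+" and
"++".  At each position it takes the longest token it can.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * tokenize (hypothetical helper): copy src into out with one space
 * between tokens, always taking the longest token available.
 * out must have room for at least 2 * strlen(src) + 1 bytes.
 */
void tokenize(const char *src, char *out)
{
    size_t n = 0;

    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {    /* blanks only separate */
            src++;
            continue;
        }
        if (n > 0)
            out[n++] = ' ';                    /* separator before each later token */
        if (strncmp(src, "++", 2) == 0) {      /* prefer "++" over "+" */
            out[n++] = *src++;
            out[n++] = *src++;
        } else if (isalnum((unsigned char)*src) || *src == '_') {
            while (isalnum((unsigned char)*src) || *src == '_')
                out[n++] = *src++;             /* whole identifier */
        } else {
            out[n++] = *src++;                 /* any other single character */
        }
    }
    out[n] = '\0';
}
```

Given "a+++++b" this produces "a ++ ++ + b", the token string shown
above.  Given "a++ + ++b" it produces "a ++ + ++ b", which is why the
blanks change the meaning.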

The case of
	a ? b , c : d
has no such problem.  There is no lexical confusion, particularly when
the spaces that I show are supplied.  There is no ambiguity in the
parse; the only syntactically correct interpretation is 
	a ? (b , c) : d .
Then, as long as c and d are type-compatible, and a can be coerced to
an integral type, the expression has a meaningful interpretation
semantically.  (The ice is thin if c and d are type-incompatible --
does anyone know if the latest dpANS is clearer on this point?)
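
A small sketch of the third case, with a side effect standing in for b
so that the grouping is observable.  The function name pick and its
parameters are made up for illustration.

```c
/*
 * Hypothetical helper, for illustration only.  The unparenthesized
 * comma between '?' and ':' can only be parsed as
 *     a ? ((*counter)++, c) : d
 * so *counter is bumped only when a is nonzero.
 */
int pick(int a, int *counter, int c, int d)
{
    return a ? (*counter)++, c : d;
}
```

With a nonzero, pick returns c and increments *counter; with a zero it
returns d and leaves *counter alone.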

Kevin Kenny			UUCP: {uunet,pur-ee,convex}!uiucdcs!kenny
Department of Computer Science	ARPA Internet or CSNet: kenny at CS.UIUC.EDU
University of Illinois
1304 W. Springfield Ave.
Urbana, Illinois, 61801		Voice: (217) 333-6680