C not LALR(1) & compiler bugs

Michael Stolarchuk mts at cosivax.UUCP
Thu Jan 30 00:30:20 AEST 1986


> C's grammar is CONTEXT SENSITIVE !?    Can it be ?!
>       To see this, consider this program line
> 
>                        A ( *B );
> 
>     If A has been defined as a typedef name, then the line is a
>     declaration of a variable B to be of type "pointer to A."
>     (The parentheses surrounding "*B" are ignored.)  If A is not
>     a type name, then this line is a call of the function A with
>     the single parameter *B.  This ambiguity cannot be resolved
>     grammatically.
> Doesn't that make you wonder if something is SERIOUSLY wrong with C?
   
First, I think it's great you found this.  It never occurred to me.
If someone had asked me (before the message) if C were context sensitive,
I would have said no.
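
For anyone who wants to see both readings compile, here is a minimal sketch
in C.  The function names, the variable "value", and the helper "last_arg"
are my own, chosen only for illustration; the one line that matters is
"A ( *B );", which appears unchanged in both functions.

```c
#include <assert.h>

static int last_arg;

/* File-scope A is a function: it records the argument it was called with. */
static void A(int x)
{
    last_arg = x;
}

/* Reading 1: inside this block A names a type, so the line declares B. */
int reading_as_declaration(void)
{
    typedef int A;      /* hides the function A within this block */
    A ( *B );           /* declaration: same as "int *B;" */
    A value = 7;

    B = &value;
    assert(B != 0);
    return *B;
}

/* Reading 2: here A is the function, so the same line is a call. */
int reading_as_call(void)
{
    int value = 9;
    int *B = &value;

    A ( *B );           /* call: passes the int that B points to */
    return last_arg;
}
```

The two functions contain the same tokens on the line in question; only
the declaration of A that is in scope decides what the line means.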

Even so, context sensitivity is just an attribute.  The "goodness" of a
language has very little to do with whether it has a context sensitive
grammar.  It's not too tough to make a language context sensitive, and it
seems many of the languages I use have some level of sensitivity, if not in
the grammar then certainly in the semantics.  In many cases the lines
between syntax and semantics are not as clear as you think.  It is possible
to migrate some of the typechecking in some languages from semantics into
syntax (sometimes for convenience), and so embed context sensitivity in the
grammar.  Alternatively, one could write the C grammar so that it is not
context sensitive by having a production called "typedecl_paramcall: ...",
then make the distinction between the two at the "semantic level".  Such a
grammar may be inconvenient (difficult to construct, maintain, and
understand), and that inconvenience is usually the impetus for leaving the
context sensitivity in.
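
As a rough sketch of what "semantic level" could mean here: the parser
accepts the shape "name ( * name ) ;" without caring what the first name
is, and a later step looks that name up in a table of typedef names.
Everything below (the table, its fixed size, the function names) is
hypothetical and not taken from any real compiler.

```c
#include <assert.h>
#include <string.h>

enum meaning { DECLARATION, FUNCTION_CALL };

#define MAX_TYPEDEFS 16

/* Hypothetical symbol table: just the typedef names seen so far. */
static const char *typedef_names[MAX_TYPEDEFS];
static int typedef_count;

/* Record that "name" was introduced by a typedef. */
void declare_typedef(const char *name)
{
    assert(typedef_count < MAX_TYPEDEFS);
    typedef_names[typedef_count++] = name;
}

/* Return 1 if "name" is a known typedef name, 0 otherwise. */
int is_typedef_name(const char *name)
{
    int i;

    for (i = 0; i < typedef_count; i++)
        if (strcmp(typedef_names[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Semantic action for the single ambiguous production: the grammar has
 * already matched "head ( * name ) ;", and only now do we decide which
 * of the two meanings applies.
 */
enum meaning classify_typedecl_paramcall(const char *head)
{
    return is_typedef_name(head) ? DECLARATION : FUNCTION_CALL;
}
```

With this arrangement the grammar itself never consults the table; the
price is that one production stands for two unrelated constructs until
the semantic step sorts them out.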

> Personally, I think that the real fault for my "buggy" compiler
> lies not with the compiler writer, but in the shoddy language design
> that haunts the deep-dark corners of C.  ...

Bugs are bugs.  One definition of production quality code in use today
is "one bug per 1000 lines of code".  The compiler I have here has about
20 thousand lines, so I would expect about 20 bugs to always exist, no
matter what release.  When a new release appears, the bugs may simply have
shifted.

Secondly, I don't think it is wise to associate the quality of a compiler,
or the quality of any tool, with the function it's supposed to perform,
just as no one complains about how correct an instruction set is.  The quality
is associated with the product, not the function.

> ... I mean, is there any excuse
> for the grammar being context sensitive? ...


> ... Or, for that matter, for
> identifiers having only 8 significant characters? ...

These are both implementation details, left to the project team creating
the product.  The constraints on the project are the motivating force.
I would expect (in the context sensitive case) that it was impractical to
choose a non context sensitive grammar, for some of the reasons already
pointed out above, or even better ones: the availability of a debugged
grammar, the availability of a base development compiler, etc.  As for the
symbol name length, you probably don't need reminding that Bell 5.2 and
above, along with Berk 4.0 and above (if I remember right), were the
releases to use variable length symbol names.  If the release of the system
you are using doesn't have variable length symbols, then that is one of the
constraints of the projects you are working on.  As an example, it may not
even be wise to move from a truncating compiler to a variable length one if
lots of code already exists and someone wasn't extremely careful about
name lengths.
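
To make the name length hazard concrete, here is a small sketch of the
comparison a truncating compiler effectively performs, assuming 8
significant characters as in the quoted complaint.  The function name and
the sample identifiers are made up for illustration.

```c
#include <assert.h>
#include <string.h>

#define SIGNIFICANT 8   /* assumed number of significant characters */

/*
 * Return 1 if a truncating compiler would treat the two identifiers as
 * the same symbol, i.e. if their first SIGNIFICANT characters match.
 */
int same_symbol_when_truncated(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, SIGNIFICANT) == 0;
}
```

Under this rule "buffer_size" and "buffer_sink" are one symbol, since both
begin with "buffer_s".  Code that quietly relied on such a pair naming the
same object would change meaning once the compiler starts telling the two
names apart.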

Your first observation was a good one.  Many of the other statements
seem to imply some dissatisfaction with the product you are using.
Perhaps it's a good time to think about the work you are doing, and
whether or not you are willing to put up with the situation you are in.


