formal language descriptions

Steven Ryan smryan at garth.UUCP
Tue Aug 9 05:51:32 AEST 1988


To all those who have written I know about SIGPLAN. Both issues. Writing
a context free syntax for C is simple.

The hard part is always been the context sensitive syntax and the semantics.
Although a type 0 or type 1 grammar can specify these, an excessive number of
productions are devoted to shuttling the cursor so that the results are
unreadable. Because of that, most definitions have resorted to English.

When the English is concise and controlled, the effect is pretty acceptable,
as in the 16 paged Algol 60 report. But when pointers and cast and abundance
of types are presented, precise English becomes muddled. To keep the definition
within a few thousand pages, a lot of hand waving goes on. For example, Ada
does not specify recursive rules on identifier identification. Rather,
they state that if an ambiguity exists, the name cannot be identified. Sorry,
folks, but the choose operator is hardly total recursive, and, as we might
remember from the good old days of calculus, it is frequently not even
recursive.

A formal context sensitive grammar is a (at least) partially recursive
predicate. So any implementation can be considerred a formal definition,
but the language used in the definition is at least as complex as the
language being defined. While any definition ultimately rests on faith,
I would prefer a definition technique as simple as possible.

Semantics are another hard part. Again most formal semantics definitions
are at least as complex as the language as defined.

>>As far being `unreadable' or less `accessible,' that's too easy a target. As
>>long as we refuse to correct the problems that exists, they will continue.
>
>Quite true.  As long as we refuse to admit that unreadability is a major
>problem that greatly reduces the usefulness of formal standards, they will
>continue to see little use.

The easy retort is that no formal language is readable or accessible. That
includes production systems, PL/I, C, Algol, or a hexidecimal dump.

>                             Even if the authors of a formal definition don't
>fall into the trap of "lapidary" writing (paring out every non-essential
>word that might help the reader make sense out of the formalisms -- a style
>which is endemic in mathematics), the result is seldom something that an
>uninitiated reader can tackle without help.

What uninitiated reader can tackle C?

The trap is to include all the irrelevant detail. Then implementors begin
thinking this is requirement and build what what may have been an irrelevant
comment into the compiler and thereby formalise somebody's whim. Do all
member of union HAVE to have the same offset? Is that part of definition
or is it one of the non-essential words that might help the reader make sense
out of the formalisms?

I don't know what your mathematic experience is, but I do know the concise
and formalism in math is not the math. It's just a way to keep people
honest. It forces the writer to state his conditions and reasoning explicitly
and openly so that other people can critically examine it for flaws.

Come to think of it, maybe there is a reason why nobody likes formal language
definitions.

>                                             Compare this to the appendix
>of K&R, or even the X3J11 drafts, which can be heavy going at times but
>*can* be understood without formal background.

And misunderstood. Read this group. Read your own postings. The people
who don't hedge their answers are the ones who get ripped to pieces.
 
>The attitude that "anyone but an illiterate peasant ought to be able to
>read formal definitions" is part of the problem, not part of the solution.

I'd hate to have an illiterate peasant at a keyboard, although perhaps that why
Unix smells. An unwillingness to face that this is an intrinsically difficult
occupation is part of the problem, not part of the solution.

Additionally, not everybody need read the formal definition. There's a lot
finite math floating around in programming, but how many of us have read
Principia Mathematica? I certainly have no objections to people writing
tutorials and overviews and reference manuals and everything else. I do
object these being purported as THE definition of a language. A language
definition should be complete and concise, sitting on a dusty shelf somewhere.
It is the contract between users and compilers on what is or is not acceptable
and what it must mean. 

>MSDOS is not dead, it just     |   Unix still twitches as well.
>smells that way.               |

By the way, there's a Sampler of Formal Definition in an old, old Computing
Survey. The closing remarks remain relevant.



More information about the Comp.lang.c mailing list