trigraphs in X3J11

Martin Minow THUNDR::MINOW ML3-5/U26 223-9922 minow at thundr.dec.com
Sat May 28 01:55:00 AEST 1988


In a message to comp.lang.c, fuastus at ic.Berkeley.edu asks what Europeans
use to write C programs.

As you no doubt know, there are about a dozen code positions in "ASCII" that
are reserved for national use.  The C language uses most of these for
syntactic purposes.  X3J11 invented the "trigraph" notation to allow
C programming on European terminals without the (current) kludge of
interpreting, say, "upper-case A-umlaut" as "left square bracket".
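
For readers who have not met the notation: the draft defines nine
three-character sequences, each beginning with two question marks, that
stand for the nine C punctuators missing from the ISO 646 invariant set
(for example, ??( for the left square bracket and ??= for the number sign).
The following is only a rough sketch of the replacement step, not code from
the draft or from any compiler; the function name expand_trigraphs and its
table layout are my own assumptions for illustration.

```c
#include <string.h>

/*
 * Illustrative sketch only (hypothetical helper, not from any compiler).
 * Copies `in` to `out`, replacing each trigraph with the single character
 * it stands for.  `out` must have room for at least strlen(in) + 1 bytes.
 */
static void expand_trigraphs(const char *in, char *out)
{
    /* Third character of each trigraph, and the character it maps to. */
    static const char third[] = "=(/)'<!>-";
    static const char repl[]  = "#[\\]^{|}~";

    while (*in != '\0') {
        const char *hit;

        if (in[0] == '?' && in[1] == '?' && in[2] != '\0'
            && (hit = strchr(third, in[2])) != NULL) {
            *out++ = repl[hit - third];   /* recognized: emit replacement */
            in += 3;
        } else {
            *out++ = *in++;               /* anything else passes through */
        }
    }
    *out = '\0';
}
```

Given the text "a??(0??) = 1;" this produces "a[0] = 1;".  Two question
marks followed by any other character are left untouched.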

The problem only occurs for terminals that are limited to a single
seven-bit ISO-646 based character set.  EBCDIC terminals, and terminals
that conform to the newer ISO 8859 (Latin-1) standard, or that are compatible
with DEC's VT200 series, can use a coherent 8-bit character set that permits
C programming in its current form without loss of national characters.
Central to this is operating system support for 8-bit characters.  Some
operating systems (and utilities) assume that the eighth bit is free
for "flagging", which causes problems.
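
To make the "flagging" problem concrete, here is a small sketch of what a
seven-bit-minded utility effectively does to each character.  The function
name strip_flag_bit is hypothetical, and the example assumes the text is
encoded in ISO 8859-1.

```c
/*
 * Illustrative sketch only (hypothetical helper).
 * Clears the eighth bit, as a utility would if it treated that bit
 * as a private flag rather than as part of the character code.
 */
static unsigned char strip_flag_bit(unsigned char c)
{
    return (unsigned char)(c & 0x7F);
}
```

In ISO 8859-1, upper-case A-umlaut is 0xC4; strip_flag_bit(0xC4) returns
0x44, which is the ASCII letter D.  Seven-bit characters pass through
unchanged, which is why such utilities appear to work on plain ASCII text.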

Although ISO 8859 is the best base for future programming, it should be
noted that non-ISO workstations such as the IBM PC, the Atari ST and the
Macintosh support a mixture of national letters and the ISO invariant set.

The only problem, then, is caused by "old-style" terminals combined with
seven-bit limited operating environments.  At the time trigraphs were
proposed, these were fairly common.  They are much less common now,
and are quickly being replaced by ISO-compliant terminals and workstations.

Imagine if C were being standardized in, say, 1974, when there were very
few terminals that supported lower-case:  one could well imagine a kludge
to allow mixed case programming on monocase terminals.  One such kludge
was, in fact, provided in the Unix operating system.  It finds little,
if any, use today -- and you would have to search carefully to find
an upper-case only terminal.

Because of the speed of conversion to ISO 8859 (and similar 8-bit
environments), coupled with ambiguities in the definition of trigraphs,
I recommended in my comments to the standard that they be dropped.
The committee rejected my arguments, but I would hope they reconsider
before release of the standard.

Martin Minow
minow%thundr.dec at decwrl.dec.com

PS: there was some question of "American Chauvinism".  For the record,
I have a European university degree, and worked as a programmer in
Europe for ten years.

The above does not represent the position of Digital Equipment Corporation


