Report on ISO POSIX meeting of June 1990

VLD/VMB gwyn at BRL.MIL
Sun Jul 8 13:20:29 AEST 1990


From:  Doug Gwyn (VLD/VMB) <gwyn at BRL.MIL>

[ This was originally written as a letter to Dominic, but Doug
agreed it would make a good comp.std.unix posting.  -mod ]

While I don't have any real problem with your use of quotations from
my net posting, I do have a couple of comments on other things you said:

	The ballot also produced a number of suggestions in the area
	of internationalization, such as how to handle (and indeed,
	how to refer to) wide, or multi-byte, characters.

For 1003.1, this is pretty straightforward.  The C requirements on such
character encodings are such that multibyte character (mbc) strings can still be handled as
uninterpreted NUL-terminated arrays of char.  In the default "C" locale,
a certain minimum set of characters must be represented, which permits
the construction of portable filename strings.  Even in the "C" locale,
other characters are permitted, so for example a command-line argument
containing "funny characters" can be used directly as an argument to
open() etc.  I know that there are various vendor approaches that make
locales more visible to the operating system, but after all this is UNIX
we're talking about, and one of the main lessons of UNIX is that the
operating system can be designed to be happily oblivious to the uses to
which people put the information that it manages according to simple rules.
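
To illustrate the point about uninterpreted strings, here is a minimal
sketch in C.  The function name count_file_bytes is hypothetical and is
not taken from 1003.1.  It hands a name to open() exactly as received,
as a NUL-terminated array of char, with no locale lookup and no decoding
of any multibyte sequence the name may contain.  Whether a given
filesystem accepts a particular byte sequence as a name is an
implementation matter that this sketch assumes away.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Count the bytes in the file called `name`.
 *
 * The name is treated as an opaque NUL-terminated array of char and is
 * passed to open() unchanged; nothing here inspects or decodes the
 * characters in it.  Returns the byte count, or -1 on any error.
 */
long count_file_bytes(const char *name)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    int fd = open(name, O_RDONLY);

    if (fd < 0)
        return -1;              /* open failed: no such file, no access, ... */

    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;             /* accumulate the bytes read so far */

    close(fd);
    return n < 0 ? -1 : total;  /* a read error is reported as -1 */
}
```

A caller would typically pass a command-line argument straight through,
as in count_file_bytes(argv[1]), without calling setlocale() or any
multibyte conversion function first.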

I first got involved in "internationalization" issues when I attended a
BOF meeting at which the "expert" who was giving the presentation was
explaining how complex the character set issues were, and when I said
that I didn't see any inherent complexity, I was berated for my naivety.
Years later, after studying the issues and conversing with the folks
actively working in the field, I still maintain that simple solutions
are possible.  Unfortunately, vendors such as H-P started out with
complicated schemes and have continued to think in those terms.  This
rubbed off on X3J11 when the multibyte character approach was adopted,
which has the obvious problem that anyone programming for an
international environment MUST change from traditional use of C strings
to mbc arrays in his applications.  The Japanese recognize this as an
essential feature of their "long char" proposal, which X3J11 did NOT
intend the mbc approach to be -- however, the fundamental need for
library support using any such approach has now led to the Japanese
requesting that such changes be made for the ISO C standard.  I think
the arguments I used for my alternative proposal to address these very
concerns are being borne out, in spades.

	Returning to the matter of the programming language used for
	bindings, it is true that AT&T-derived UNIX implementations
	prefer a diet of C data types.  However, it certainly was an
	aim of 1003.1 to allow hosted POSIX implementations, which
	might well be riding on underlying operating systems with
	entirely different tastes.

To the contrary, we discussed this very matter in 1003.1 and decided
that, while we did not wish to preclude layered implementations, we
would not make any compromises to accommodate them.  Very definitely
our goal was to develop standards for genuine UNIX variants, not to
provide a "Software Tools" style of Portable Operating System environment.

We used the same argument when we decided that NFS was simply going to
have to be ruled non-compliant.  UNIX applications rely on certain
semantics of the file system that NFS did not properly support, and we
decided that it would be a disservice to UNIX applications to remove
the requirement that these useful semantics be preserved.

Volume-Number: Volume 20, Number 115


