machines with oddball char * formats

Guido van Rossum guido at mcvax.uucp
Thu Nov 20 20:12:58 AEST 1986


In article <1534 at batcomputer.tn.cornell.edu> garry%cadif-oak at cu-arpa.cs.cornell.edu writes:
>Forgive my ignorance, but why don't the compiler writers on these "odd"
>machines just designate a "char" and a "byte" to be the identical width
>to a "short" ?   What will go wrong ?  
>
>(Would very many real-life application programs actually be hurt by the 
>added memory usage? - I'm excluding text editors!)

You'll have to exclude a lot more programs (think of {n,t}roff, and all
sorts of "compiler"-type programs like awk, dc and bc).

Another reason is compatibility with the rest of the world on such
systems.  Text files are usually written in a packed format (using 2
bytes per word if you have 16- or 18-bit words), so stdio or the
read/write system calls would have to do a lot of (un)packing.  String
parameters to the native operating system also have to be converted, of
course -- not an insurmountable problem for the standard library, but a
real pain in some part of the body for system hackers.  And such system
hackers have always been a big part of the C community!  (Maybe that's
changing now; it certainly was true a few years ago when these compilers
were designed.)
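
As a rough sketch of the (un)packing this would force on stdio, the C
below converts between text packed two 8-bit characters per 16-bit word
and the "one character per short" layout the question proposes.  The
names unpack_words and pack_words are made up for illustration, and
putting the first character in the high-order byte is an assumption;
a real system may well use the opposite order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpack text stored two 8-bit characters per 16-bit word into one
 * character per 16-bit element.  Assumption: the first character of
 * each pair sits in the high-order byte of the word.
 * Returns the number of characters written to out. */
size_t unpack_words(const uint16_t *packed, size_t nwords, uint16_t *out)
{
    assert(packed != NULL && out != NULL);
    for (size_t i = 0; i < nwords; i++) {
        out[2 * i]     = (uint16_t)(packed[i] >> 8);
        out[2 * i + 1] = (uint16_t)(packed[i] & 0xFF);
    }
    return 2 * nwords;
}

/* Inverse operation: pack one-character-per-element text back into
 * two characters per word, padding an odd trailing character with 0.
 * Returns the number of words written to packed. */
size_t pack_words(const uint16_t *chars, size_t nchars, uint16_t *packed)
{
    assert(chars != NULL && packed != NULL);
    size_t nwords = (nchars + 1) / 2;
    for (size_t i = 0; i < nwords; i++) {
        unsigned hi = chars[2 * i] & 0xFFu;
        unsigned lo = (2 * i + 1 < nchars) ? (chars[2 * i + 1] & 0xFFu) : 0u;
        packed[i] = (uint16_t)((hi << 8) | lo);
    }
    return nwords;
}
```

Every read or write of a text file would have to go through a loop
like this, which is the cost being pointed at above.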

On the CDC Cyber I have used a few systems like this.  The BCPL compiler
did not waste an entire word (60 bits!) on a character; it packed 7 ASCII
characters into each word, instead of the 10 Display Code characters
that are the standard convention on this machine.  Nice, until you start
reading binary files and occasionally have to extract strings from
them... The Algol-68 compiler, on the other hand, *did* waste a 60-bit
word on each character.  I had to do all my system hacking in assembler or
(gasp!) Fortran.
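
To make the arithmetic concrete, here is a small C sketch that
simulates a 60-bit word in a uint64_t.  Display Code is a 6-bit code,
so 60 / 6 gives 10 characters per word.  The width of each character
field in the BCPL compiler is not stated above; 8 bits is assumed here
only because 60 / 8 gives the 7 characters per word mentioned.  The
function names and the left-justified field layout are likewise
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 60u  /* CDC Cyber word size */

/* Number of fixed-width character fields that fit in one word. */
unsigned chars_per_word(unsigned field_bits)
{
    assert(field_bits > 0 && field_bits <= WORD_BITS);
    return WORD_BITS / field_bits;
}

/* Extract character number idx (0 = leftmost) from a word.
 * Assumption: fields are left-justified, unused bits at the right. */
unsigned extract_char(uint64_t word, unsigned field_bits, unsigned idx)
{
    assert(idx < chars_per_word(field_bits));
    unsigned shift = WORD_BITS - field_bits * (idx + 1);
    uint64_t mask = (UINT64_C(1) << field_bits) - 1;
    return (unsigned)((word >> shift) & mask);
}

/* Return a copy of word with character number idx replaced by ch. */
uint64_t insert_char(uint64_t word, unsigned field_bits, unsigned idx,
                     unsigned ch)
{
    assert(idx < chars_per_word(field_bits));
    unsigned shift = WORD_BITS - field_bits * (idx + 1);
    uint64_t mask = ((UINT64_C(1) << field_bits) - 1) << shift;
    return (word & ~mask) | (((uint64_t)ch << shift) & mask);
}
```

A string written with one field width and read back with the other
comes out as garbage, which is the trouble with binary files: calling
extract_char(w, 6, 0) on a word filled via insert_char(w, 8, 0, 'A')
returns only the top six bits of 'A'.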

- - -

But then again, you mention "real-life applications".  I suppose this is
a fairly restricted class, covering only programs of over 10,000 lines
of source code, generally dealing with image processing or
statistics...

	Guido van Rossum, CWI, Amsterdam <guido at mcvax.uucp>


