Character types

utzoo!decvax!harpo!eagle!mhtsa!alice!npoiv!houxm!u1100a!sdo
Fri Apr 22 09:38:37 AEST 1983


Someone stated that characters should be signed so that there would
be an integer type shorter than the ones we have now.

That's not the solution.  If there should be a shorter integer type
(and I agree there should), don't call it "char", which it isn't.
Call it "extra-short" or "tiny" or something.  As a matter of fact,
there already is a shorter type of integer: the bit field.  The
language considers bit fields unsigned.  Too bad there can't be
arrays of them.
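
To make the bit-field point concrete, here is a small sketch.  The
names (struct tiny, counters, tiny_store) and the width of 4 bits are
made up for illustration and are not from any real program.

```c
/* Hypothetical example: a 4-bit unsigned field used as a "tiny" integer. */
struct tiny {
    unsigned int v : 4;   /* holds 0..15 */
};

/* "unsigned int v[8] : 4;" is not legal C: no arrays of bit fields. */

/* Workaround: an array of structs, one field each (this wastes space). */
struct tiny counters[8];

/* Store x in the field and return what was actually kept.
 * Because the field is unsigned, the value is reduced modulo 16. */
unsigned int tiny_store(struct tiny *t, unsigned int x)
{
    t->v = x;
    return t->v;
}
```

Storing 21 this way yields 5, since only the low four bits survive.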

Just so everyone knows how hard it can be to deal with characters,
I'll explain how it is done on UNIVAC 1100s.

Bytes can't be addressed directly, since the 1100 is a word machine
with 36-bit words.  Characters are dealt with as quarter-words (9
bits).  There are instructions for dealing with quarter-words, but
one has to know which quarter to use (q1, q2, q3, or q4).  Once that
is determined from the address of the char, the appropriate load
instruction is performed.  No sign extension is done.  Even if it
were, I've seen code that sets the 0200 bit in a char and then tests
it by seeing if the char is negative.  That only works where a char
is signed and exactly 8 bits wide; in a 9-bit char, 0200 is not the
sign bit.
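
The quarter-word load and the flag test can be sketched in portable C
as follows.  This is only a simulation: the 36-bit word is kept in
the low bits of a 64-bit integer, the function names are invented
here, and the numbering of quarters (q1 as the most significant 9
bits) is an assumption rather than a statement about the hardware.

```c
#include <stdint.h>

/* Simulation only: a 36-bit word held in the low bits of 64 bits. */
typedef uint64_t word36;

/* Fetch quarter-word q (1..4) from w with no sign extension.
 * Assumption: q1 is the most significant 9 bits, q4 the least. */
unsigned int load_quarter(word36 w, int q)
{
    int shift = (4 - q) * 9;
    return (unsigned int)((w >> shift) & 0777);
}

/* Portable flag test: mask the bit instead of checking the sign. */
int flag_set(unsigned int c)
{
    return (c & 0200) != 0;
}

/* The non-portable idiom: only meaningful where char is signed
 * and exactly 8 bits wide, so that 0200 is the sign bit. */
int flag_set_by_sign(char c)
{
    return c < 0;
}
```

The masking version gives the same answer whether a char is 8 or 9
bits and whether or not it is signed; the sign test does not.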

The moral is that characters are not little integers, ints are
not always 16 or 32 bits, and UNIVACs were designed for
languages other than C.  (This can be debated - not the part about
being good for C, but the part about being designed.)
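
One way to avoid baking in such assumptions is to ask the compiler
for the sizes.  A minimal sketch, assuming a compiler that supplies
<limits.h>; the function names are invented for illustration.

```c
#include <limits.h>

/* Width of a char in bits: 8 on most machines, 9 on a machine
 * that stores characters as quarter-words of a 36-bit word. */
int bits_in_char(void)
{
    return CHAR_BIT;
}

/* Storage bits occupied by an int (includes any padding bits). */
int bits_in_int(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Mask for the top bit of a char, whatever its width.
 * This equals 0200 only when a char is 8 bits wide. */
unsigned int char_top_bit(void)
{
    return 1u << (CHAR_BIT - 1);
}
```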

			Scott Orshan
			Bell Labs Piscataway
			201-981-3064
			{houxm,ihnp4}!u1100a!sdo


