Signed char - What Foolishness Is This!

Brent Chapman chapman at cory.Berkeley.EDU
Sun Oct 19 09:31:02 AEST 1986


In article <8719 at duke.duke.UUCP> jwg at duke.UUCP (Jeffrey William Gillette) writes:
>MSC 4.0 defaults 'char' to 'signed char'.  

[ it defaulted to 'unsigned char' in previous versions of MSC -- Brent]

	[ details relating to a gotcha in header files, because Microsoft
	  didn't cast a (possibly) negative char value into an unsigned
	  value when using it to index an array, deleted ]

>What possible justification is there for this default?  Is not
>'char' primarily a logical (as opposed to mathematical) quantity?  What
>I mean is, what is the definition of a negative 'a'?  I can understand 
>the desirability of allowing 'signed char' for gonzo programmers who
>won't use 'short', or who want to risk future compatibility of their
>code on the bet that useful characters will always remain 7-bit entities.

This brings up some interesting questions and ambiguities concerning
K&R's definition of C.  I haven't seen the proposed ANSI standard, so
I can't comment on it.  But K&R will do to illustrate the ambiguities;
perhaps someone else can point out if and how the proposed standard
deals with them.

On page 34, K&R define a 'char' to be "a single byte, capable of holding
one character in the local character set."  On page 40, they say "The
language does not specify whether variables of type char are signed or
unsigned quantities."  This seems to imply that the implementor is free
to choose the default that he feels best suits his implementation.  On
most machines, this is a moot point, since most machines only use the
0 to 127 range for character values, which is available regardless of
whether the char is signed or unsigned.  On the PC, however, it _does_
make a difference, because the upper 128 characters of the PC's character
set _are_ printable, and are numbered from 128 through 255.  Logic would
seem to indicate that 'unsigned char' is the reasonable choice for the
default on a C compiler for the PC.
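
To make the difference concrete, here is a minimal sketch (not taken from
any compiler's documentation) of what happens when one byte above 127 is
widened to an int.  The function names are made up for illustration, and
the values quoted afterwards assume an 8-bit, two's complement char.

```c
#include <assert.h>
#include <string.h>

/* Widen a byte to int as an unsigned char: never negative. */
int widen_unsigned(unsigned char byte)
{
    return byte;
}

/* Widen the same bit pattern as a signed char: may come out negative. */
int widen_signed(unsigned char byte)
{
    signed char s;
    memcpy(&s, &byte, 1);   /* copy the bits, no value conversion */
    return s;
}

/* Widen as plain char: the result depends on the compiler's default. */
int widen_plain(unsigned char byte)
{
    char c;
    memcpy(&c, &byte, 1);
    /* Plain char has to behave like one of the other two types. */
    assert(c == widen_signed(byte) || c == widen_unsigned(byte));
    return c;
}
```

Under those assumptions, widen_unsigned(0x82) gives 130 and
widen_signed(0x82) gives -126; widen_plain(0x82) gives whichever of the
two matches the compiler's default.  Codes 0 through 127 come out the
same all three ways.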

Unfortunately, most other C implementations, especially UNIX C implementations,
seem to default char to 'signed'.  (Note that I've been assured of this
by knowledgeable sources, but don't have any first-hand knowledge, so I could
be wrong.)  This is a reasonable choice because, in the original K&R
C definition, there is no 'signed' keyword.  Therefore, everything should
default to 'signed', because if it defaults to 'unsigned', there's no way to
change it to 'signed'.  Many implementations now include the 'signed' keyword,
however.  I don't know if it is a part of the proposed ANSI standard,
but I think that it probably is.

Now, Microsoft apparently decided to change their default for chars from
'unsigned', which is what it was in versions of the compiler previous to
Ver 4.0, and which makes sense for a PC, to 'signed', which makes sense
because of K&R's lack of a 'signed' keyword, and because most other 
implementations are that way.  The original poster got bitten because
Microsoft used a 'char' (which could be negative) as an array index, instead
of casting it to 'unsigned char', in one of their library header files.
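
The kind of gotcha described above can be sketched as follows.  This is
not Microsoft's actual header code; the table 'high_half' and the
functions 'init_table' and 'lookup' are hypothetical names chosen for
illustration, and the table size assumes 8-bit chars.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 256-entry table: 1 for codes 128..255, 0 otherwise. */
static unsigned char high_half[256];

void init_table(void)
{
    int i;
    for (i = 0; i < 256; i++)
        high_half[i] = (unsigned char)(i >= 128);
}

/*
 * Risky form: where char is signed, c can be negative, and the
 * subscript then reaches memory in front of the table.
 *
 *     int lookup_risky(char c) { return high_half[c]; }
 */

/* Safer form: cast to unsigned char before using c as an index. */
int lookup(char c)
{
    size_t index = (unsigned char)c;    /* never negative */
    assert(index < sizeof high_half);   /* holds for 8-bit chars */
    return high_half[index];
}
```

After init_table(), lookup('\x82') returns 1 whether the compiler treats
plain char as signed or unsigned, because the cast happens before the
subscript is computed.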

Perhaps the most general, portable solution is not to use char variables
for counting or array indexing.  If you need a counter, use a short,
which will default to signed unless you say otherwise.  If you need an
array index, cast to an 'unsigned char' or an 'unsigned short'. 
Unfortunately, there is no guarantee that a short is as small as a char,
so you may be wasting some space.  Worse, there is no guarantee that a
short is as _long_ as a char, although I doubt there is any
implementation where a short is actually shorter than a char.  You
currently can't count on whether
a char will be signed or unsigned.  Does the proposed ANSI standard
address this? 
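
As a rough sketch of the advice above, assuming a compiler that supplies
<limits.h>: the first function checks that a short really can hold every
plain char value on the compiler at hand, and the second counts with a
short while casting each char to 'unsigned char' before comparing it.
Both function names are invented for this example.

```c
#include <assert.h>
#include <limits.h>

/* 1 if every plain char value fits in a short on this compiler. */
int short_covers_char(void)
{
    return SHRT_MIN <= CHAR_MIN && CHAR_MAX <= SHRT_MAX;
}

/* Count the bytes above 127 in buf, using a short as the counter. */
short count_high_bytes(const char *buf, short len)
{
    short i, n = 0;

    assert(len >= 0);
    for (i = 0; i < len; i++)
        if ((unsigned char)buf[i] > 127)    /* compare as 0..255 */
            n++;
    return n;
}
```

For the five-byte buffer "x\x82y\xffz", count_high_bytes returns 2 under
either char default.  Without the cast, a compiler with signed chars
would see both of those bytes as negative and count none of them.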

Fortunately, with MSC Ver 4.0, you can have your cake and eat it too.  There
is a command-line option to the compiler that will change the default from
'signed' to 'unsigned'.  I think it's '-J', but I'm not certain, since I'm
at home and my manuals are at work.
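
Whichever way the option is spelled, a small probe can report which
default a given compile actually used.  This is only a sketch: the
function name is made up, and the cross-check against CHAR_MIN assumes
the compiler ships <limits.h>.  The probe expression itself needs no
header.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain char is signed in this compile, 0 if unsigned. */
int plain_char_is_signed(void)
{
    /* (char)-1 stays -1 if char is signed, becomes a large positive
       value if char is unsigned. */
    int probe = ((char)-1 < 0);

    assert(probe == (CHAR_MIN < 0));    /* <limits.h> should agree */
    return probe;
}
```

Building the same source once with and once without the option, and
calling this function in each build, should show the default flipping.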


Brent
--
Brent Chapman

chapman at cory.berkeley.edu	or	ucbvax!cory!chapman


