What's a C expert?

Henry Spencer henry at utzoo.uucp
Sun Aug 13 11:51:37 AEST 1989


In article <4724 at alvin.mcnc.org> spl at mcnc.org.UUCP (Steve Lamont) writes:
>>Having the sign of chars be undefined allows the implementation to be as
>>efficient as possible with respect to converting between chars and ints.
>
>Huh?  Are you telling us that the standard *allows* such a horrible
>thing?  Aaaaaaarrrrrgh!  :-+ (<-- smiley sucking on a persimmon)  I
>thought the standard was supposed to clarify things, not confuse the
>issue...

I see it's time to repost my commentary on this from some time ago:

-----------
All potential participants in this debate please attend to the following.

- There exist machines (e.g. pdp11) on which unsigned chars are a lot less
	efficient than signed chars.

- There exist machines (e.g. ibm370) on which signed chars are a lot less
	efficient than unsigned chars.

- Many applications do not care whether the chars are signed or unsigned,
	so long as they can be twiddled efficiently.

- For this reason, char is intended to be the more efficient of the two.

- Many old programs assume that char is signed; this does not make it so.
	Those programs are wrong, and have been all along.  Alas, this is
	not a comfort if you have to run them.

- The Father, the Son, and the Holy Ghost (K&R1, H&S, and X3J11 resp.) all
	agree that characters in the "source character set" (roughly, those
	one uses to write C) must look positive.  Actually, the Father and
	the Son gave considerably broader guarantees, but the Holy Ghost
	had to water them down a bit.

- The "unsigned char" type exists (in most newer compilers) because there
	are a number of situations where sign extension is very awkward.
	For example, getchar() wants to do a non-sign-extended conversion
	from char to int.

- X3J11, in its semi-infinite wisdom, has decided that it would be nice to
	have a signed counterpart to "unsigned char", to wit "signed char".
	Therefore it is reasonable to expect that most new compilers, and
	old ones brought into conformance with the yet-to-be-issued standard,
	will give you the full choice:  signed char if you need signs,
	unsigned char if you need everything positive, and char if you don't
	care but want it to run fast.

- Given that many compilers have not yet been upgraded to match even the
	current X3J11 drafts, much less the final end product (which doesn't
	exist yet), any application which cares about signedness should use
	typedefs or macros for its char types, so that the definitions can
	be revised later.

- The only things you can safely put into a char variable, and depend on
	having them come out unchanged, are characters from the native
	character set and small *positive* integers.

- Dennis Ritchie is on record, as I recall, as saying that if he had to do
	it all over again, he would consider changing his mind about making
	chars signed on the pdp11 (which is how this mess got started).
	The pdp11 hardware strongly encouraged this, but it *has* caused a
	lot of trouble since.  It is, however, much too late to make such
	a change to C.
-----------
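
For anyone who wants to see the typedef advice and the getchar() point
in concrete form, here is a minimal sketch.  It assumes a compiler that
already accepts "signed char", and 8-bit two's-complement chars; the
typedef names (app_schar, app_uchar, app_char) and the function names
are made up for illustration and come from no standard or library.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical typedef names: one place to revise if a compiler lacks
 * "signed char" or makes an unexpected choice for plain char. */
typedef signed char   app_schar;  /* signs matter */
typedef unsigned char app_uchar;  /* everything must look positive */
typedef char          app_char;   /* sign irrelevant, speed wanted */

/* Nonzero if plain char behaves as signed on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widen without sign extension, which is what getchar() does for the
 * characters it returns: the result is always in 0..UCHAR_MAX. */
int widen_unsigned(app_char c)
{
    return (int)(app_uchar)c;
}

/* Widen with sign extension: values with the top bit set come out
 * negative (assumption: 8-bit two's-complement chars). */
int widen_signed(app_char c)
{
    return (int)(app_schar)c;
}

/* Small self-check: source-character-set members look positive
 * either way, so both widenings agree on them. */
void check_widening(void)
{
    assert(widen_unsigned('A') == 'A');
    assert(widen_signed('A') == 'A');
    assert(widen_unsigned('A') >= 0);
}
```

Code that cares about signedness would declare its variables with
app_schar or app_uchar and never with bare char, so that a port to an
older compiler means editing three typedefs and nothing else.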
-- 
V7 /bin/mail source: 554 lines.|     Henry Spencer at U of Toronto Zoology
1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry at zoo.toronto.edu


