Difference between char and unsigned char

Karl Heuer karl at haddock.ima.isc.com
Thu Jul 19 09:58:49 AEST 1990


In article <34292 at ut-emx.UUCP> ycy at walt.cc.utexas.edu (Joseph Yip) writes:
>I know char represents 7 bits ASCII and unsigned char works with 8-bit.

Not quite.  `char' is an arithmetic type which is at least eight bits wide,
but it's implementation-defined whether it's signed or unsigned.  For normal
use in text processing, you shouldn't need to know the integer value of a
character, so `char' is sufficient.

The unfortunate exceptions are that the return value of `getc()' and the
argument to a <ctype.h> function are a bastard type: instead of the logically
correct `char', they traffic in `int', whose value is drawn from the union of
`unsigned char' and { EOF }.
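
So the portable idiom is to catch the result in an `int' and test against
EOF before doing anything else; the routine below is only an illustration
(the name is mine):

    #include <stdio.h>
    #include <ctype.h>

    /* Illustrative only: count alphabetic characters on a stream. */
    long count_alphas(FILE *fp)
    {
        int c;      /* int, NOT char: must hold all of unsigned char plus EOF */
        long n = 0;

        while ((c = getc(fp)) != EOF)
            if (isalpha(c))     /* c is already in the <ctype.h> domain */
                n++;
        return n;
    }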

Now, since all normal% characters are contained within the intersection of
`char' and `unsigned char', you can safely ignore this botch if you *know*
you're dealing with the most restrictive kind of text.

>If I pass a unsigned char pointer to a function that expects a char
>pointer ... will there be a problem?  Will [it] mask off my 7th-bit?

No.  At worst you'll need an explicit cast to quiet the compiler; the Standard
guarantees that `char', `signed char', and `unsigned char' have the same size
and alignment, and the same representation for the values they share, so
converting the pointer changes no bits, high bit included.
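
A sketch of the harmless case (the buffer name is invented here):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned char msg[] = "hello";

        /* The cast silences the pointer-type mismatch; the bytes
           pass through untouched, high bit and all. */
        printf("%lu\n", (unsigned long)strlen((char *)msg));
        return 0;
    }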

My recommendation is to always use `char *' for text, and do conversions to
`unsigned char' only in the context of <ctype.h> functions.
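
For example, a toy upcase() routine (mine, not a library function) would do
the conversion right at the call site:

    #include <ctype.h>

    /* Toy example: upper-case a string in place.  The cast keeps a
       high-bit character on a signed-char machine from reaching
       toupper() as a negative (out-of-range) argument. */
    void upcase(char *s)
    {
        for (; *s != '\0'; s++)
            *s = (char)toupper((unsigned char)*s);
    }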

>You know I hate writing the same system library functions where the
>only difference is the 7th-bit.

I don't see any need.  Save your energy for a *real* problem, like wchar_t.

Karl W. Z. Heuer (karl at kelp.ima.isc.com or ima!kelp!karl), The Walking Lint
________
% Besides being true of all ASCII characters, this guarantee is also extended
  to the entire C source character set in non-ASCII alphabets.  Basically this
  forbids an EBCDIC implementation from making `char' a signed 8-bit type.


