Difference between char and unsigned char
Karl Heuer
karl@haddock.ima.isc.com
Thu Jul 19 09:58:49 AEST 1990
In article <34292@ut-emx.UUCP> ycy@walt.cc.utexas.edu (Joseph Yip) writes:
>I know char represents 7 bits ASCII and unsigned char works with 8-bit.
Not quite. `char' is an arithmetic type which is at least eight bits wide,
but it's implementation-defined whether it's signed or unsigned. For normal
use in text processing, you shouldn't need to know the integer value of a
character, so `char' is sufficient.
The unfortunate exceptions are that the return value of `getc()' and the
argument to a <ctype.h> function have a bastard type: instead of the logically
correct `char', they use the union of `unsigned char' and { EOF }.
Now, since all normal% characters are contained within the intersection of
`char' and `unsigned char', you can safely ignore this botch if you *know*
you're dealing with the most restrictive kind of text.
>If I pass a unsigned char pointer to a function that expects a char
>pointer ... will there be a problem? Will [it] mask off my 7th-bit?
No. At worst you'll need to use an explicit cast, but I believe the Standard
contains a clause to guarantee that the behavior is as you expect.
My recommendation is to always use `char *' for text, and do conversions to
`unsigned char' only in the context of <ctype.h> functions.
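(A sketch of that recommendation; `ci_equal' is a hypothetical helper, not a
library function. Each `char' is cast to `unsigned char' before `tolower()',
since a signed char with the high bit set would otherwise pass a negative
value, which is undefined for <ctype.h> functions unless it equals EOF.)

```c
#include <ctype.h>

/* Case-insensitive string comparison over ordinary char* text. */
int ci_equal(const char *a, const char *b)
{
    while (*a != '\0' &&
           tolower((unsigned char)*a) == tolower((unsigned char)*b)) {
        a++;
        b++;
    }
    return tolower((unsigned char)*a) == tolower((unsigned char)*b);
}
```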
>You know I hate writing the same system library functions where the
>only difference is the 7th-bit.
I don't see any need. Save your energy for a *real* problem, like wchar_t.
Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint
________
% Besides being true of all ASCII characters, this guarantee is also extended
to the entire C source character set in non-ASCII alphabets. Basically this
forbids an EBCDIC implementation from making `char' a signed 8-bit type.