Programming and international character sets.

Guy Harris guy at auspex.UUCP
Tue Nov 1 05:09:16 AEST 1988


>There is a Cyrillic version (I think it is 8859/2)

No, 8859/2 is another Latin set; there are four Latin alphabets
(8859/[1234], I think), and there seem to be at least drafts for Greek
and Cyrillic.

>The only time when I've wanted to do this is when stripping off a parity
>bit, and using 0xFF would be totally wrong.  The toascii() macro *might*
>be appropriate.  When you're dealing with a 7 data + 1 parity bit device,
>there is no point in pretending that you're prepared to accept anything
>other than 7 data bits.

Except that most devices can be *told* to handle 8 bits; when you're
dealing with a terminal, never assume that you're dealing with a 7
data + 1 parity bit device (unless your software deals *only* with one
specific terminal that *can't* generate 8 bits).
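
As a minimal sketch of that point, the masking can be made conditional
on what the line is actually set up to do. The function name and the
"seven_bit_line" flag are made up for illustration; how a program finds
out the line's character size is outside the sketch.

```c
/* Hypothetical helper, not part of any library: clean one byte read
 * from a terminal line.  seven_bit_line is an assumed per-device
 * setting that the caller has determined somehow. */
unsigned char clean_input_byte(unsigned char byte, int seven_bit_line)
{
    if (seven_bit_line)
        return (unsigned char)(byte & 0x7F);  /* drop the parity bit */

    return byte;  /* 8-bit data path: the top bit is part of the character */
}
```

On an 8-bit line carrying 8859/1, unconditional stripping would turn
0xE9 (e with acute accent) into 0x69 (plain "i"), which is why the mask
belongs behind a check rather than in the common path.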

>The real problem is trying to write portable code that uses character
>classes which _aren't_ in <ctype.h>.  Consider isvowel()...

Or, for that matter, consider "toupper()"; what's "toupper()" of the
German sharp-s character ("ss", or is it "sz")?  8859/1 has no single
upper-case character for it.
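
A rough sketch of why neither question has a tidy answer, assuming the
text is in 8859/1 (where sharp s is 0xDF) and assuming the usual German
convention of writing it as "SS" in upper case.  Both function names are
invented for illustration; nothing like them is in <ctype.h>.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define LATIN1_SHARP_S 0xDF  /* sharp s in ISO 8859/1 (assumed encoding) */

/* Hypothetical: knows only the unaccented ASCII vowels, so it is
 * already wrong for accented vowels and for other languages. */
int isvowel_ascii(int c)
{
    return c != '\0' && strchr("aeiouAEIOU", c) != NULL;
}

/* Hypothetical: upper-case src into dst, expanding sharp s to "SS".
 * The result can be longer than the input, so a char-to-char
 * toupper() cannot express it.  Returns the length written, or
 * (size_t)-1 if dst (of dstsize bytes) is too small. */
size_t upcase_latin1(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    if (dstsize == 0)
        return (size_t)-1;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (c == LATIN1_SHARP_S) {
            if (n + 2 >= dstsize)       /* two letters plus the NUL */
                return (size_t)-1;
            dst[n++] = 'S';
            dst[n++] = 'S';
        } else {
            if (n + 1 >= dstsize)       /* one letter plus the NUL */
                return (size_t)-1;
            /* In the "C" locale this changes only a-z; other locales
             * may also map accented letters. */
            dst[n++] = (char)toupper(c);
        }
    }
    dst[n] = '\0';
    return n;
}
```

In the default "C" locale, toupper(0xDF) simply hands back 0xDF, since
that locale knows only the 26 unaccented letters.  The string version
has to change the length of the text to get it right, which is exactly
what a one-character-in, one-character-out interface can't do.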



More information about the Comp.lang.c mailing list