wchar_t values

harkcom at spinach.pa.yokogawa.co.jp
Thu Apr 11 10:15:29 AEST 1991


In article <1107 at sranha.sra.co.jp> erik at srava.sra.co.jp
   (Erik M. van der Poel) writes:

 =}EUC is the name of the scheme, while UJIS is the name of the Japanese
 =}EUC. UJIS is not a wchar_t encoding.

   Though the term EUC is used as the name of an encoding scheme, it is
also the name used for the multibyte encoding of the JIS standard using
the SS2 and SS3 single shifts. UJIS is the name used for the 2-byte
encoding of the JIS standard under the EUC scheme. The 2-byte (4-byte on
HP) wide character encodings for Japanese are usually UJIS...
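
   As a rough sketch of the multibyte side, here is how the length of one
Japanese EUC character could be read off its first byte. It assumes the
commonly documented layout (SS2 = 0x8E followed by one byte, SS3 = 0x8F
followed by two bytes, otherwise two bytes in the range 0xA1-0xFE). The
names EUC_SS2, EUC_SS3 and eucjp_char_len are made up for illustration.

```c
#include <stddef.h>   /* size_t */

/* Single-shift bytes as commonly documented for Japanese EUC (assumption). */
#define EUC_SS2 0x8E
#define EUC_SS3 0x8F

/*
 * Hypothetical helper: given the first byte of a character, return how
 * many bytes the whole multibyte character occupies, or 0 if the byte
 * cannot start a character under the assumed layout.
 */
size_t eucjp_char_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* code set 0: ASCII, one byte */
    if (lead == EUC_SS2)
        return 2;               /* code set 2: SS2 + one byte */
    if (lead == EUC_SS3)
        return 3;               /* code set 3: SS3 + two bytes */
    if (lead >= 0xA1 && lead <= 0xFE)
        return 2;               /* code set 1: two bytes, both with bit 8 set */
    return 0;                   /* not a valid first byte */
}
```

   Note that this only describes the multibyte form. What value a wide
character gets for the same character is a separate choice made by the
implementation.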

 =}You're probably referring to the European characters with the 8th bit
 =}up. These are not relevant in this discussion since the ANSI C wchar_t
 =}spec explicitly refers to the basic character set, which does not
 =}include these European characters.

   But my point was that if you have a single-byte wide character using
all 255 characters, it would be a disability to require that the multibyte
encoding and the wide character encoding be unequal. This does seem to be
relevant to this discussion...

 =}Keld is referring to the problem that I brought up in the first
 =}article in this thread. I.e. 10646 'c' does not have the same numeric
 =}value as ASCII 'c'.

   OK, let me get this straight. The numeric value of multibyte 'c' does
not have to equal the numeric value of wide character 'c' under ISO 10646.
You feel that this is a problem because you then become unable to use
such things as:  ('c' == L'c') or ('c' == ((char)L'c'))... Or, in
other words, comparisons of ASCII characters in the mb format with the
equivalent in the wc format cannot be done so simply. My question is,
is it so important to be able to do such comparisons that we should limit
the encodings allowed for wide characters? The comparisons of mb to mb and
wc to wc are legit either way...
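
   For what it's worth, here is a minimal sketch of how an mb to wc
comparison could still be written if the numeric values were allowed to
differ. It converts the char with the standard mbtowc() from <stdlib.h>
and then compares wc to wc. The helper name narrow_equals_wide is
hypothetical, and it assumes the character occupies a single byte in the
current locale.

```c
#include <stddef.h>   /* wchar_t, NULL */
#include <stdlib.h>   /* mbtowc */

/*
 * Hypothetical helper: return 1 if the single-byte character c, once
 * converted to a wide character under the current locale, equals wc;
 * return 0 otherwise (including when c is not a complete character).
 */
int narrow_equals_wide(char c, wchar_t wc)
{
    wchar_t converted;
    int n;

    /* Put mbtowc back into its initial shift state. */
    (void)mbtowc(NULL, NULL, 0);

    n = mbtowc(&converted, &c, 1);
    if (n == 0)
        return wc == L'\0';     /* c was the null character */
    if (n != 1)
        return 0;               /* invalid or incomplete in one byte */

    /* Both sides are now wide characters, so == is meaningful. */
    return converted == wc;
}
```

   With this, narrow_equals_wide('c', L'c') takes the place of
('c' == L'c'), at the cost of a function call per comparison.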

Al


