Character handling functions -- Jan 88, dpANS

David H. Wolfskill david at dhw68k.cts.com
Sat Mar 26 07:34:10 AEST 1988


In reading a copy of the 11 January 1988 dpANS C Standard
(X3J11/88-001), I ran across something in the library's character
handling routines that I suspect I do not adequately understand.

I realize that the draft standard attempts to accommodate alphabets
other than the English one, and that such an alphabet is not used by
default: it must be selected by specifying a non-default "locale" (the
default locale is the "C" locale).

In section 4.3.1.2, the description of the "isalpha" function reads:

	The isalpha function tests for any character for which isupper
	or islower is true, or any of an implementation-defined set of
	characters for which none of iscntrl, isdigit, ispunct, or
	isspace is true.  In the "C" locale, isalpha returns true only
	for the characters for which isupper or islower is true.

In section 4.3.1.6, the description of the "islower" function reads:

	The islower function tests for any lower-case letter or any of
	an implementation-defined set of characters for which none of
	iscntrl, isdigit, ispunct, or isspace is true.  In the "C"
	locale, islower returns true only for the characters defined as
	lower-case letters (as defined in [section]2.2.1).

In section 4.3.1.10, the description of the "isupper" function reads:

	The isupper function tests for any upper-case letter or any of
	an implementation-defined set of characters for which none of
	iscntrl, isdigit, ispunct, or isspace is true.  In the "C"
	locale, isupper returns true only for the characters defined as
	upper-case letters (as defined in [section]2.2.1).

For the "C" locale, I see no problem whatsoever.  Since this is probably
the only locale I am likely to use, the issue I am bringing up does not
directly affect me; nevertheless, I would like to determine whether or
not my present understanding is shared by others.

I perceive two concerns:

1)	It would seem to be possible for a character -- interpreted in a
	locale other than the "C" locale -- to cause isalpha to return
	true, yet cause both isupper and islower to fail to return true.

	Is this both expected and reasonable?

2)	Similarly, it would seem to be possible for a character to
	cause isalpha to fail to return true, yet cause either (or
	both!) of isupper and islower to return true.

	Likewise, is this both expected and reasonable?
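
To make the two concerns concrete, here is a rough sketch of how one
might look for such characters in whatever locale is currently in
effect.  The function names (alpha_without_case, case_without_alpha,
count_matching) are my own inventions for illustration and appear
nowhere in the draft; the only library facilities assumed are the
<ctype.h> tests quoted above and UCHAR_MAX from <limits.h>.

```c
#include <ctype.h>
#include <limits.h>

/* Concern 1: isalpha is true, yet neither isupper nor islower is. */
int alpha_without_case(int c)
{
    return isalpha(c) && !isupper(c) && !islower(c);
}

/* Concern 2: isupper or islower is true, yet isalpha is not. */
int case_without_alpha(int c)
{
    return !isalpha(c) && (isupper(c) || islower(c));
}

/*
 * Count the values in 0..UCHAR_MAX for which pred returns nonzero.
 * Only values representable as unsigned char are examined, since
 * those (and EOF) are the only valid arguments to the is* functions.
 */
int count_matching(int (*pred)(int))
{
    int c, n = 0;

    for (c = 0; c <= UCHAR_MAX; c++)
        if (pred(c))
            n++;
    return n;
}
```

In the "C" locale both count_matching(alpha_without_case) and
count_matching(case_without_alpha) should come out as zero; what they
yield after a call to setlocale depends entirely on the implementation
and the locale selected.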

Here is a (partial) list of approaches (assuming that the cited wording
needs to be fixed):

1)	Include "islower" in the "stop list" for "isupper", and vice
	versa.

2)	Specify that a character that causes isalpha to return true must
	cause precisely one of islower or isupper to return true.

3)	Specify that a character that causes either islower or isupper
	to return true must also cause isalpha to return true.
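
Stated as per-character conditions, the three approaches might look
like the sketch below.  This is only my reading of them, not wording
from the draft, and the names ok_exclusive_case, ok_alpha_has_one_case
and ok_case_implies_alpha are hypothetical.  Each function returns 1
when the character satisfies the corresponding constraint and 0 when
it does not.

```c
#include <ctype.h>

/* Approach 1: no character is both upper-case and lower-case. */
int ok_exclusive_case(int c)
{
    return !(isupper(c) && islower(c));
}

/*
 * Approach 2: if isalpha is true, then precisely one of islower or
 * isupper is true.  The is* functions promise only "nonzero" for
 * true, so each result is normalized to 0 or 1 before comparing.
 */
int ok_alpha_has_one_case(int c)
{
    int upper = isupper(c) != 0;
    int lower = islower(c) != 0;

    return !isalpha(c) || (upper != lower);
}

/* Approach 3: if islower or isupper is true, then isalpha is true. */
int ok_case_implies_alpha(int c)
{
    return !(isupper(c) || islower(c)) || (isalpha(c) != 0);
}
```

A locale in which every character value passes all three checks would
exhibit neither of the concerns raised above.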

Another approach, of course, would be to explicitly state (perhaps in
the Rationale) that the above-described behavior really is desired.
(Perhaps it's just my provincialism, but this really does seem a bit
unlikely to me.)

I look forward to seeing your comments on the above,
david
-- 
David H. Wolfskill
uucp: ...{trwrb,hplabs}!felix!dhw68k!david	InterNet: david at dhw68k.cts.com


