Programming and international chara

mcdaniel at uicsrd.csrd.uiuc.edu
Thu Nov 3 04:08:00 AEST 1988


Written 7:27 pm Oct 27, 1988 by kjartan at rhi.hi.is in comp.emacs:
> Another way of doing this is using "is.." functions that are defined
> in <ctype.h>, an include file that comes with (almost) all C
> compilers.  Some of the above lines would look like this:
> 
> fileio.c:	  if (iscntrl( fn[tel++] ) )
> input.c:				if (iscntrl(buf[--cpos]) ) {
> input.c:				if (iscntrl(buf[--cpos])) {
> 
> This code is better (most of the is.. things are macros that mask the
> argument and return . . . either zero or positive), has more style to
> it and is easier to port to a different character set.

A little while ago, there was a discussion in comp.lang.c about the
"is..."  functoids.  I call them "functoids" because they resemble
functions in use, but may be either functions OR macros.  One possible
macro implementation is:
	#define iscntrl(c)	( (c) >= 0 && (c) <= 037 )

(The first test is because implementations are permitted by dpANS to
have signed characters.)  In this case, if "c" has side effects, the
side effects will be performed twice.
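
To make the double evaluation concrete, here is a minimal sketch.  The
names MY_ISCNTRL, my_iscntrl_fn, steps_with_macro and
steps_with_function are made up for this illustration (so as not to
clash with <ctype.h>); the macro body is the one shown above.  Each
function reports how far "cpos" moved after a single test:

```c
#include <assert.h>

/* Hypothetical macro with the same body as the example above.
   Its argument appears twice, so it may be evaluated twice. */
#define MY_ISCNTRL(c)	( (c) >= 0 && (c) <= 037 )

/* Hypothetical function version: the argument is evaluated once. */
int my_iscntrl_fn(int c)
{
    return c >= 0 && c <= 037;
}

/* Returns how many times cpos was decremented by one macro test. */
int steps_with_macro(void)
{
    char buf[] = { 'a', '\t', '\n' };
    int cpos = 3;
    int r = MY_ISCNTRL(buf[--cpos]);	/* expands to two buf[--cpos] */
    (void) r;
    return 3 - cpos;
}

/* Returns how many times cpos was decremented by one function call. */
int steps_with_function(void)
{
    char buf[] = { 'a', '\t', '\n' };
    int cpos = 3;
    int r = my_iscntrl_fn(buf[--cpos]);	/* one decrement only */
    (void) r;
    return 3 - cpos;
}
```

With this buffer, assert(steps_with_macro() == 2) and
assert(steps_with_function() == 1) should both hold: the "&&" lets the
second buf[--cpos] run because the first test succeeded.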

As a minor point, consider this statement from the BSD 4.3 man page
for the "is..." functoids:
                . . . Isascii and toascii are defined on all
     integer values; the rest are defined only where isascii is
     true and on the single non-ASCII value EOF (see stdio(3S)).

All that C guarantees is that a "char" variable can hold all the
values in the host character set.  It may be larger, and thus able to
hold more.  Consider, for instance, a computer with 8-bit "char"s but
using 7-bit ASCII.  These functoids may therefore fail if the eighth
bit is set.

Therefore, safer versions of the three lines quoted above would be:
  fileio.c:	  tel++;  if (isascii(fn[tel-1]) && iscntrl(fn[tel-1]) )
  input.c:	  cpos--; if (isascii(buf[cpos]) && iscntrl(buf[cpos]) ) {
  input.c:	  cpos--; if (isascii(buf[cpos]) && iscntrl(buf[cpos]) ) {

(Of course, it might be "isjapanese()" instead, but you get the point.)
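
One way to keep both the range check and the single evaluation in one
place is a small wrapper function.  This is only a sketch:
safe_iscntrl and count_cntrl are hypothetical names, not anything from
the code under discussion, and the explicit 0..0177 range test stands
in for isascii(), which not every <ctype.h> provides.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper: consult iscntrl() only for 7-bit ASCII values.
   Being a function, it evaluates its argument exactly once. */
int safe_iscntrl(int c)
{
    return c >= 0 && c <= 0177 && iscntrl(c) != 0;
}

/* Hypothetical helper: count the control characters in buf[0..len-1].
   Each element is read once, so the index is bumped once per step. */
int count_cntrl(const char *buf, int len)
{
    int i = 0;
    int n = 0;

    while (i < len) {
        if (safe_iscntrl(buf[i++]))
            n++;
    }
    return n;
}
```

A character with the eighth bit set, whether it arrives as a negative
"char" or as a value above 0177, is simply reported as "not a control
character" instead of being handed to iscntrl().  For example,
assert(count_cntrl("a\tb\n\351", 5) == 2) should hold.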

-- 
Tim, the Bizarre and Oddly-Dressed Enchanter
Center for Supercomputing Research and Development
at the University of Illinois at Urbana-Champaign

Internet, BITNET:  mcdaniel at uicsrd.csrd.uiuc.edu
UUCP:    {uunet,convex,pur-ee}!uiucuxc!uicsrd!mcdaniel
ARPANET: mcdaniel%uicsrd at uxc.cso.uiuc.edu
CSNET:   mcdaniel%uicsrd at uiuc.csnet
DECnet?: GARCON::"mcdaniel at uicsrd.csrd.uiuc.edu"


