Ctype.h (start arrays at 1 then add 1 before looking up)

matt at UCLA-LOCUS.ARPA
Wed Mar 28 20:58:01 AEST 1984


From:            Matthew J. Weinstein <matt at UCLA-LOCUS.ARPA>

	Date: 25 Mar 84 17:58:07-PST (Sun)
	To: Unix-Wizards at Brl-Vgr.ARPA
	From: decvax!mcnc!ecsvax!bet at Ucb-Vax.ARPA
	Subject: Re: Ctype.h (start arrays at 1 then add 1 before looking up)

	Article-I.D.: ecsvax.2189

	In a lexical analyzer I wanted translate tables for values returned by
	getchar() -- including EOF (-1). I wanted them FAST. So I created arrays
	like this:

	struct
	{
		char dummy;
		char class[128];
	} character=
	{
		/* list of 129 values for characters, starting with EOF */
	};

	My reasoning was as follows: members of a structure of homogeneous composition
	(no alignment problems) occupy consecutive locations in memory. C, god bless
	its black-hearted soul, doesn't attempt subscript bounds checking. Finally,
	character.class evaluates to a constant expression at compile time, which
	C compilers can (and my reading suggests they will) simplify at compile time.
	Therefore, I think I have a legal array with subscripts ranging from -1 to 127.
	Anything wrong with this? Shouldn't it be faster than always using array[i+1]
	(or evaluating i+1 into a temporary)? Inasmuch as I explained the trick clearly
	in a comment, I am not interested in arguments like "UGLY" or "confusing".
						Bennett Todd
						...{decvax,ihnp4,akgua}!mcnc!ecsvax!bet

---

I did a bit of experimenting with the following sort of code:

	{
	static char table[129];
	register int i;
	register char *ptr = &table[1];
	...
	y = ptr[i];
	...
	}

The generated assembly for this is basically (base,index,dest):

	cvtbl (rB)[rI],rD

(Note that y is an int because register chars don't get to live in registers;
if y is declared as char, the generated code stores it relative to the FP on
the Vax.)

The sequence:

	y = table[i+1]

generates reasonable code too:

	cvtbl Ltable+1[rI],rD

[Of course, if table is allocated dynamically, the first of the two forms
(initializing a pointer) is less expensive, since otherwise table's
offset must be recomputed at each access]

There doesn't seem to be any gain to building a structure in this case.
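
For anyone who wants to try both forms side by side, here is a minimal
sketch of the two lookups discussed above. The class values and the
function names (init_table, classify_add, classify_ptr) are made up for
illustration and are not from the original lexer; the sketch also assumes
EOF is -1 and that input is 7-bit ASCII.

```c
#include <assert.h>  /* assert */
#include <stdio.h>   /* EOF */

#define NCLASS 129   /* one slot for EOF plus 128 ASCII codes */

/* Hypothetical character classes, for illustration only. */
enum { C_EOF = 0, C_OTHER = 1, C_DIGIT = 2 };

static char table[NCLASS];

/* Fill the table: slot 0 stands for EOF, slot c+1 for character c. */
static void init_table(void)
{
	int c;

	assert(EOF == -1);  /* assumption: the +1 bias only works if EOF is -1 */
	table[0] = C_EOF;
	for (c = 0; c < 128; c++)
		table[c + 1] = (c >= '0' && c <= '9') ? C_DIGIT : C_OTHER;
}

/* Form 1: add 1 on every lookup. */
static int classify_add(int i)
{
	return table[i + 1];
}

/* Form 2: bias the pointer once, so that base[-1] is table[0]. */
static int classify_ptr(int i)
{
	const char *base = &table[1];

	return base[i];
}
```

Both functions accept any value from -1 through 127 and should return the
same class for it. Since base points into the middle of a single array,
base[-1] stays inside that array, so no padding member or structure is
needed to make the negative subscript work.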

				- Matt
				matt at ucla-locus
				{ihnp4,ucbvax}!ucla-s!matt


