Generating code for the switch statement

Tony Williams asw at tony.UUCP
Wed Aug 20 21:26:32 AEST 1986


In article <840 at edison.UUCP> jso at edison.UUCP (John Owens) writes:
>
>In article <610 at hropus.UUCP>, ka at hropus.UUCP (Kenneth Almquist) writes:
>> If the cases are not consecutive, the branch table must be filled with
>> entries for the "missing" cases, making the table larger.  [....]  If
>> the cases are mostly consecutive with a few outliers, a branch table
>> could be augmented with a few specific tests for the outliers, but I
>> don't know of any compiler that does this.
>
>The Microsoft Pascal/Fortran compilers, at least some of the old
>ones (3.04 and 3.11) do this.  I haven't tried it to see if their
>wonderful new code generators do.
>

I am not sure that this is original, as I think it has been used in
BCPL compilers and the like for ages.
I implemented this in a C compiler for an unusual machine (a Three Rivers
Perq running the CMU Accent operating system, with our own Unix emulation
layered on top).  See comments following for a discussion of the advantages.

The algorithm is roughly as follows (barring 
off-by-one errors - my favourite):

genSwitch( caseLabels, addresses, labelCount )
    int *caseLabels;		/* array of label values */
    <mumble> *addresses;	/* array of addresses */
    int labelCount;		/* number of labels */
{
    int labelRange = caseLabels[labelCount-1] - caseLabels[0];
    if( labelRange <= acceptable_sparseness * labelCount )
		/* suitably dense */
	genJumpTable(  caseLabels, addresses, labelCount )
    else
    {

	/* too sparse - generate some tests
	 * first find some suitable median value of label,
	 * say one that divides the table in half for simplicity
	 */

	int median = findMedian( caseLabels, labelCount );
				/* eg labelCount / 2; */
				/* result is place in table, not value */
	genIfStmt( switchExpr, caseLabel[median], less_than );
		
	/* now do the lower half (expr < median) */
	genSwitch( caseLabels, addresses, median );
		
	startElse();
		
	/* now the upper half (expr >= median) */
	genSwitch( &caseLabels[median], &addresses[median], labelCount-median );

    }
}

This algorithm has several nice features:
  - the sparsesness threshold can be tuned without hacking code
  - the use of recursion for each portion of the divided switch
    means that locally dense clusters in a sparse switch can each have their
    own jump table, reached by test-and-jumps
  - the tests for out-of-range expressions can be factored in to the
    algorithm,
  - the choice of division point can be made more intelligently if you
    wish (eg by maximising the density of one or both portions),
    by only changing findMedian()
  - non-dense cases are effectively handle by binary search,
    coded as a jump tree.  This approaches optimal, if there is
    no information on dynamic frequencies.

The generated code (subject to findMedian) is usually within one extra test
of being optimal.  Not bad for a simple-minded algorithm.

As a footnote, the virtual machine architecture had no indirect jump
instruction, so the same technique (and code) was used here by Trudy Watson
in implementing FORTRASH computed goto's and (shudder) assigned goto's.
---------------------------------------------------------------------------
Tony Williams					|Informatics Division
UK JANET:	asw at uk.ac.rl.vd			|Rutherford Appleton Lab
Usenet:		{... | mcvax}!ukc!rlvd!asw	|Chilton, Didcot
ARPAnet:	asw%vd.rl.ac.uk at ucl-cs.arpa	|Oxon OX11 0QX, UK



More information about the Comp.lang.c mailing list