Faster C

Chris Torek chris at mimsy.UUCP
Sat Feb 6 16:13:51 AEST 1988


>In article <473 at aati.UUCP> fish at aati.UUCP (William G. Fish) writes:
>>I wish to make the following C code run as fast as possible under 4.1 BSD 
>>on a VAX 11/750.

>>declarations:
>>    register short *in;
>>    register float *out;
>>    register c;
>>    int C, S;
>>    register sample, s;
>>
>>loop:
>>    for (s = 0; s < S; s++, c += C) {
>>	out[s] = sample = in[c];	/* short to float conversion */

In article <4177 at june.cs.washington.edu> pardo at june.cs.washington.edu
(David Keppel) writes:
>If you change the out[s] and in[c] to use a pointer that is incremented
>each iteration, you may be able to save yourself an ashl each time.

The `out[s]' and `in[c]' should generate VAX `subscript' mode instructions,
something like

	cvtwl	(in)[c],sample
	cvtlf	sample,(out)[s]

and indeed, feeding the equivalent through /lib/ccom produces

	cvtwl	(r11)[r9],r7
	cvtlf	r7,(r10)[r8]

A pointer version might be a wee bit faster for `out':

	for (s = 0; s < S; s++, c += C) {
		*out++ = sample = in[c];

or, in VAX assembly,

	cvtwl	(r11)[r9],r7
	cvtlf	r7,(r10)+
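
For anyone who wants to convince themselves that the two forms store
the same values, here is a minimal sketch in present-day ANSI C (not
the K&R dialect the 4.1 BSD compiler accepts).  The function names
convert_indexed and convert_pointer, the parameter list, and the
returned last sample are my own assumptions for illustration; the
original posting shows only the loop fragment.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Indexed form, as in the original loop: out[s] = sample = in[c].
 * Returns the last sample converted (0 if S <= 0); the return value
 * is a hypothetical addition so that `sample' is actually used.
 */
int convert_indexed(const short *in, float *out, int c, int C, int S)
{
    int s, sample = 0;

    assert(in != NULL && out != NULL);
    for (s = 0; s < S; s++, c += C) {
        out[s] = sample = in[c];    /* short -> int -> float */
    }
    return sample;
}

/*
 * Pointer form: `out' is walked with post-increment, which is what
 * lets the compiler use autoincrement mode for the store.
 */
int convert_pointer(const short *in, float *out, int c, int C, int S)
{
    int s, sample = 0;

    assert(in != NULL && out != NULL);
    for (s = 0; s < S; s++, c += C) {
        *out++ = sample = in[c];    /* same conversion, no out[s] */
    }
    return sample;
}
```

Calling both with the same `in', starting index c, stride C and count
S should fill two output arrays identically.  Whether the pointer form
is actually faster depends on the compiler and the machine.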

One more tiny gain is to loop down to zero instead of up to S:

loop:
	for (s = S, out += s, c += C * s; c -= C, --s >= 0;) {
		*--out = sample = in[c];
		...

If the loop is short enough (8 instructions or fewer), the
optimiser (/lib/c2) will turn the decrement/test/branch into
a single `sobgeq' instruction.  The original loop looked
longer than that.  Still,

	decl	rN
	bgeq	loop

will be ever so slightly faster than

	incl	rN
	cmpl	rN,-S(fp)
	blss	loop

on a 750.  Since in the original code fragment `c' did not count
up from zero, we still need a counter like `s'.
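
The count-down loop is easy to get wrong by one, so here is a small
sketch that runs it next to the count-up loop.  It is written in
present-day ANSI C; the names convert_up and convert_down, the
parameter list, and the returned last sample are assumptions of mine,
not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Count-up reference: s runs 0..S-1, c advances by C each pass. */
int convert_up(const short *in, float *out, int c, int C, int S)
{
    int s, sample = 0;

    assert(in != NULL && out != NULL);
    for (s = 0; s < S; s++, c += C) {
        *out++ = sample = in[c];
    }
    return sample;      /* last element converted, i.e. the highest */
}

/*
 * Count-down form: start one past the end of both sequences, then
 * step c back by C and pre-decrement `out' on every pass.  The loop
 * test is just `--s >= 0', a decrement and a branch on the sign.
 */
int convert_down(const short *in, float *out, int c, int C, int S)
{
    int s, sample = 0;

    assert(in != NULL && out != NULL);
    for (s = S, out += s, c += C * s; c -= C, --s >= 0;) {
        *--out = sample = in[c];
    }
    return sample;      /* last element converted, i.e. the lowest */
}
```

Both functions store the same values in the same slots; only the order
of the stores differs, so any code in the loop body that depends on
visiting the samples in ascending order would need a second look.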
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at mimsy.umd.edu	Path:	uunet!mimsy!chris


