Can C default to float? Are there float regs?

Wed Sep 25 23:53:06 AEST 1985

In article <56 at escher.UUCP> doug at escher.UUCP (Douglas J Freyburger) writes:
>The discussion has been about single vs double precision
>floating point as well as keeping them in registers.  It
>was mentioned that there would be a big speed difference
>on the VAX.
>
>At site "cithep", the Caltech High Energy Physics
>Department, they made a C compiler that used single
>precision for "float".  At first, they got ALMOST NO SPEED
>IMPROVEMENT.  After adding floating point immediate values
>to the assembler produced (instead of storing constants
>like strings and then referring to them by name), they got
>a pretty good improvement.  10-20%.  I don't know if they
>were running a 750 or 780.  Still, for calculations that
>only involve variables and no constants, the difference is
>much smaller than you'd think.

Here is the simple test I used:

#include <stdio.h>
main()
{
  register float x=1.1, y=1.0;
  register int i;
  for(i=0; i<100000; i++)
    {
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
    }
}

750/FPA results using 4.2bsd compiler (with "-O")

9.6u 0.6s 0:11 91% 1+3k 1+2io 2pf+0w

750/FPA results using 4.3bsd compiler (with "-O -f" i.e. use single
precision float)

1.8u 0.1s 0:01 101% 1+3k 0+2io 2pf+0w

This is an artificially dramatic result.  The following is more typical.

Here is a summary of the FFT times for a program written entirely in C (no
assembler code).

It uses either single precision (4.3bsd compiler with "-O -f") or double
precision (4.3bsd or 4.2bsd compiler; the two are effectively the same).

The single precision times with the 4.2bsd compiler are longer than the
double precision times (because of all the float/double conversions), so it
is pointless to list them as an alternative.
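
The conversions come from the K&R rule that float operands are widened to
double before any arithmetic is done.  A minimal sketch of the two
strategies, written in present-day C with the widening spelled out as casts.
The function names are made up for illustration, and this is not the code
either compiler actually emits:

```c
/* Hypothetical helpers, for illustration only. */

/* What the K&R rule requires for "x = x + y" when x and y are float:
 * widen both operands to double, add in double, then narrow the sum
 * back to float. */
float add_via_double(float x, float y)
{
    double wide_x = (double)x;      /* float -> double conversion */
    double wide_y = (double)y;      /* float -> double conversion */
    double sum = wide_x + wide_y;   /* double precision add */

    return (float)sum;              /* double -> float conversion */
}

/* What a single precision mode allows: one single precision add and
 * no conversions.  (Assumption: the compiler in use really does
 * evaluate float + float in single precision; most current ones do.) */
float add_single(float x, float y)
{
    return x + y;
}
```

For a single addition the two functions should return the same value on
typical hardware.  The difference is the extra conversions around every
operation, which is how float code can end up slower than double code.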

Summary of FFT Times:

1024 complex FFTs and IFFTs, times in milliseconds

Single Precision

Machine         Radix    FFT    IFFT
Vax750 w/FPA      4      210     210
Vax750 w/FPA      2      280     290
Vax785 w/FPA      4       80      90
Vax785 w/FPA      2      100     110

Double Precision

Machine         Radix    FFT    IFFT
Vax750 w/FPA      4      350     350
Vax750 w/FPA      2      360     390
Vax785 w/FPA      4      200     210
Vax785 w/FPA      2      220     230

These are considerable improvements on both the 750 and the 785 (more than
the 10-20% quoted above), and they come from a program with a balance of
control code and floating point arithmetic that is typical of signal
processing.
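
As a rough check on the "more than 20%" figure, the saving can be worked out
from the forward FFT times in the summary.  A small sketch (the helper names
are made up for illustration):

```c
#include <stdio.h>

/* Hypothetical helper: percentage of run time saved by going from
 * double precision (double_ms) to single precision (single_ms). */
double percent_saved(double single_ms, double double_ms)
{
    return 100.0 * (double_ms - single_ms) / double_ms;
}

/* Print the saving for each forward FFT pair from the summary. */
void print_fft_savings(void)
{
    printf("750 radix 4: %.0f%%\n", percent_saved(210.0, 350.0));
    printf("750 radix 2: %.0f%%\n", percent_saved(280.0, 360.0));
    printf("785 radix 4: %.0f%%\n", percent_saved(80.0, 200.0));
    printf("785 radix 2: %.0f%%\n", percent_saved(100.0, 220.0));
}
```

Calling print_fft_savings() gives roughly 40% and 22% on the 750, and
roughly 60% and 55% on the 785.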