Highly Optimizable Subset of C (was: Fortran vs. C for numerical work)

Mon Nov 26 08:11:31 AEST 1990

In article <1990Nov24.201731.3442 at cunixf.cc.columbia.edu>,
shenkin at cunixf.cc.columbia.edu (Peter S. Shenkin) writes:
|> 
|> The difficulty of optimizing C comes from C features (pointers) absent
|> in Fortran.  It has been observed that C programs translated from Fortran
|> using f2c run about as fast as the Fortran versions, which seems to
|> imply that
|> (1) such translations do not use the problematic C features, and (2) if
|> the problematic C features are avoided, C compilers optimize about as well
|> as Fortran compilers;  in fact, much of the optimization goes on at the 
|> intermediate code level, doesn't it?
|> 
|> Now, many proposals have been made to improve C optimization:  the use
|> of "noalias", #pragmas, and so on.  But the above observations would seem to
|> imply that if the programmer simply restricts him/herself to a Fortran-like
|> "highly optimizable subset" of C, then he/she can expect Fortran-like
|> performance out of any reasonably good C compiler.
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I don't think this is true; it may seem to be true only because of shared
components between C and Fortran compilers from the same vendors.

On many systems where the C and Fortran compilers produce comparably good 
code (for programs written in the same style in both languages), they are
essentially the same compiler.  Either:

    1.  both have poor analysis and great peephole optimizations
        (the Fortran compiler has been adapted from the C compiler), or

    2.  both have great analysis and loop-level optimization
        (the C compiler has been adapted from the Fortran compiler,
         or they have been developed together).

(1) is the case for most "scalar" Unix systems.  (2) is the case for most 
vector and parallel Unix systems (Convex and Stardent, at least).  In the 
case of (2), much of the optimization is done at the source level, or else
in an intermediate language that still admits loops and array subscripting.
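
To make the "loops and array subscripting" point concrete, here is a minimal
sketch of the same computation written in two styles.  The function names and
the SAXPY-style loop are my own illustration, not taken from any particular
compiler's documentation.  The first form is what a Fortran-derived optimizer
sees directly; the second does the same arithmetic by walking pointers, so
the compiler has to recover the induction variable and trip count before it
can apply loop-level transformations.

```c
#include <assert.h>
#include <stddef.h>

/* Fortran-like style: counted loop, explicit subscripts, no pointer
   arithmetic.  The loop and its subscripts are visible as written. */
void saxpy_subscript(int n, float a, const float x[], float y[])
{
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        y[i] = y[i] + a * x[i];
}

/* Pointer-walking style: same arithmetic, but the index and the
   trip count are implicit in the pointer updates. */
void saxpy_pointer(int n, float a, const float *x, float *y)
{
    const float *end;

    assert(n >= 0);
    end = x + n;
    while (x < end)
        *y++ += a * *x++;
}
```

Both functions compute the same result for non-overlapping x and y.  Note
that even the subscripted form is not quite Fortran: a C compiler may not
assume that x and y refer to distinct storage, whereas Fortran forbids that
kind of overlap between dummy arguments when one of them is modified.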

Since the same compiler technology can be, should be, and often is applied
to the intersection of Fortran and C, I think the issue of which compilers 
are better is moot.

Peter's second question is a good topic for further investigation:

|>	(2) Just what is this highly optimizable subset of C? 

Hopefully, it includes some (but surely not all) elements of (C - Fortran). 
Compiler researchers (like me) are trying to enlarge the optimizable subset,
but it would be interesting to learn what current commercial compilers can
deal with.
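
As one illustration of an element of (C - Fortran) that is hard to handle,
consider pointer arguments that may overlap.  The sketch below is a
hypothetical example of my own (the name scale_by_first is made up); it
shows why a compiler cannot simply load in[0] once before the loop unless it
can prove that out and in do not alias.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply every element of in[] by in[0], storing into out[].
   If out and in overlap, the store to out[0] changes in[0], so the
   value of in[0] is not loop-invariant in general. */
void scale_by_first(int n, float *out, const float *in)
{
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        out[i] = in[i] * in[0];
}
```

Called with distinct arrays and in = {2, 3, 4}, this yields {4, 6, 8}.
Called with out and in pointing at the same array, the first store changes
in[0] to 4 and the result is {4, 12, 16}.  A compiler that hoisted the load
of in[0] out of the loop would get the second case wrong, which is the kind
of restriction that proposals like "noalias" are meant to lift.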

--Paul