no noalias not negligible (long)

Chris Torek chris at mimsy.UUCP
Sat May 21 14:18:12 AEST 1988


In article <54080 at sun.uucp> dgh%dgh at Sun.COM (David Hough) writes:
>The results for double precision linpack on Sun-4 using SunOS 4.0 and
>Fortran 1.1 were [edited to just `rolled' case]:
>Fortran	1080 KFLOPS
>C		 850 KFLOPS

>      subroutine daxpy(n,da,dx,dy)
>      doubleprecision dx(1),dy(1),da
>      integer i,n

Incidentally, as we have just seen in comp.arch, this Fortran version
is illegal: it should declare dx and dy as

	integer n
	double precision dx(n), dy(n)

>      do 30 i = 1,n
>        dy(i) = dy(i) + da*dx(i)
> 30   continue
>      return
>      end

>The corresponding rolled C code could be written with a for loop
>daxpy(n, da, dx, dy)
>        double          dx[], dy[], da;
>        int             n;
>{
>        int             i;
>
>        for (i = 0; i < n; i++) {
>                dy[i] = dy[i] + da * dx[i];

I suggest

		dy[i] += da * dx[i];

as it is easier to understand.  (Any reasonable optimising C compiler
should produce the same code for both forms.)

>        }
>}
> 
>but [the] Sun compilers ... won't unroll [these] loops.... [Hand unrolling
>helped but not as much as expected.]

>Investigation revealed that the reason had to do with noalias:  [the
>Fortran version is] defined by the Fortran standard to be "noalias",
>meaning a compiler may optimize code based on the assumption that [dy
>and dx are distinct].

[X3J11's `noalias' proposal was deleted for various reasons including]
>3) optimizing compilers should be able to figure out if aliasing
>exists, which is definitely false in a separate compilation environment
>(unless you want the linker to recompile everything, in which case the
>linker is the compiler, and you're back to no separate compilation).

This is not quite right: the linker is to be the *code generator*, not
the *compiler*.  Code generation is a (relatively) small part of the
task of compilation.  Naturally, a code-generating linker will take
longer to run than a simple linking linker, which discourages the
approach somewhat.  The usual solution is to have the compiler proper
generate code only when it is told not to optimise, and to leave code
generation to the linker otherwise.

>Anyway there is no portable way in draft ANSI C to say "these pointers
>are guaranteed to have no aliases".
>... you don't dare load dx[i+1] before you store dy[i] if there is
>any danger that they point to the same place.

True.
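
To make the hazard concrete, here is a minimal sketch in C.  The names
daxpy_in_order and daxpy_load_ahead are hypothetical, chosen only for
illustration; the second is a hand-written stand-in for the kind of
reordering a compiler might like to do if it could assume no aliasing,
not the output of any actual compiler.

```c
/* Reference version: each element is loaded, updated and stored in order. */
void daxpy_in_order(int n, double da, const double *dx, double *dy)
{
    int i;

    for (i = 0; i < n; i++)
        dy[i] = dy[i] + da * dx[i];
}

/*
 * Hand-reordered version (illustrative only): dx[i+1] is loaded
 * BEFORE dy[i] is stored.  This gives the same answer as the
 * reference version only if the store to dy[i] cannot change dx[i+1],
 * i.e. only if dx and dy do not overlap.
 */
void daxpy_load_ahead(int n, double da, const double *dx, double *dy)
{
    int i;
    double x_next;

    if (n <= 0)
        return;
    x_next = dx[0];
    for (i = 0; i < n; i++) {
        double x = x_next;

        if (i + 1 < n)
            x_next = dx[i + 1];     /* early load */
        dy[i] = dy[i] + da * x;     /* store comes after the load */
    }
}
```

Called with distinct arrays the two agree.  Called with n = 3, da = 1.0,
dx = v and dy = v + 1 on v = {1, 1, 1, 1}, so that dx[i+1] and dy[i]
are the same object, the first leaves v as {1, 2, 3, 4} and the second
leaves it as {1, 2, 2, 2}.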

>What is to be done?

Ignore it.  (Unsatisfactory.)

Provide code-generating linkers.  (Good idea but hard to do.)

Provide `unsafe' optimisation levels.  (Generally a bad idea, but
easier to implement than code generation at link time, and typically
gives shorter compile times.)

Provide `#pragma's.  Some people claim that a pragma is not allowed to
declare such semantics as volatility or lack of aliasing; I disagree.
Short of the code-generating linker, with aliasing and register
allocation computed at `link' time, this seems to me the best solution.

	/*
	 * Double precision `ax + y' (d a*x plus y => daxpy).
	 */
	void
	daxpy(n, a, x, y)
		register int n;
		register double a, *x, *y;
	#pragma notaliased(x, y)
	/* or #pragma separate, or #pragma distinct, or... */
	{

		while (--n >= 0)
			*y++ += a * *x++;
	}

(to write daxpy in C-idiom-ese).
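
For readers with a later compiler: C99 added the `restrict' qualifier,
which lets the programmer make essentially the promise that the
hypothetical pragma above would make.  A minimal sketch, assuming a C99
or later compiler; the name daxpy_restrict is made up for illustration,
and whether any given compiler actually unrolls or reorders the loop as
a result is not guaranteed.

```c
/*
 * Double precision `ax + y' with the no-overlap promise spelled
 * `restrict' (C99 and later).  The qualifier is the CALLER's promise
 * that x and y do not overlap during the call; passing overlapping
 * arrays is undefined behaviour, and the compiler does not check it.
 */
void daxpy_restrict(int n, double a,
                    const double *restrict x, double *restrict y)
{
    while (--n >= 0)
        *y++ += a * *x++;
}
```

As with the pragma, nothing stops a caller from lying; the difference
is only that the promise is part of the function's declared type.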
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at mimsy.umd.edu	Path:	uunet!mimsy!chris


