[4bsd-f77 #31] F77 sometimes can't handle mixed REAL and REAL*8
4.2 BSD f77 bug reports
4bsd-f77 at utah-cs.UUCP
Fri Aug 24 08:59:48 AEST 1984
From: Donn Seeley <donn at utah-cs.arpa>
Subject: F77 sometimes can't handle mixed REAL and REAL*8 expressions
Index: f77/src/f1/order.c 4.2BSD
Description:
When generating code for certain complicated expressions which
mix REAL and REAL*8 values, f77 will bomb out because it
underestimates the number of registers that the expression will
require. I found this bug in a long-lost report from Kirk
Thege at UC Berkeley.
Repeat-By:
Clip out the following program (based on a program by Kirk
Thege) and try to compile it. It doesn't matter whether the
optimizer is on or off, but the examples of code below assume
that it is on:
----------------------------------------------------------------
program xmpl
integer i
real a, b, c, d
real*8 x(3), y, z
i = 1
a = 1.5
b = 2.0
c = 2.5
d = 3.0
x(1) = 3.5
y = 4.0
z = (a + 1.0) * ((x(i) / b) + ((d / y) * c))
print *, z
stop
end
----------------------------------------------------------------
When you try to compile, the compiler will print something like
this:
----------------------------------------------------------------
xmpl.f:
MAIN xmpl:
xmpl.f, line 15: compiler error: no table entry for op REG
compiler error.
----------------------------------------------------------------
Fix:
The problem is that the code generator has painted itself into
a corner. If you run the code generator manually, you can see
what it's trying to do (code has been prettied up):
----------------------------------------------------------------
addf3 {0x4080},{a},r0
movl {i},r1
cvtfd {b},r2
divd3 r2,&{x}-8[r1],r2
cvtfd {d},r4
divd2 {y},r4
<bombs>
----------------------------------------------------------------
When it crashes, the code generator has just finished emitting
the code for '(d / y)', and now needs to multiply the value of
'c' into r4. Unfortunately, 'c' is a REAL and the expression
is of type REAL*8, so 'c' needs to be converted; but there are
no more registers available for the conversion, since we have
used up r0-r5 to hold temporary results of subexpressions. The
code generator frantically attempts to rewrite the expression
tree, and when it can't find anything appropriate it finally
runs over one of its many 'never happens' bugs and croaks.
Why didn't the code generator arrange to compute the value of
one or more subexpressions on the stack? The problem is that
the Sethi-Ullman algorithm which the code generator uses to
estimate its needs for registers is not getting the right
information; it is led to believe that it can get away with
fewer registers than are really required. Looking at the
intermediate code, we see the following (more prettied up
and ASCIIfied output):
----------------------------------------------------------------
oreg (double) {z}
oreg (float) {a}
name (float) L21
+ (float)
reg (double *) r11
int (long) &{x}-8
+ (double *)
oreg (long) {i}
int (long) 3
<< (long)
+ (double *)
* (indir) (double)
oreg (float) {b}
/ (double)
oreg (float) {d}
oreg (double) {y}
/ (double)
oreg (float) {c}
* (double)
+ (double)
* (double)
= (double)
----------------------------------------------------------------
The parser pass of the compiler does not generate conversion
operations from REAL to REAL*8! There are a few lines in
fixexpr() in f77/src/f77pass1/expr.c which show that this is
intentional. Apparently the code generator can handle certain
conversions of REAL to REAL*8 without recourse to explicit
conversion operators, since there are entries in the code table
for these contingencies. However the Sethi-Ullman algorithm
doesn't know about these implicit conversions and as a result
it thinks that a REAL operand requires the same space
regardless of whether it needs to be converted to REAL*8.
There are two solutions -- either force conversion operators to
be emitted or fix the Sethi-Ullman algorithm in the code
generator so that it knows about implicit conversions. I will
present a fix for the code generator, since both sets of
changes do the job, and the code generator fixes produce better
code.
There seems to be a bug in the code generator which the parser
pass is taking advantage of -- if an implicit REAL to REAL*8
conversion occurs to the left operand of an expression, then
the Sethi-Ullman algorithm allocates fewer registers for this
expression than for the corresponding expression with an
explicit conversion operator forcing the left operand to be
type REAL*8. There are special code table entries for REAL*8
expressions with REAL operands which can conserve registers.
The problem crops up because registers are really only
conserved if the REAL operand is on the left, where it can be
converted into REAL*8 in situ. For operands on the right an
extra register pair is required for a conversion. The obvious
change to the code generator is to force the Sethi-Ullman
algorithm to pretend that there is an explicit conversion
operation if an implicit conversion must apply to the right
operand but not the left. (A somewhat cleaner but more
extensive change would be to recognize explicit conversions and
handle them this way anyway...)
The changes are in the routine sucomp() in order.c:
----------------------------------------------------------------
*** /tmp/,RCSt1001501 Tue Jul 31 17:05:42 1984
--- order.c Tue Jul 31 15:44:47 1984
***************
*** 75,80
number, or local equivalent */
register o, ty, sul, sur, r;
o = p->in.op;
ty = optype( o );
--- 75,82 -----
number, or local equivalent */
register o, ty, sul, sur, r;
+ register NODE *temp;
+ int szr;
o = p->in.op;
ty = optype( o );
***************
*** 117,122
sul = p->in.left->in.su;
sur = p->in.right->in.su;
if( o == ASSIGN ){
/* computed by doing right, then left (if not in mem), then doing it */
--- 119,130 -----
sul = p->in.left->in.su;
sur = p->in.right->in.su;
+ szr = szty( p->in.right->in.type );
+ if( szty( p->in.type ) > szr && szr >= 1 ) {
+ /* implicit conversion in rhs */
+ szr = szty( p->in.type );
+ sur = max( szr, sur );
+ }
if( o == ASSIGN ){
/* computed by doing right, then left (if not in mem), then doing it */
***************
*** 154,159
p->in.su = max( max(sul,sur), 1);
return;
case PLUS:
case OR:
case ER:
--- 162,168 -----
p->in.su = max( max(sul,sur), 1);
return;
+ case MUL:
case PLUS:
case OR:
case ER:
***************
*** 159,165
case ER:
/* commutative ops; put harder on left */
if( p->in.right->in.su > p->in.left->in.su && !istnode(p->in.left) ){
- register NODE *temp;
temp = p->in.left;
p->in.left = p->in.right;
p->in.right = temp;
--- 168,173 -----
case ER:
/* commutative ops; put harder on left */
if( p->in.right->in.su > p->in.left->in.su && !istnode(p->in.left) ){
temp = p->in.left;
p->in.left = p->in.right;
p->in.right = temp;
***************
*** 163,168
temp = p->in.left;
p->in.left = p->in.right;
p->in.right = temp;
}
break;
}
--- 171,184 -----
temp = p->in.left;
p->in.left = p->in.right;
p->in.right = temp;
+ sul = p->in.left->in.su;
+ sur = p->in.right->in.su;
+ szr = szty( p->in.right->in.type );
+ if( szty( p->in.type ) > szr && szr >= 1 ) {
+ /* implicit conversion in rhs */
+ szr = szty( p->in.type );
+ sur = max( szr, sur );
+ }
}
break;
}
***************
*** 168,174
}
/* binary op, computed by left, then right, then do op */
! p->in.su = max(sul,szty(p->in.right->in.type)+sur);
/*
if( o==MUL||o==DIV||o==MOD) p->in.su = max(p->in.su,fregs);
*/
--- 184,190 -----
}
/* binary op, computed by left, then right, then do op */
! p->in.su = max(sul,szr+sur);
/*
if( o==MUL||o==DIV||o==MOD) p->in.su = max(p->in.su,fregs);
*/
----------------------------------------------------------------
After installing these fixes, the code generator succeeds in
processing the output of the parser pass, producing this:
----------------------------------------------------------------
movl {i},r0
cvtfd {b},r2
divd3 r2,&{x}-8[r0],r2
cvtfd {d},r0
divd2 {y},r0
cvtfd {c},r4
muld2 r0,r4
addd2 r4,r2
addf3 {0x4080},{a},r0
cvtfd r0,r0
muld3 r2,r0,{z}
----------------------------------------------------------------
Notice that the new code does not require stack temporaries to
store subexpression results... This is because the algorithm
is now smart enough to postpone evaluating the subexpression
(a + 1.0), which is a result of adding '*' to the list of
commutative operators. Without the latter change, correct code
is still produced but a stack temporary is required (I won't
present any more gory details).
This is a pretty sensitive change, I hope it doesn't have any
nasty side effects...
Donn Seeley University of Utah CS Dept donn at utah-cs.arpa
40 46' 6"N 111 50' 34"W (801) 581-5668 decvax!utah-cs!donn
More information about the Comp.bugs.4bsd.ucb-fixes
mailing list