[4bsd-f77 #41] F77 bombs if it tries to perform code motion on strings
4.2 BSD f77 bug reports
4bsd-f77 at utah-cs.UUCP
Tue Sep 4 13:04:50 AEST 1984
From: Donn Seeley <donn at utah-cs.arpa>
Subject: F77 bombs if it tries to perform code motion on strings
Index: usr.bin/f77/src/f77pass1/optloop.c 4.2BSD
Description:
F77 tries to move CHARACTER expressions whose operands do not
change inside a loop out of the loop. Unfortunately it can't
create variable-length temporaries, so the compiler bombs. At
the same time f77 misses an obvious optimization: it fails to
recognize that substrings of the form 'c(i:i)' have constant
length 1.
This bug was reported by Mike Brown at NOAO.
Repeat-By:
Try to compile the following program (from Mike Brown) with
the optimizer enabled:
----------------------------------------------------------------
character numstf*12, iarray*12
iarray = ' '
numstf = '0123456789+-'
c
ia = 1
i = 1
do 10 i = 1, 12
if (iarray(i:i) .eq. numstf(ia:ia)) go to 20
10 continue
20 continue
end
----------------------------------------------------------------
The compiler will complain:
----------------------------------------------------------------
brown/ch.f:
MAIN:
Error on line 12 of brown/ch.f: adjustable length
Termination code 138
----------------------------------------------------------------
A core dump of pass 1 will be left behind.
Fix:
The problem is that the loop optimization routine finds a
CHARACTER expression whose length is variable but doesn't
change over the course of the loop, and tries to move the
expression out of the loop. Alas, the stack temporary allocator
can't make temporaries of variable length, hence the error
message. The code that calls the allocator doesn't expect any
problem, and this mistaken assumption causes the core dump a
little later on.
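The failure mode just described can be modeled outside the
compiler. What follows is only a toy sketch, not f77's code: the
types strexpr and temp and the functions alloc_temp() and
try_hoist() are hypothetical names invented for illustration.
The point is that an allocator of fixed-size stack slots has to
refuse a variable length, and that its caller has to check for
the refusal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a CHARACTER expression. */
struct strexpr {
    int len_is_const;   /* nonzero when the length is known at compile time */
    int len;            /* meaningful only when len_is_const is nonzero */
};

/* Hypothetical stack temporary: a fixed-size slot in the frame. */
struct temp {
    int offset;         /* negative offset from the frame pointer */
    int len;            /* size of the slot in bytes */
};

#define MAXTEMPS 16
static struct temp pool[MAXTEMPS];
static int used;        /* slots handed out so far */
static int frame;       /* bytes of frame space consumed so far */

/* Can only reserve a fixed number of bytes, so it refuses any
   expression whose length is not a constant. */
struct temp *alloc_temp(const struct strexpr *e)
{
    assert(e != NULL);
    if (!e->len_is_const || used == MAXTEMPS)
        return NULL;
    frame += e->len;
    pool[used].offset = -frame;
    pool[used].len = e->len;
    return &pool[used++];
}

/* A caller that checks the result: returns 1 and sets *out when the
   expression got a temporary, 0 when it must stay inside the loop. */
int try_hoist(const struct strexpr *e, struct temp **out)
{
    struct temp *t = alloc_temp(e);
    if (t == NULL)
        return 0;
    *out = t;
    return 1;
}
```

In this model a caller that skipped the NULL test and went on to
use the slot would fault a little later, which is the shape of
the crash reported here.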
The straightforward thing to do is to insist that the compiler
NOT try to move variable length strings out of loops. This is
a simple change to worthcost() in optloop.c:
----------------------------------------------------------------
*** /tmp/,RCSt1029138 Mon Aug 20 19:03:21 1984
--- optloop.c Sun Aug 5 17:05:29 1984
***************
*** 645,650
return NO;
case TADDR:
if ((memoffset = p->addrblock.memoffset) && ! ISCONST(memoffset))
return YES;
else if ((vleng = p->addrblock.vleng) && ! ISCONST(vleng))
--- 649,656 -----
return NO;
case TADDR:
+ if ((vleng = p->addrblock.vleng) && ! ISCONST(vleng))
+ return NO; /* Can't make variable length temporaries */
if ((memoffset = p->addrblock.memoffset) && ! ISCONST(memoffset))
return YES;
else
***************
*** 646,653
case TADDR:
if ((memoffset = p->addrblock.memoffset) && ! ISCONST(memoffset))
return YES;
- else if ((vleng = p->addrblock.vleng) && ! ISCONST(vleng))
- return YES;
else
return NO;
--- 652,657 -----
if ((vleng = p->addrblock.vleng) && ! ISCONST(vleng))
return NO; /* Can't make variable length temporaries */
if ((memoffset = p->addrblock.memoffset) && ! ISCONST(memoffset))
return YES;
else
return NO;
----------------------------------------------------------------
This solves the problem of the compiler crashing, but look at
the code that now gets generated for the loop:
----------------------------------------------------------------
movl $1,{ia}
movl $1,{i}
movl {ia},r10
movl {i},r9
movl $1,r9
L17:
subl3 $1,r10,r0
subl3 r0,r10,-(sp)
subl3 $1,r9,r0
subl3 r0,r9,-(sp)
movab {numstf}+-1,r0
addl3 r10,r0,-(sp)
movab {iarray}+-1,r0
addl3 r9,r0,-(sp)
calls $4,_s_cmp
tstl r0
jeql L2000000
acbl $12,$1,r9,L17
L2000000:
movl r9,{i}
ret
----------------------------------------------------------------
The function s_cmp() is being called on every iteration to
compare a pair of one-byte substrings. Notice that the length of
each substring is being computed explicitly with an expression
like 'i - (i-1)'. Now
we all know that strings in Fortran are ridiculous, but this
takes the cake. The compiler knows that it can use simple byte
instructions on strings of length one; why can't it figure out
that 'iarray(i:i)' is a substring of length 1 and make an
efficient loop with byte instructions?
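To make the length arithmetic concrete, here is a minimal
standalone sketch, assuming a made-up two-kind expression node.
The names node, K_CONST, K_VAR and substr_const_len() are
hypothetical and are not taken from the f77 sources. It computes
the length of s(first:last) as last - (first - 1) when both
bounds are constants, and answers 1 without any arithmetic when
both bounds are the same variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature expression node; not f77's real structure. */
enum kind { K_CONST, K_VAR };

struct node {
    enum kind kind;
    int value;      /* used when kind == K_CONST */
    int sym_id;     /* used when kind == K_VAR: identifies the variable */
};

/* Returns 1 and stores the length of s(first:last) in *len when that
   length is known at compile time; returns 0 otherwise. */
int substr_const_len(const struct node *first, const struct node *last,
                     int *len)
{
    assert(first != NULL && last != NULL && len != NULL);

    /* Same variable on both sides, as in c(i:i): the length is 1. */
    if (first->kind == K_VAR && last->kind == K_VAR
        && first->sym_id == last->sym_id) {
        *len = 1;
        return 1;
    }

    /* Both bounds constant: fold last - (first - 1). */
    if (first->kind == K_CONST && last->kind == K_CONST) {
        *len = last->value - (first->value - 1);
        return 1;
    }

    /* Anything else has to be computed at run time. */
    return 0;
}
```

Comparing variable identities only catches bounds that are the
very same variable; bounds that are equal but distinct
expressions still fall through to the run-time case.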
All it takes is a little test in the function mklhs() in expr.c
to notice that the two operands of a substring operation are
identical variables. (It would be nice if we could notice
identical expressions, but that's considerably more
complicated.) Here is the change:
----------------------------------------------------------------
*** /tmp/,RCSt1029315 Mon Aug 20 19:33:34 1984
--- expr.c Sun Aug 5 23:06:34 1984
***************
*** 1122,1129
if(p->lcharp == NULL)
p->lcharp = (expptr) cpexpr(s->vleng);
if(p->fcharp)
! s->vleng = mkexpr(OPMINUS, p->lcharp,
! mkexpr(OPMINUS, p->fcharp, ICON(1) ));
else {
frexpr(s->vleng);
s->vleng = p->lcharp;
--- 1137,1151 -----
if(p->lcharp == NULL)
p->lcharp = (expptr) cpexpr(s->vleng);
if(p->fcharp)
! {
! if(p->fcharp->tag == TPRIM && p->lcharp->tag == TPRIM
! && p->fcharp->primblock.namep == p->lcharp->primblock.namep)
! /* A trivial optimization -- upper == lower */
! s->vleng = ICON(1);
! else
! s->vleng = mkexpr(OPMINUS, p->lcharp,
! mkexpr(OPMINUS, p->fcharp, ICON(1) ));
! }
else {
frexpr(s->vleng);
s->vleng = p->lcharp;
----------------------------------------------------------------
Now the code for the loop becomes:
----------------------------------------------------------------
movl $1,{ia}
movl $1,{i}
movab {numstf}+-1,r0
movl {ia},r1
movb (r0)[r1],-1(fp)
movl {i},r10
movl $1,r10
L17:
cmpb {iarray}+-1[r10],-1(fp)
jeql L2000000
aobleq $12,r10,L17
L2000000:
movl r10,{i}
ret
----------------------------------------------------------------
Since the string length is now a constant, the compiler can
also perform the code motion that it was prevented from
attempting by the fix that was installed above... The loop is
now nice and compact, free of expensive subroutine calls.
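In C terms, the difference between the two generated loops is
roughly the following. This is an illustrative sketch only:
padded_cmp() is a hypothetical stand-in written for this example
(it is not the library's s_cmp()), and find_slow() and
find_fast() are made-up names. Both functions return the value
that 'i' holds after the loop in the test program.

```c
#include <assert.h>

/* Hypothetical blank-padded string comparison, standing in for a
   general-purpose run-time routine. Returns 0 when the strings are
   equal after padding the shorter one with blanks. */
int padded_cmp(const char *a, const char *b, long la, long lb)
{
    long i, n = la < lb ? la : lb;

    for (i = 0; i < n; i++)
        if (a[i] != b[i])
            return (unsigned char)a[i] - (unsigned char)b[i];
    for (; i < la; i++)
        if (a[i] != ' ')
            return (unsigned char)a[i] - ' ';
    for (; i < lb; i++)
        if (b[i] != ' ')
            return ' ' - (unsigned char)b[i];
    return 0;
}

/* Before: lengths recomputed and a subroutine called every time. */
int find_slow(const char *iarray, const char *numstf, int ia, int n)
{
    int i;

    for (i = 1; i <= n; i++) {
        long la = i - (i - 1);      /* always 1, but computed anyway */
        long lb = ia - (ia - 1);
        if (padded_cmp(iarray + i - 1, numstf + ia - 1, la, lb) == 0)
            break;
    }
    return i;
}

/* After: the invariant byte is fetched once, outside the loop, and
   each iteration is a single byte comparison. */
int find_fast(const char *iarray, const char *numstf, int ia, int n)
{
    char key = numstf[ia - 1];      /* moved out of the loop */
    int i;

    for (i = 1; i <= n; i++)
        if (iarray[i - 1] == key)
            break;
    return i;
}
```

With the blank 'iarray' of the test program neither version
finds a match, so both return 13, the value 'i' has when the DO
loop runs to completion.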
Donn Seeley University of Utah CS Dept donn at utah-cs.arpa
40 46' 6"N 111 50' 34"W (801) 581-5668 decvax!utah-cs!donn