/lib/c2 optimizes calls instructions

Hans Albertsson hans at log-hb.UUCP
Sun Aug 25 02:31:51 AEST 1985


In article <5207 at elsie.UUCP> ado at elsie.UUCP (Arthur David Olson) writes:
> .....
>which says that the function "one", with its
>	L33:clrl	r10
>	calls	r10,_subr
>	sobgtr	r11,L33
>takes more time than function "two", with its
>	L46:calls	$0,_subr
>	clrl	r10
>	sobgtr	r11,L46
>loop.
>
>Comments?

You have changed the alignment. The "_subr" operand of calls is at a
non-optimum address in "one", such as an odd word address or maybe even
worse, assuming it is at an optimum address in "two". The VAX architecture
allows any alignment for any part of any instruction, but the bus is fixed
in both width and memory access alignment, even for cache accesses. This
may sometimes severely penalise seemingly optimum programs.
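
As a rough illustration of the alignment argument above (a sketch only, not
a model of any particular VAX implementation): if memory is read in aligned
blocks of fixed width, the number of blocks touched by an operand depends on
where it starts. The 4-byte block width, the 4-byte operand length, the
addresses and the helper name aligned_fetches are all assumptions made for
the example, not figures from the original measurements.

```python
def aligned_fetches(start, length, bus_width=4):
    """Count the aligned bus_width-byte blocks touched by the
    byte range [start, start + length).

    bus_width=4 is an assumed fetch width, chosen for illustration.
    """
    if length <= 0:
        return 0
    first_block = start // bus_width
    last_block = (start + length - 1) // bus_width
    return last_block - first_block + 1


# Hypothetical 4-byte operand field placed at four consecutive addresses.
for addr in range(0x100, 0x104):
    print(hex(addr), aligned_fetches(addr, 4))
```

Only the first of the four placements fits in a single aligned block; the
other three straddle a boundary and need two. Moving a two-byte instruction
such as clrl in front of the calls is enough to shift its operand from one
case to the other.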

Furthermore, the difference between 20 and 18 is very small compared with
some such effects I have seen. I remember once when completely removing an
inactive ( always false ) IF statement ( some 8 out of 30 lines, I seem to
remember ) DOUBLED both user and system times..... That felt VERY
humiliating, I can assure you. The code became smaller, consistent with the
reduction in source program size, but... We could find NO other
explanation.  ( The language was a very early version of TeleSOFT ADA. )
This would also seem to completely invalidate stuff like Dhrystone
benchmarks...

----------
This is the FIRST time ever I feel the need to point out that any
opinions expressed above are my own, and do not represent an official
TeleLOGIC opinion. In fact, some people here are likely to disagree
violently. With anything I say, on any subject...
-- 
Hans Albertsson, USENET/uucp: {decvax,philabs}!mcvax!enea!log-hb!hans
Real World:  TeleLOGIC AB, Box 1001, S-14901 Nynashamn,SWEDEN


