function calls

Tim Olson tim at nucleus.amd.com
Fri Mar 16 02:15:21 AEST 1990


In article <14268 at lambda.UUCP> jlg at lambda.UUCP (Jim Giles) writes:
| Most machines implement call/return with single instructions.  However, this
| tends to be the tip of the iceberg for procedure call overhead.  The interface
| _MUST_ do the following:
| 
| 1) Save/restore all register values that are still 'live' values for the
|    caller.
| 
| 2) Test the stack to make sure there is enough room for the callee's local
|    variables (and call the memory manager for a new stack frame if there
|    isn't room).
| 
| 3) Create (on the stack) a traceback entry so the debugger and/or the
|    postmortem analyzer can find the current thread through the call tree.
| 
| The problem is: _MOST_ of the procedure call
| overhead is concentrated in number (1)!  And, this overhead applies to all
| procedures - including 'leaf' routines.  Basically, 'leafness' has very
| little to do with procedure call overhead.

If you partition your registers into a set that is preserved across a
function call and a set that is killed, then leaf routines can run
entirely out of the latter set and never have to save any registers.  For
example, in the Am29000 calling convention, a function can allocate up
to 126 local registers, which are preserved across function calls, and
also has use of 24 "temporary" registers, which are not.  Most leaf
routines can run entirely out of these 24 temps.  The entire overhead
is then simply a call instruction and a return instruction (2 cycles if
the delay slots are filled).
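
To make the partition argument concrete, here is a rough sketch in C of a
toy cost model.  It is not Am29000 code and not how any real compiler
allocates registers; the function name preserved_regs_needed and its
parameters are made up for illustration, and the only figure taken from
the text above is the count of 24 temporaries.

```c
#include <assert.h>

/* Number of "temporary" registers killed by a call (figure from the post). */
enum { NUM_TEMPS = 24 };

/*
 * Toy model (hypothetical): how many preserved registers a function
 * has to allocate under a preserved/killed register partition.
 *
 *   is_leaf            nonzero if the function makes no calls
 *   values_in_use      values the function keeps in registers
 *   live_across_calls  of those, how many must survive a call it makes
 */
static int preserved_regs_needed(int is_leaf, int values_in_use,
                                 int live_across_calls)
{
    assert(values_in_use >= 0);
    assert(live_across_calls >= 0 && live_across_calls <= values_in_use);

    if (is_leaf) {
        /* Nothing the leaf holds is ever clobbered by a callee, so it
           fills the temporaries first and only overflows past them. */
        return values_in_use > NUM_TEMPS ? values_in_use - NUM_TEMPS : 0;
    }

    /* A non-leaf must keep anything live across a call out of the
       killed set; the rest can still sit in temporaries. */
    return live_across_calls;
}
```

In this model a leaf routine with 10 values needs 0 preserved registers,
so its call overhead is just the call and the return; a leaf with 30
values needs 6; and a non-leaf with 4 values live across its calls needs
4, regardless of how many short-lived values it also uses.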

	-- Tim Olson
	Advanced Micro Devices
	(tim at amd.com)


