RISC versus CISC

Tue Feb 14 16:44:24 AEST 1989

In article <15686 at mimsy.UUCP> folta at tove.umd.edu (Wayne Folta) writes:
> ...I had assumed that a RISC machine had a much smaller
> and simpler instruction set.  That is, fewer instructions, each of which
> did simpler things than a CISC instruction set.  But how can this make a
> machine that much faster?  Is it because most CISC machines are
> microcoded?

Partially.

> This additional level of instruction execution could add
> overhead.  Is it because a smaller instruction set requires fewer bits to
> encode each instruction?  This would make fetches somewhat faster.

No -- RISC instructions are typically 32 bits long, and have a sparser
encoding than CISC instructions.

> It seems to me that to accomplish the same work, the RISC machine would
> just have to execute more instructions than the CISC machine.

Yes, that is true (most of the time).

> So where have I gone wrong?  How is it that--if indeed it
> is--RISC beats CISC by large margins?

Remember, instructions != cycles.  For a RISC machine to be faster than a
CISC machine, it simply must take fewer cycles to complete the overall
program, even if this means executing more instructions:

	Performance = 1/sec = cycles/sec * 1 / (cycles/inst * [total inst])

Thus, we can improve performance by raising the cycles/sec (increasing the
clock frequency; basically a semiconductor processing problem), decreasing
the total number of instructions executed (by making them complex: CISC),
or decreasing the number of cycles that an instruction requires (by making
them simple: RISC).  Note that these variables are not independent: it is
hard to make very complex instructions run in few cycles, and simpler
instructions usually mean executing more of them.
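
To make the trade-off concrete, here is a minimal sketch of the formula
above in C.  The function name and the figures in the comment are made up
for illustration; they are not measurements of any real machine.

```c
/* Run time in seconds, the reciprocal of "Performance" above:
 *   seconds = [total inst] * cycles/inst / (cycles/sec)
 *
 * Hypothetical figures, both machines clocked at 25 MHz:
 *   "CISC": exec_time_sec(1.0e6, 5.0, 25.0e6)  ->  about 0.200 s
 *   "RISC": exec_time_sec(1.3e6, 1.5, 25.0e6)  ->  about 0.078 s
 * The second machine executes 30% more instructions and still
 * finishes first, because each one takes far fewer cycles.
 */
double exec_time_sec(double total_inst, double cycles_per_inst,
                     double clock_hz)
{
    return total_inst * cycles_per_inst / clock_hz;
}
```

Holding the clock fixed in the example isolates the other two terms; in
practice all three move together.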

That is the view from the hardware side.  However, software (specifically
optimizing compilers) plays just as important a role in the RISC
performance picture.  One can make the argument that RISC & CISC look very
similar at the "micromachine" level, and that the fetching of a
microinstruction from the microcode on a CISC machine is somewhat like a
RISC machine fetching an instruction.  Now the CISC machine has hard-wired
microcode to execute from, while the RISC machine instructions are
"custom-tailored" by the compiler for the problem at hand.

For example, let's look at a typical loop:

	for (i=0; i<MAX; ++i)
		a[i] = 0;

A CISC machine may have a single instruction that performs the inner
statement, by using an indexed base+offset addressing mode.  However, each
time through the loop it must fetch the 32-bit base address of the array
"a", multiply the index variable i by the size of the elements of a, add
the two values together to form an address, then store 0 out to that
location.
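
Written out in C, the work hidden inside that one indexed store looks
roughly like the sketch below.  This is only an illustration of the
address arithmetic described above, not the microcode of any particular
machine; the function name is hypothetical and the elements are assumed
to be ints.

```c
#include <stddef.h>

/* Clear n ints, spelling out what an indexed base+offset store has to
 * do on every trip through the loop. */
void clear_indexed(int *a, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        char *base = (char *)a;               /* fetch base address of a */
        char *addr = base + i * sizeof(int);  /* scale index, add to base */
        *(int *)addr = 0;                     /* store 0 to that location */
    }
}
```

None of the first two steps depends on anything that changes between
iterations except i, which is what the optimizations described next take
advantage of.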

A highly-optimizing compiler can recognize that the base of the array
never changes (so it can be computed in a register before the loop begins
[loop-invariant code motion]), and that we can increment this address by
the size of each element, rather than incrementing the index by 1 and then
multiplying (or shifting) [strength reduction].  Now the loop consists of
a few simple instructions (store, add, compare, branch), which matches
nicely with what is provided by the RISC machine (and they are performed
quickly, because they are executed directly instead of being interpreted
by another level of microcode).
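
Expressed back in C, the loop after both transformations might look like
the following.  This is a hand-written sketch of what such a compiler
could produce, assuming int elements; the function name is hypothetical
and real compiler output will differ in detail.

```c
#include <stddef.h>

/* Clear n ints with the address arithmetic hoisted and strength-reduced. */
void clear_strength_reduced(int *a, size_t n)
{
    int *p   = a;       /* base address computed once, before the loop */
    int *end = a + n;   /* loop bound hoisted out of the loop as well  */

    while (p != end) {  /* compare, branch                             */
        *p = 0;         /* store                                       */
        ++p;            /* add: advance by sizeof(int) bytes           */
    }
}
```

The multiply has disappeared entirely, and each statement left in the
loop body corresponds to roughly one simple instruction.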

	-- Tim Olson
	Advanced Micro Devices
	(tim at crackle.amd.com)