RISC vs CISC
Revision as of 23:49, 27 August 2011
In the early days of computing, you had a lump of silicon which performed a number of instructions.
As time progressed, more and more facilities were required, so more and more instructions were added. However, according to the 80/20 rule, 20% of the available instructions are likely to be used 80% of the time, with some instructions only used very rarely. Some of these little-used instructions are very complex, so creating them in silicon is an arduous task. Instead, the processor designer uses microcode.
To illustrate this, we should consider a modern CISC processor, such as the Intel Atom® in my eeePC. When the processor is asked to execute x86 instructions, it translates them into "micro-ops", which are effectively RISC instructions executed on a RISC core.
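The idea of translating a complex instruction into micro-ops can be sketched in a few lines of Python. This is purely illustrative: the `decode` function, the tuple format, and the `tmp` register are assumptions invented for this example, and it does not describe how the Atom or any other real processor actually decodes its instructions.

```python
# Hypothetical decoder: an instruction with a memory destination is split
# into register-only micro-ops. All names here are illustrative only.
def decode(instruction):
    """Split a CISC-style instruction into RISC-like micro-ops."""
    op, dest, src = instruction
    if dest.startswith("["):                  # destination is in memory
        addr = dest.strip("[]")
        return [
            ("load", "tmp", addr),            # tmp <- memory[addr]
            (op, "tmp", src),                 # tmp <- tmp op src
            ("store", addr, "tmp"),           # memory[addr] <- tmp
        ]
    return [(op, dest, src)]                  # already simple: one micro-op


micro_ops = decode(("add", "[ebx]", "eax"))
for uop in micro_ops:
    print(uop)
```

An instruction that already works only on registers, such as `("add", "ecx", "eax")`, passes through as a single micro-op.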
The CISC design principle was to create an instruction set that, as much as possible in hardware, facilitates programming. High level languages were not common and the majority of coding was done in assembler, so the idea was to make the processor able to do more and more.
While this may sound like a bizarre thing to say, we arrive at a situation where we can have processors that understand high level concepts, such as arrays and array boundaries.
The problem with CISC processors is that in order to represent this in actual hardware, it is accepted that there will be quirks and limitations. For example, the x86 processors have specific "general purpose" registers (EAX, EBX, ECX, EDX) in various guises (AL/AH, etc) that carry built-in requirements: the one-operand multiply instruction, for instance, always places its result in (E)AX, with any high half going to (E)DX. You don't have a choice.
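To see why this matters, here is a minimal Python sketch of the one-operand 32-bit multiply, which takes one factor from EAX and writes the 64-bit product to EDX:EAX. The function name `mul32` and the register dictionary are assumptions made for illustration; the sketch models only the effect on the registers and ignores the flags.

```python
MASK32 = 0xFFFFFFFF


def mul32(regs, operand):
    """Model of the one-operand 32-bit MUL: EDX:EAX <- EAX * operand."""
    product = (regs["eax"] & MASK32) * (operand & MASK32)
    regs["eax"] = product & MASK32          # low half always lands in EAX
    regs["edx"] = (product >> 32) & MASK32  # high half always overwrites EDX
    return regs


regs = {"eax": 0x10000, "ebx": 0x30000, "ecx": 7, "edx": 0x1234}
mul32(regs, regs["ebx"])
print({name: hex(value) for name, value in regs.items()})
```

Whatever was in EDX beforehand (0x1234 here) is lost, so the programmer has to save it elsewhere first if it is still needed.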
The RISC principle, on the other hand, takes the alternative approach. It is important to understand that RISC, Reduced Instruction Set Computer, means a set of reduced instructions, not a reduced set of instructions. What this means is that the RISC principle lies not in reducing the number of available instructions but in reducing their complexity. Take, for example, the instruction:
LDXORAD A, [B], C
which would load a word from the address pointed to by 'B', exclusive OR it with C, and then add it to the contents of A (the result being written to A). This fictional instruction is an example of CISC.
RISC, on the other hand, would use three simpler instructions:
LDR D, [B]    ; load word pointed to by B into D
EOR D, C, D   ; EOR D with C, result stored in D
ADD A, A, D   ; Add A and D, result ending up in A
We introduce an additional register, D, so B and C do not need to be corrupted. It is for this reason that RISC processors tend to have a fairly large number of registers, though this is not a rule.
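As a rough check that the two forms do the same work, the fictional instruction and its three-step equivalent can be modelled in Python. Everything here is an assumption for illustration: registers are a dictionary, memory is a dictionary keyed by address, and results wrap at 32 bits.

```python
def ldxorad(regs, mem):
    """Fictional CISC instruction: A <- A + (mem[B] XOR C)."""
    regs["A"] = (regs["A"] + (mem[regs["B"]] ^ regs["C"])) & 0xFFFFFFFF


def risc_sequence(regs, mem):
    """The same work as three simple steps using scratch register D."""
    regs["D"] = mem[regs["B"]]                        # LDR D, [B]
    regs["D"] = regs["D"] ^ regs["C"]                 # EOR D, C, D
    regs["A"] = (regs["A"] + regs["D"]) & 0xFFFFFFFF  # ADD A, A, D


mem = {0x8000: 0b1100}
cisc = {"A": 1, "B": 0x8000, "C": 0b1010, "D": 0}
risc = dict(cisc)              # identical starting state for both versions

ldxorad(cisc, mem)
risc_sequence(risc, mem)
print(cisc["A"], risc["A"])    # prints "7 7"
```

Both versions leave the same value in A, and B and C are untouched in each case; the only visible difference is that the RISC sequence overwrites D.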
One thing RISC does offer, though, is register independence. The ARM defines at minimum R15 as the program counter, and R14 as the link register (although, after saving the contents of R14 you can use this register as you wish). R0 to R13 can be used in any way you choose, although the Operating System usually defines R13 to be used as a stack pointer. You can, if you don't require a stack, use R13 for your own purposes. APCS applies firmer rules and assigns more functions to registers (such as Stack Limit, Frame Pointer, etc). However, none of these - with the exception of R15 and sometimes R14 - is a constraint applied by the processor. You do not need to worry about saving your accumulator in long instructions, you simply make good use of the available registers.
The x86 offers registers A, B, C, and D, which are not just "starting from A" but have meanings - Accumulator, Base, Count, and Data - which give a clue as to their intended functions, as do the segment registers (in a 16 bit world) for Code, Data, Extra, and Stack; then there are SI and DI, which are indexing registers. The x86 is somewhat less flexible than the ARM in its register use. That, coupled with fewer registers...
The RISC principle is, therefore, to reduce the instruction set to building blocks, as we know that any complex instruction can be built from a sequence of simpler instructions. But not only that, we know also that in each case we can customise our code for the best possible performance, rather than following the quirks of a complicated instruction set.
RISC suffers from two problems. If you place identically clocked RISC and CISC processors side by side, the CISC will usually win. This is because the CISC processor is able to perform "more" per instruction than the RISC, and even multi-cycle instructions are liable to execute more quickly than the equivalent series of RISC instructions. In addition, code density is lower, so an application in RISC is likely to consume more memory than the equivalent in CISC.
These RISC problems are far outweighed by the benefits of RISC. RISC is often a lot friendlier to program. You have to think at a lower level exactly what you want to achieve, but once you have done so, the processor aids, not hinders. A loop within a loop within a loop is a simple thing to achieve. Doing likewise on a CISC may involve PUSH and POP of the loop counter register, or some other heinous method of shifting things around to please the processor.
Then we get into silicon complexity. RISC processors tend to be quite simple designs - indeed the ARM is not microcoded, it is bare silicon all the way. This means it is cheaper to produce and can be more efficient. The ARM processors used in the majority of Android mobile phones can extract a healthy life from a tiny battery, and even when playing video they can remain cool within a hermetically sealed container (such as the Motorola DEFY which is, to a degree, waterproof!). Compare the x86 family, which are physically larger, consume rather more power, and create sufficient heat that even an Atom-powered netbook requires a fan running on tick-over to keep the machine cool. I can feel the keyboard of my eeePC slightly warm under my hand, and all I'm doing is writing this Wiki page.
Arguably RISC has won the war, for CISC processors use microcoded RISC instructions internally. However, in the world as we see it, most processors have a place - the 2GHz+ multi-core x86 kit for heavy number crunching, the long-data-word ST20 processors which lend themselves to digital video applications, the Atmels and the PICs for itty-bitty embedded designs, and the ARM carving itself a niche in the ultra-low-power market. The ARM lends itself to clean, concise code which is simpler to work with and understand.