1. Field of the Invention
The invention relates to reduced instruction set computer (RISC) processor architecture. More particularly, the invention relates to a processor architecture designed to substantially improve processing speed in real-time, I/O-intensive applications.
2. State of the Art
One of the many known methods for increasing throughput in a microprocessor is known as "pipeline processing". Pipeline processing involves overlapping the execution of several instructions by temporally offsetting each subsequent instruction. In order to implement pipeline processing effectively, it is preferable that each instruction in the processor's instruction set utilize the same number of clock cycles. For example, in a case where each instruction utilizes exactly n clock cycles, a pipeline of n instructions can be created with each subsequent instruction being offset from the previous instruction by one clock cycle. In such a system of pipeline processing, the processor effectively completes one full instruction each clock cycle. One of the achievements of RISC processor design is the definition of an instruction set in which the execution of all, or most, instructions requires a uniform number of cycles. A discussion of the general background of RISC can be found in "MIPS R-2000 RISC Architecture" by G. Kane (Prentice Hall, 1987), the complete disclosure of which is hereby incorporated by reference herein.
A popular prior art RISC architecture is the MIPS I Instruction Set Architecture (ISA). MIPS is a simple but high performance RISC architecture which has attracted enormous third-party support. The MIPS I and MIPS II ISAs are well documented in "MIPS RISC Architecture" by G. Kane and J. Heinrich (Prentice Hall, 1992), the complete disclosure of which is hereby incorporated by reference herein.
The MIPS R-2000 processor executes instructions in five portions (one per clock cycle) and the instruction pipeline is a five stage pipeline, one stage per instruction portion. The five instruction portions are instruction fetch (IF), read operands from registers while decoding the instruction (RD), perform operation on instruction operands (ALU), access memory (MEM), and write back results to a register (WB). Prior art FIG. 1 illustrates the MIPS pipeline with five instructions offset from each other by one clock cycle. As shown in FIG. 1, during the cycle in which the first instruction is writing back results to a register (WB), the second instruction is accessing memory (MEM), the third instruction is performing an operation on instruction operands (ALU), the fourth instruction is reading operands from registers while decoding the instruction (RD), and the fifth instruction is being fetched (IF) from instruction RAM. Additional background on the MIPS pipeline may be found in "Computer Organization and Design: the Hardware/Software Interface", by D. A. Patterson and J. L. Hennessy (Morgan Kaufmann, 1994), the complete disclosure of which is hereby incorporated by reference herein.
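By way of illustration only, the stage occupancy described above can be sketched in a few lines of Python. The sketch assumes an ideal pipeline with no stalls; the function name `stage_of` and the 1-based numbering of instructions and clock cycles are assumptions made for this example and do not appear in the cited references.

```python
# Illustrative sketch only: ideal five-stage pipeline, no stalls.
# Stage names follow the text; stage_of and the 1-based numbering of
# instructions and clock cycles are assumptions made for this example.
STAGES = ["IF", "RD", "ALU", "MEM", "WB"]

def stage_of(instruction, cycle):
    """Return the stage that `instruction` occupies during `cycle`,
    or None if it has not yet entered or has already left the pipeline.
    Instruction k enters IF in cycle k (one-cycle offset per instruction)."""
    index = cycle - instruction
    if 0 <= index < len(STAGES):
        return STAGES[index]
    return None

# Cycle 5 is the first cycle in which all five stages are occupied.
snapshot = {k: stage_of(k, 5) for k in range(1, 6)}
print(snapshot)
```

Under these assumptions, the snapshot for cycle 5 places the first instruction in WB and the fifth in IF, which corresponds to the cycle described with reference to FIG. 1.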
The instruction pipeline in RISC architecture achieves a certain amount of operational "parallelism". In the example shown in FIG. 1, once the pipeline is full, five instructions are executed in parallel. Although each instruction still requires five clock cycles, a new instruction can be added to the pipeline each clock cycle to keep the pipeline full. So long as the pipeline is full, the RISC processor may continue to process instructions at the effective rate of one instruction per clock cycle, provided there are no stall cycles, NOP instructions, or aborted pipelines.
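The effective rate described above can be made concrete with a small calculation, offered as a sketch only. The function names `total_cycles` and `cycles_per_instruction` and the `lost_cycles` parameter are hypothetical; `lost_cycles` simply lumps together stall cycles, NOP slots, and cycles lost to aborted pipelines.

```python
# Illustrative sketch only: cycle counts for a simple pipeline model.
# total_cycles, cycles_per_instruction, and lost_cycles are assumed names.

def total_cycles(instructions, stages=5, lost_cycles=0):
    """Cycles needed to complete `instructions` in a `stages`-deep pipeline.
    The first instruction takes `stages` cycles; each later instruction
    completes one cycle after its predecessor. `lost_cycles` is added on
    top to account for stalls, NOP slots, and aborted pipelines."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1) + lost_cycles

def cycles_per_instruction(instructions, stages=5, lost_cycles=0):
    """Average cycles per completed instruction."""
    return total_cycles(instructions, stages, lost_cycles) / instructions

print(total_cycles(5))                                # 9
print(cycles_per_instruction(1000))                   # 1.004
print(cycles_per_instruction(1000, lost_cycles=200))  # 1.204
```

In this model the average approaches one cycle per instruction as the instruction count grows, and every lost cycle moves it away from that ideal.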
Those skilled in the art will appreciate that inherent latencies exist for load, jump, and branch instructions and that some instructions may require data which is not yet available. These conditions are referred to as processing interdependencies. One way to resolve interdependencies is to stall or delay the pipeline. Another way (utilized by the R-2000) is to insert NOP (no operation) instructions in the pipeline to account for latency between instructions. The insertion of NOP instructions is effected by the software assembler when a program is compiled. It will also be understood that exceptions (e.g., interrupts) interfere with the smooth flow of the pipeline. When an R-2000 detects an exception, for example, the instruction causing the exception is aborted and all instructions in the pipeline which have started execution are aborted. A jump to the designated exception handler occurs. After the exception is processed, the processor returns to the instruction which preceded the instruction which was executing when the exception occurred. Interrupt handling robs processor cycles and degrades system performance. If interrupt handling is not efficient, the performance advantages of pipeline processing may be lost.
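The NOP insertion mentioned above can be illustrated with a toy pass over an instruction list. This is a simplified sketch, not the behavior of any particular assembler: the tuple format (opcode, destination, sources), the name `insert_load_nops`, and the assumed one-instruction load latency are all hypothetical choices made for the example.

```python
# Illustrative sketch only: a toy pass that pads load-use hazards with NOPs.
# The tuple format (opcode, destination, sources) and the one-instruction
# load latency are assumptions for this example, not a real assembler's API.
NOP = ("nop", None, ())
LOAD_OPCODES = {"lw"}

def insert_load_nops(program):
    """Return a copy of `program` with a NOP inserted wherever an
    instruction reads a register loaded by the immediately preceding one."""
    padded = []
    for instr in program:
        _, _, sources = instr
        if padded:
            prev_op, prev_dest, _ = padded[-1]
            if prev_op in LOAD_OPCODES and prev_dest in sources:
                padded.append(NOP)
        padded.append(instr)
    return padded

program = [
    ("lw", "r1", ("r4",)),         # r1 <- memory[r4]
    ("add", "r2", ("r1", "r3")),   # reads r1 immediately after the load
    ("sub", "r5", ("r2", "r3")),   # no load involved, no padding needed
]
for instr in insert_load_nops(program):
    print(instr[0])
```

Each inserted NOP occupies a pipeline slot without doing useful work, which is why such padding counts against the one-instruction-per-cycle ideal.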
Most modern processors, including RISC processors, support multiple simultaneous processes and/or multithreaded processes. When running several different programs on a single processor (multiple simultaneous processes) or when running a multithreaded process, it is necessary for the processor (or operating system) to switch from one program or thread (context) to another. Context switching is often performed according to a priority schedule whereby some processes are given more processing time than others. Theoretically, context switching can improve system performance by switching to a new context whenever a process or thread is stalled waiting for an I/O device and by returning to the stalled process or thread when it is ready to run. In practice, however, context switching tends to prevent optimum system performance because extra processing cycles (128 cycles in the case of a MIPS processor) must be used to switch contexts and no process instructions are executed during the context switch. During a context switch, the contents of all immediate registers (also called general purpose registers, i.e., registers which are directly read from or written to by the ALU of the processor) which describe the state of the current process are saved to RAM before switching to another process. After the current state (context) is saved, the next context is loaded from RAM into the registers before the next process can be run. This non-productive processor activity (saving and restoring register contents) can adversely affect overall performance, particularly in a real time event driven system where context switches are largely governed by I/O activity.
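The cost of this non-productive activity can be estimated with simple arithmetic, sketched below. The 128-cycle switch cost is the figure given above; the function name `useful_fraction` and the example intervals between switches are assumptions chosen only to illustrate the trend.

```python
# Illustrative sketch only: fraction of cycles spent on process instructions.
# useful_fraction and the example intervals are assumptions; the 128-cycle
# switch cost is the figure quoted in the text for a MIPS processor.

def useful_fraction(useful_cycles_between_switches, switch_cost=128):
    """Fraction of all cycles that execute process instructions when each
    run of useful cycles is followed by one context switch."""
    total = useful_cycles_between_switches + switch_cost
    return useful_cycles_between_switches / total

for interval in (10000, 1000, 128):
    print(interval, round(useful_fraction(interval), 3))
```

Under these assumptions, a process that runs for only 128 cycles between switches spends half of all processor cycles saving and restoring registers, which suggests why frequent I/O-driven switching is costly.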
Even with a single-threaded program, context switching may occur often. For example, the MIPS R-2000 ISA has two operating modes: user mode and kernel mode. Each of these modes is a different context and the programmer may create several "user mode" contexts, each for a different thread. However, even with a single user mode context, context switching between the user mode context and the kernel context may occur frequently. According to the MIPS ISA, the CPU enters the kernel mode whenever an exception is detected and remains in kernel mode until a Restore From Exception (RFE) instruction is executed. Consequently, in an event driven application, frequent context switches can be expected regardless of the number of threads in user mode.
The relatively high speed of RISC processors makes them an ideal choice for telecommunications applications including SONET and ATM applications. Despite the power of RISC processors, however, the extremely high demands of SONET and ATM telecommunications tax the resources of RISC processors, particularly with regard to interrupt handling and context switching. It will be appreciated that telecommunications in general is almost entirely real time event driven and that the high volume, broadband communications provided via SONET and ATM are even more so.