An electronic processor, such as a microprocessor used in a personal computer or network device, contains registers that hold values upon which the processor's instructions act directly. To perform a simple integer addition instruction, for example, integers are loaded into each of two registers. The addition instruction operates upon the integers in the two registers and stores the resulting sum in a third register. Typically, the resulting sum is then saved in a memory location by another instruction. The three registers are then free to be used for a following instruction.
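By way of illustration only, the load, add, and store sequence described above can be modeled with a toy register machine. The register names (`r1`, `r2`, `r3`), the memory addresses, and the helper functions are assumptions made for this sketch and do not correspond to any particular instruction set.

```python
# Toy model of a register machine. Register names and addresses are
# illustrative assumptions, not taken from any real instruction set.
registers = {"r1": 0, "r2": 0, "r3": 0}
memory = {0x10: 7, 0x14: 5, 0x18: 0}


def load(reg, addr):
    """Copy a value from memory into a register."""
    registers[reg] = memory[addr]


def add(dst, src_a, src_b):
    """Add two registers and place the sum in a third register."""
    registers[dst] = registers[src_a] + registers[src_b]


def store(reg, addr):
    """Copy a register value out to a memory location."""
    memory[addr] = registers[reg]


load("r1", 0x10)       # first operand into a register
load("r2", 0x14)       # second operand into a register
add("r3", "r1", "r2")  # the addition acts on registers only
store("r3", 0x18)      # a separate instruction saves the sum to memory
```

After the store, all three registers may be overwritten by a following instruction without losing the result.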
Typical processors have several registers. In some designs in the art, the registers are designed for general-purpose use. In others, particular registers are designed for specific purposes, such as to accumulate the results of calculations or to hold pointers to memory addresses. The collection of all such registers is known in the art as a register file.
When one program or program thread finishes executing and another begins, or an interrupt service routine is invoked, a context change must take place. This change is known in the art as a context switch. When a context switch takes place, the last values in the registers may have to be saved for later use. Once such values are saved, the context of the new thread must be loaded. The context consists at least of the address in memory of the first instruction of the thread, which is loaded into a program counter, and the initial values of registers. After the context switch, the first instruction of the new thread, as indicated by the program counter, is fetched and executed, operating upon the new initial register values.
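The save and load steps of a context switch may be sketched as follows. This is a minimal model under stated assumptions: the `Context` and `Processor` classes, the four-register file, and the example addresses are hypothetical and serve only to show which state is saved and which is loaded.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Saved state of a thread: next instruction address and register values."""
    program_counter: int
    registers: list = field(default_factory=lambda: [0] * 4)


class Processor:
    """Hypothetical processor with a program counter and a small register file."""

    def __init__(self, num_registers=4):
        self.program_counter = 0
        self.registers = [0] * num_registers

    def save_context(self):
        # Copy the last register values so they can be used later.
        return Context(self.program_counter, list(self.registers))

    def load_context(self, ctx):
        # Load the new thread's first instruction address and initial registers.
        self.program_counter = ctx.program_counter
        self.registers = list(ctx.registers)


cpu = Processor()
cpu.program_counter = 0x120
cpu.registers = [1, 2, 3, 4]

saved = cpu.save_context()                       # save the old thread
new_thread = Context(0x400, [0, 0, 0, 9])
cpu.load_context(new_thread)                     # switch to the new thread
pc_after_switch = cpu.program_counter
regs_after_switch = list(cpu.registers)

cpu.load_context(saved)                          # later, resume the old thread
```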
For the purposes of the present disclosure it is important that some distinctions be made among the parts of a processor. Depending on the nature of a processor, there may be relatively few distinct parts, or there may be a larger number of parts. Some processors are highly dedicated to a single function (embedded microcontrollers, for example), and others have multiple functions, with parts such as fetch units, load/store units, and multiple functional units for executing instructions, such as integer units, floating point units, and branch prediction units. Also, the terminology in the art is loosely applied as to what precisely is a processor, a CPU, a microprocessor, and so on.
Regardless of the loosely applied terminology in the art and imprecise definitions, a clear distinction can be made for the purpose of the present descriptions. A distinct part of a processor of any sort executes instructions from an instruction stream. Other parts of a processor do not. For the purpose of the present description the part of any processor that executes instructions will be termed the instruction processor (IP).
In conventional processors in the prior art, a context switch is performed by a sequence of instructions executed by the IP, like any other program. Store instructions are executed by the IP to save register values to main memory or cache, and load instructions are executed by the IP to fetch new register values from main memory or cache. Since there are typically several register values to be stored and several more to be loaded, a number of instructions are typically required, consuming a relatively large number of IP cycles. In processors with instructions that load and store blocks of memory, the process can be accelerated, but significant IP overhead is still involved.
Instructions that load from and store to memory or cache typically take longer to execute than instructions that operate directly on register contents, so the time required for a context switch is even longer. For these two reasons, the number of instructions required and the latency of each, a context switch consumes a relatively large number of IP cycles. While the IP is occupied with a context switch, no useful calculations associated with the old or the new thread are executed. The context switch is thus pure overhead. Because context switching is a critical operation that must be completed before any other thread can run, no programs can be allowed to interrupt the context switch.
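The combined effect of instruction count and per-instruction latency can be estimated with simple arithmetic. The figures used below (32 registers, 3 cycles per memory access) are assumptions chosen for illustration and are not measurements of any particular processor.

```python
def software_switch_cycles(num_registers, store_cycles, load_cycles):
    """IP cycles for a context switch that executes one store (old value)
    and one load (new value) per register."""
    return num_registers * (store_cycles + load_cycles)


# Assumed figures, for illustration only.
NUM_REGISTERS = 32
STORE_CYCLES = 3   # assumed latency of one store to memory or cache
LOAD_CYCLES = 3    # assumed latency of one load from memory or cache

switch_cycles = software_switch_cycles(NUM_REGISTERS, STORE_CYCLES, LOAD_CYCLES)
print(switch_cycles)
```

Under these assumptions the IP spends 192 cycles per context switch during which no instruction of either thread is executed.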
Having the IP completely unavailable for an extended period has been a major obstacle in system design. For example, it can interfere with the timely servicing of interrupts or the response to real time inputs.
In many systems context switching occurs so frequently that a relatively large proportion of processing capability is wasted. For example, in a personal computer where interrupts are used to service I/O devices, a context switch may occur each time an interrupt service routine is invoked. The burden of context switching is particularly high in real time systems that are required to respond quickly to many inputs. The bottleneck of context switching has remained a long-standing problem in processor design that has not been adequately addressed in the prior art.
The present invention represents a solution to this long-standing problem, at a cost of very little added logic in the system design. In the present invention a new register transfer unit loads and stores register values independently of the IP and in parallel with the processing of normal instructions. No load and store instructions are executed by the IP for data transfer to and from register files.
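The benefit of moving register transfers out of the IP can be illustrated with an idealized timing comparison. This sketch is not a description of the register transfer unit itself. It assumes, purely for illustration, that the transfers overlap completely with useful IP work, and the function names and cycle counts are hypothetical.

```python
def conventional_elapsed(useful_cycles, transfer_cycles):
    """The IP executes the loads and stores itself, so the costs add."""
    return useful_cycles + transfer_cycles


def offloaded_elapsed(useful_cycles, transfer_cycles):
    """Idealized assumption: a separate unit moves register values while the
    IP keeps executing normal instructions, so the two overlap fully."""
    return max(useful_cycles, transfer_cycles)


# Assumed figures, for illustration only.
USEFUL_CYCLES = 500     # normal instructions executed by the IP
TRANSFER_CYCLES = 192   # cycles needed to move the register values

conventional = conventional_elapsed(USEFUL_CYCLES, TRANSFER_CYCLES)
offloaded = offloaded_elapsed(USEFUL_CYCLES, TRANSFER_CYCLES)
ip_cycles_saved = conventional - offloaded
```

In this idealized case the IP spends no cycles on register transfers, and the saving equals the full transfer time whenever the useful work is at least as long as the transfer.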