1. Field of the Invention
The invention relates to the field of computer processors and in particular to implementing context switching within such processors.
2. Related Art
In designing processors, there are a number of cost/performance tradeoffs. Higher performance often comes at the expense of increased interrupt overhead, longer interrupt latency, and slower context switches.
FIG. 1 shows a traditional computer. A CPU (central processing unit) 101 executes tasks, interacting with a memory 104 via a bus 103. The memory 104 provides an instruction stream to the CPU 101. Also attached to the bus 103 are peripherals, e.g. 102. When the CPU 101 receives an interrupt on line 105, it can communicate over the bus 103 with the peripherals, e.g. 102. The peripherals are typically i/o (input and/or output) devices. The interrupts are asynchronous and arise with frequencies from 500 Hz to 100 kHz, depending on the nature of the application. Real time applications, such as video compression and decompression, require interrupt rates towards the upper end of this range. Multimedia processing also requires frequent interrupts and context switching between processing of different types of media, such as video and audio. Each interrupt causes some performance degradation, known as interrupt overhead. In order to support such real time applications, the processor must guarantee some maximum time between assertion and handling of an interrupt, i.e. a maximum interrupt latency.
Modern processors typically implement multi-tasking. FIG. 2 shows a timing diagram of a multi-tasking environment. In this environment, task 1 is swapped out, after an interval of time, in favor of task 2. Task 2 is then swapped out, after a second interval of time, in favor of task 3. Task 3 is then swapped out, after a third interval of time, in favor of task 1 again. With each swap, the processor must perform a context switch. The primary reasons for swapping are: expiration of a time slice allocated to the current task; the current task being voluntarily blocked, e.g. seeking i/o; or an interrupt freeing a higher priority task, which will be discussed in more detail below.
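By way of illustration only, the swap order of FIG. 2 and the three reasons for swapping can be sketched in C as follows. The names swap_reason, NUM_TASKS, and next_task are hypothetical and are introduced solely for this sketch; tasks 1, 2, and 3 are numbered 0, 1, and 2, and a simple round-robin order is assumed.

```c
#include <assert.h>

/* The three reasons for a task swap named in the text above. */
enum swap_reason {
    SWAP_TIME_SLICE_EXPIRED,    /* time slice of the current task ran out */
    SWAP_VOLUNTARY_BLOCK,       /* current task blocked itself, e.g. seeking i/o */
    SWAP_HIGHER_PRIORITY_FREED  /* an interrupt freed a higher priority task */
};

/* Assumption: three tasks, as in FIG. 2, indexed 0..2 for tasks 1..3. */
#define NUM_TASKS 3

/* Return the index of the task that runs after `current`,
 * cycling 0 -> 1 -> 2 -> 0 in the manner of FIG. 2. */
int next_task(int current)
{
    assert(current >= 0 && current < NUM_TASKS);
    return (current + 1) % NUM_TASKS;
}
```

Each call to next_task corresponds to one swap and therefore to one context switch, whichever of the three reasons triggered it.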
During the context switch, the processor stores a data structure in memory 104 as illustrated in FIG. 3. This data structure includes a pointer 301, known as a "task handle" or "task i.d.". The pointer points to the location of the task record 302, which includes fields for the content of all registers in the CPU 101, the stack pointer 304, the frame pointer 305, and the program control and status word (PCSW) 306.
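A minimal sketch of such a task record, given for illustration only, might look as follows in C. The register count of 32, the 32-bit field widths, and the names cpu_state, task_record, task_handle, save_context, and restore_context are assumptions made for this sketch and are not taken from FIG. 3.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 32  /* assumed register count */

/* Hypothetical model of the live processor state. */
struct cpu_state {
    uint32_t regs[NUM_REGS];
    uint32_t sp;    /* stack pointer */
    uint32_t fp;    /* frame pointer */
    uint32_t pcsw;  /* program control and status word */
};

/* Task record (302): one field per item saved at a context switch. */
struct task_record {
    uint32_t regs[NUM_REGS];  /* content of all registers */
    uint32_t stack_pointer;   /* 304 */
    uint32_t frame_pointer;   /* 305 */
    uint32_t pcsw;            /* 306 */
};

/* Task handle (301): a pointer to the task record. */
typedef struct task_record *task_handle;

/* Copy the whole processor state into the task record. */
void save_context(task_handle h, const struct cpu_state *cpu)
{
    assert(h != NULL && cpu != NULL);
    memcpy(h->regs, cpu->regs, sizeof h->regs);
    h->stack_pointer = cpu->sp;
    h->frame_pointer = cpu->fp;
    h->pcsw = cpu->pcsw;
}

/* Copy the task record back into the processor state. */
void restore_context(struct cpu_state *cpu, const struct task_record *rec)
{
    assert(cpu != NULL && rec != NULL);
    memcpy(cpu->regs, rec->regs, sizeof cpu->regs);
    cpu->sp = rec->stack_pointer;
    cpu->fp = rec->frame_pointer;
    cpu->pcsw = rec->pcsw;
}
```

In this sketch every register is copied on every switch, regardless of whether the task uses it, so the cost of a switch grows with the number of registers.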
To increase processor performance, the designer can add registers or increase cache size. Historically, INTEL chips had 8 registers; newer RISC chips have 32. While further increasing the number of registers should theoretically improve performance, in practice performance can degrade because more data must be stored during a context switch. As a result, the context switch takes longer. Where there are frequent context switches, as in multimedia processing, performance is seriously impacted.
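The growth in context switch cost with register count can be estimated with the following sketch. It assumes, purely for illustration, that each register and each of the three additional fields (stack pointer, frame pointer, and PCSW) occupies one word in the task record; the names EXTRA_FIELDS and words_saved_per_second are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stack pointer, frame pointer, and PCSW take one word each. */
#define EXTRA_FIELDS 3u

/* Words written to memory per second for saving task records alone,
 * given a register count and a context switch rate. */
uint64_t words_saved_per_second(uint32_t num_regs, uint32_t switches_per_sec)
{
    assert(num_regs > 0);
    return (uint64_t)(num_regs + EXTRA_FIELDS) * switches_per_sec;
}
```

Under these assumptions, a processor with 8 registers stores 11 words per switch, while one with 32 registers stores 35 words per switch, roughly three times as many. Restoring the incoming task adds a comparable number of loads.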
One prior solution to the problem of lengthy context switches is the so-called "lightweight context switch". This idea is a partial solution that works only in the case of the current task being blocked (also called "voluntary block"). In the case of a voluntary block, the task is interrupting itself. Accordingly, it knows which registers it is using and causes the task record to contain the contents of those registers, and only those registers, which the task is actually using. The lightweight context switch offers no advantage for the other two types of context switches, i.e. time slice expiration and higher priority tasks being freed.
This leaves room for additional improvements that work more generally in the case of all types of context switches.