A kernel is the core of an operating system and provides basic services to the other parts of the operating system. Typically, a kernel (or any comparable center of an operating system) includes an interrupt handler for handling requests or completed I/O operations that compete for the kernel's services, and a scheduler that determines which processes (e.g., threads or programs) share the kernel's processing time, and in what order. The scheduler may also include a supervisor for giving use of the CPU to each process when it is scheduled. A kernel may also include a manager of the operating system's address spaces in memory or storage, sharing these among all components and other users of the kernel's services. A kernel's services are requested by other parts of the operating system or by applications through a specified set of program interfaces sometimes known as system calls. The most basic kernel process is called a thread.
In general, a thread is a sequence of central processing unit (CPU) instructions or programming language statements that may be independently executed. Generally, a thread executes an operating system task. During execution, a thread performs a series of operations in a CPU, typically using the registers of an execution unit of the CPU. Although many threads can be active at the same time, threads that require the same CPU resources cannot use those resources at the same time. Thus, in order to operate properly, the threads are prioritized.
In particular, threads requiring access to the registers of the CPU execution unit can be prioritized to permit higher priority threads to be executed before lower priority threads. The process of determining which thread will be executed next is referred to as scheduling. In those systems having prioritized threads, scheduling of the prioritized threads is typically performed by an operating system service called a scheduler. The scheduler operates under the control of a scheduling algorithm. A typical scheduler includes a run queue into which each thread is placed. Each thread is assigned an integer priority (i.e., one that is bounded above and below by system constants) which is used to determine its position in the queue.
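The run queue described above can be sketched as follows. This is only an illustrative model, not the scheduler of any particular operating system: the names RunQueue, enqueue, and pick_next, the bounds MIN_PRIORITY and MAX_PRIORITY, and the convention that a lower number means a higher priority are all assumptions made for the example.

```python
import heapq
from itertools import count

# Assumed system constants bounding the integer priority.
# Assumption: 0 is the highest priority, MAX_PRIORITY the lowest.
MIN_PRIORITY = 0
MAX_PRIORITY = 7


class RunQueue:
    """Hypothetical run queue ordered by bounded integer priority."""

    def __init__(self):
        self._heap = []
        # Sequence numbers break ties so that threads of equal
        # priority leave the queue in the order they entered it.
        self._seq = count()

    def enqueue(self, name, priority):
        """Place a thread in the queue at the given priority."""
        if not MIN_PRIORITY <= priority <= MAX_PRIORITY:
            raise ValueError("priority out of bounds")
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def pick_next(self):
        """Remove and return the highest priority thread, or None if empty."""
        if not self._heap:
            return None
        _, _, name = heapq.heappop(self._heap)
        return name

    def __len__(self):
        return len(self._heap)
```

For example, after enqueuing a thread "T2" at priority 1 and then a thread "T1" at priority 0, pick_next returns "T1" first, because 0 is treated as the higher priority in this sketch.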
FIG. 1 illustrates an example of a conventional run queue 100 of a scheduler. The run queue 100 is an ordered queue of n threads (thread 0 through thread n) having k priority levels (0 through k). During operation, the scheduler ensures that higher priority threads are executed before lower priority threads. Also, during thread execution, certain threads can be activated and preempt the operation of lower priority threads. For example, during operation, a lower priority thread is preempted when a higher priority thread becomes active. The preempted thread (having a lower priority) ceases operation until the active higher priority threads have completed their operations. For example, if an executing thread T2 (having a priority of 1) is preempted by a thread T1 (having a priority of 0), then the thread T1 (having the highest priority of 0) is placed at the head of the queue in the scheduler and is executed while thread T2 waits in the queue until it becomes the highest priority thread.
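The preemption example involving threads T1 and T2 can be modeled in a few lines. The sketch below is illustrative only: the function names make_ready and on_exit, the representation of a thread as a (priority, name) tuple, and the convention that a lower number is a higher priority are assumptions for the example.

```python
import bisect


def make_ready(running, waiting, newcomer):
    """Handle a thread becoming active; return the thread that now runs.

    Threads are (priority, name) tuples and a lower number means a
    higher priority (assumption). `waiting` is a list kept sorted so
    that its first element is the highest priority waiting thread.
    Simplification: ties within a priority level are ordered by name.
    """
    if running is None:
        return newcomer
    if newcomer[0] < running[0]:
        # The newcomer outranks the running thread: preempt it and
        # put the preempted thread back in the queue to wait.
        bisect.insort(waiting, running)
        return newcomer
    # Otherwise the newcomer waits its turn.
    bisect.insort(waiting, newcomer)
    return running


def on_exit(waiting):
    """The running thread has completed; return the next thread to run."""
    return waiting.pop(0) if waiting else None


waiting = []
running = (1, "T2")                                # T2 is executing
running = make_ready(running, waiting, (0, "T1"))  # T1 becomes active
# T1 now runs and T2 waits until T1 completes.
running = on_exit(waiting)                         # T1 completes, T2 resumes
```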
In the simplest example, the thread having the highest priority in the scheduler begins executing operations in accordance with its instructions, completes its operation, then ends. At that point the scheduler is consulted and the next highest priority thread is executed, and so on.
On the other hand, multiple threads may attempt to execute operations that involve the same CPU assets. Such a case is illustrated in FIGS. 2(a)–2(e). FIG. 2(a) is a timing diagram with the horizontal axis representing elapsed time t. Mapped on the vertical axis are threads T1 and T2, the operation of a real time operating system (RTOS), and the interrupt service routine (ISR).
At t0, thread T2 is operating normally. FIG. 2(b) figuratively illustrates the operation of thread T2 in a computing device having a CPU 1. A portion of the CPU 1 is dedicated to the execution of threads (execution unit 2). Additionally, the CPU 1 and execution unit 2 operate in conjunction with memory 3. The execution unit 2 comprises a plurality of registers upon which the thread operates. In one example, the CPU execution unit 2 can comprise 32 registers of 32 bits each. Of course, other configurations are known and used by those having ordinary skill in the art. The inventors contemplate that the later discussed principles of the invention can be practiced in conjunction with any of these architectures.
Referring back to the example of FIG. 2(a), at t1 an ISR interrupts the ordinary operation of thread T2. In response, at t2, a context save operation Cs is performed. The content of each of the registers of the execution unit 2 is saved by the CPU to a dedicated memory location designated for storing the context of thread T2. This process is referred to as context saving. The information contained in the registers of the execution unit 2 is referred to as the context. FIG. 2(c) figuratively illustrates an ordinary context saving operation. The current state (content) of all of the registers of the execution unit 2 (i.e., the entire T2 context 10) is saved from the execution unit 2 and stored to a dedicated location in memory associated with thread T2 (shown here by 4).
Referring again to FIG. 2(a), at time t3 the scheduler S is consulted in order to determine which thread currently has the highest priority. For example, referring to FIG. 1, thread T1 has the highest priority (here, 0). As a result, thread T1 is the next thread to be executed. With continuing reference to FIG. 2(a), at time t4 a context restore operation CR is performed for thread T1.
Referring now to FIG. 2(d), a figurative illustration of conventional context restoration is shown. A previously stored context for T1 is retrieved from its designated memory location 5. The T1 context 11 is then restored to the execution unit 2 of the CPU 1. At this point, thread T1 continues executing its instructions. With reference to FIG. 2(a), the thread T1 continues operation until it is completely executed. At time t5, thread T1 completes operation. At this point, the scheduler is consulted. The next highest priority thread in the run queue 100 is then selected for execution. In the example depicted in FIG. 2(a), the next highest priority thread is thread T2. A context restore operation for thread T2 is then performed. The T2 context 10 is retrieved from its dedicated memory location 4 and restored to the execution unit 2 of the CPU 1. At this point, the thread T2 resumes operation at the point where it was interrupted at time t1. Because thread T1 completed its operation, there is typically no need for the context of the T1 thread to be stored. However, when T1 threads are executed in the future and are interrupted, the T1 context will be stored to the dedicated memory location 5.
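The conventional save and restore sequence of FIGS. 2(a) through 2(d) can be modeled as follows. This is a simplified sketch under stated assumptions, not an actual kernel implementation: the register file is a plain list of 32 values, the memory is a dictionary keyed by thread name, and the names ExecutionUnit, save_context, restore_context, and demo, as well as all register values, are hypothetical.

```python
NUM_REGISTERS = 32  # example configuration from the text: 32 registers


class ExecutionUnit:
    """Hypothetical model of the execution unit's register file."""

    def __init__(self):
        self.registers = [0] * NUM_REGISTERS


def save_context(unit, memory, thread):
    """Copy every register to the memory location dedicated to `thread`."""
    memory[thread] = list(unit.registers)


def restore_context(unit, memory, thread):
    """Copy the stored context of `thread` back into every register."""
    unit.registers[:] = memory[thread]


def demo():
    unit = ExecutionUnit()
    # A previously stored T1 context (made-up values for illustration).
    memory = {"T1": [0x1111] * NUM_REGISTERS}

    # t0: T2 is running and has written to some registers.
    unit.registers[:8] = range(100, 108)
    t2_state_at_interrupt = list(unit.registers)

    # t1-t2: an ISR interrupts T2, so the entire T2 context is saved.
    save_context(unit, memory, "T2")
    # t3-t4: the scheduler selects T1 and its context is restored.
    restore_context(unit, memory, "T1")
    unit.registers[0] = 0xDEAD  # T1 executes and changes registers

    # t5: T1 completes; T2's context is restored and T2 resumes.
    restore_context(unit, memory, "T2")
    return t2_state_at_interrupt, list(unit.registers)
```

In this model the two lists returned by demo are equal, which reflects the point made above that thread T2 resumes exactly where it was interrupted. Note that save_context copies all 32 registers even though T2 wrote to only eight of them.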
Although this procedure is adequate for its intended purposes, it suffers from a number of drawbacks. First, it inefficiently manages CPU execution unit resources. For example, the entire context is saved regardless of the size of the thread being executed in the CPU execution unit, and regardless of which execution unit registers the thread has operated on. Thus, in an execution unit having 32 registers, if an active thread has operated on registers 0–7 but has not yet operated on registers 8–31, all of registers 0–31 are nevertheless saved, even though registers 8–31 contain no information relevant to the saved thread. These inefficiencies slow down the operation of the CPU, resulting in longer processing times for tasks operating in the CPU. Additionally, when a thread is preempted in mid-process (e.g., when an ISR occurs), disruptions may occur in the operation of the interrupted thread such that the thread becomes disabled. An undesirable consequence of a disabled thread is the possibility of system failure. Therefore, there is a need for thread operations which increase processing speed without disrupting the operation of interrupted threads. This need for increased performance and faster response time is especially acute in an RTOS of an embedded system.
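The cost of saving the entire context can be made concrete with a short calculation based on the example configuration of 32 registers of 32 bits each. The function name full_save_cost is hypothetical, and the calculation counts only the bytes copied; it is a rough illustration and not a measurement of any real CPU.

```python
NUM_REGISTERS = 32   # example configuration from the text
REGISTER_BITS = 32


def full_save_cost(registers_in_use):
    """Return (bytes_saved, bytes_relevant, wasted_fraction) for a
    conventional save of the entire context."""
    if not 0 <= registers_in_use <= NUM_REGISTERS:
        raise ValueError("registers_in_use out of range")
    bytes_per_register = REGISTER_BITS // 8
    # The conventional procedure always copies every register.
    bytes_saved = NUM_REGISTERS * bytes_per_register
    # Only the registers the thread has operated on hold relevant data.
    bytes_relevant = registers_in_use * bytes_per_register
    wasted_fraction = (bytes_saved - bytes_relevant) / bytes_saved
    return bytes_saved, bytes_relevant, wasted_fraction


# A thread that has operated only on registers 0-7:
print(full_save_cost(8))  # (128, 32, 0.75)
```

Under these assumptions, 128 bytes are copied on every context save although only 32 bytes belong to the saved thread, so three quarters of the copied data is irrelevant to it.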
In view of the foregoing, it should be apparent that improved mechanisms and frameworks for processing threads in a multi-threaded computer system would be desirable.