1. Field of the Invention
The present invention relates generally to computer operating systems and more particularly to a method for promoting and demoting between system calls and fast kernel entries.
2. Description of the Prior Art
Threads are programming constructs that facilitate efficient control of numerous asynchronous tasks. Since they closely map to the underlying hardware, threads provide a popular programming model for applications running on symmetric multiprocessing systems. As standard thread interfaces, such as the POSIX P1003.4a portable operating systems programming standard promulgated by the Technical Committee on Operating Systems of the IEEE Computer Society, become more common, an increasing number of portable applications employing threads are being written and more operating system vendors are providing thread support.
Threads can provide significant performance gains over sequential process execution. Applications that can take particular advantage of threads include, for example, database servers, real-time applications and parallelizing compilers.
Because kernel system calls are relatively slow compared to local thread operations, various techniques have been tried to minimize the use of system calls and thereby increase system performance. Some prior art thread implementations for UNIX-based systems are designed to minimize the number of calls into the UNIX kernel by developing local thread libraries in user memory space. Local threads are typically multiplexed onto a smaller number of kernel-level entities. In a simple implementation, all user-level threads are multiplexed onto a single kernel-level thread. In more sophisticated implementations, the number of kernel-level entities varies with the number of CPUs that are assigned to the particular process. Thread libraries typically require a complex algorithm to bridge the gap between the user address space thread library and the kernel information. Since data integrity constraints typically require that applications be split into multiple processes, and shared system services often reside in the kernel, multithreaded applications cannot avoid making substantial use of global, inter-address space thread operations in addition to local thread operations. Those thread operations that cannot be performed in local user address space must typically use relatively slow kernel system calls.
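By way of illustration only, the multiplexing of user-level threads onto a smaller number of kernel-level entities may be sketched as follows. The sketch is a simplified simulation and does not reproduce any particular prior art thread library: generators stand in for user-level threads, a round-robin ready queue stands in for the library scheduler, and the names `user_thread`, `run_multiplexed` and `num_kernel_entities` are hypothetical.

```python
from collections import deque


def user_thread(name, steps):
    """Hypothetical user-level thread: yields control back after each step."""
    for i in range(steps):
        yield f"{name}:{i}"


def run_multiplexed(threads, num_kernel_entities=1):
    """Round-robin user-level threads over a fixed number of kernel entities.

    Returns a trace of (kernel_entity_index, step_label) pairs. No system
    call is modelled here; every switch happens in "user space".
    """
    ready = deque(threads)  # ready queue kept by the user-level library
    trace = []
    slot = 0
    while ready:
        thread = ready.popleft()
        try:
            step = next(thread)  # run the thread until it yields
        except StopIteration:
            continue  # thread finished; do not requeue it
        trace.append((slot % num_kernel_entities, step))
        slot += 1
        ready.append(thread)  # requeue behind the other ready threads
    return trace
```

With `num_kernel_entities=1` the sketch corresponds to the simple implementation in which all user-level threads share a single kernel-level thread; a larger value loosely models an implementation in which the number of kernel-level entities follows the number of assigned CPUs.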
Other prior art systems have used primitives based in the kernel space. These kernel-based implementations take advantage of fast kernel trap instructions provided by commercially available reduced instruction set computers to rapidly access kernel primitives that implement fast interprocess communication and other operations. The overhead associated with a fast kernel trap instruction is typically an order of magnitude less than the overhead associated with a system call. In addition, kernel-based threads provide a number of advantages, such as good scalability, high reliability, optimal assignment of physical processors, minimal dispatch latency and more efficient inter-process synchronization.
Problems with many prior art systems employing fast kernel traps arise in the event that a complication, such as a software interrupt or a data access exception, occurs while the fast trap into the kernel is in progress. A data access exception could be caused by a bad memory address provided to the trap instruction by the user, or by a page fault. A page fault may be a read fault, caused by the addressed memory page not being resident in the main system memory, or a write fault, caused, for example, by an attempt to store a value in a write-protected memory location.
A fast kernel trap, by its nature, does not have the same ability to handle exceptions that is incorporated into the larger, slower kernel system call, and many prior art systems have used awkward or undesirable techniques to handle the situation. For example, at least one prior art system has utilized a nested exception handler to return a status code to the user indicating that a complication has occurred. In the event of a read fault, for example, the user application is forced to briefly “touch” the memory to cause the appropriate memory page to be retrieved from mass storage and placed in shared memory. This sort of solution is inconvenient to use. A method for handling complications in the kernel space without user intervention would be useful.
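The burden that such a status-code approach places on the user application may be illustrated by the following simplified simulation. It is a sketch under stated assumptions rather than the interface of any actual prior art system: `SimulatedMemory`, `fast_trap_read`, `read_with_touch_retry` and the `TrapStatus` codes are hypothetical names, and paging is modelled with ordinary dictionaries.

```python
from enum import Enum


class TrapStatus(Enum):
    OK = 0
    READ_FAULT = 1  # hypothetical status code returned to the user


class SimulatedMemory:
    """Toy paging model: the fast path can use only resident pages."""

    def __init__(self, pages):
        self.backing = dict(pages)  # page number -> contents in mass storage
        self.resident = {}          # pages currently in main memory

    def touch(self, page):
        # Ordinary user-mode access: the full fault handler pages the data in.
        if page not in self.resident:
            self.resident[page] = self.backing[page]
        return self.resident[page]


def fast_trap_read(mem, page):
    """Hypothetical fast trap: it cannot service a fault, so it reports one."""
    if page not in mem.resident:
        return TrapStatus.READ_FAULT, None
    return TrapStatus.OK, mem.resident[page]


def read_with_touch_retry(mem, page, max_retries=3):
    """User-side workaround: on a read fault, touch the page and retry."""
    for _ in range(max_retries):
        status, value = fast_trap_read(mem, page)
        if status is TrapStatus.OK:
            return value
        mem.touch(page)  # force the page to become resident
    raise RuntimeError("page still not resident after retries")
```

In this sketch every caller of the fast trap must wrap the call in a retry loop such as `read_with_touch_retry`, which is the inconvenience noted above; the fault is resolved only because the user application intervenes.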