Modern operating systems are capable of concurrently executing a plurality of different processes. Generally, an individual process is an executable program, defined by code and data. It has a private virtual address space, which is a set of virtual memory addresses that the process can use. In addition, a process has one or more execution threads.
Operating systems use virtual memory to isolate different processes from each other to a certain degree, and to thereby prevent one process from interfering with another. With virtual memory, a process is assigned its own, private virtual address space, which is generally not available to other processes. Through its virtual memory, a process has a logical view of memory that does not correspond to the actual layout of physical memory. Each time a process uses a virtual memory address, a virtual memory system translates it into a physical address using a virtual-to-physical address mapping contained in some type of look-up structure and address mapping database. The virtual-to-physical address mappings are unique for each address space. The virtual memory system prevents processes from directly accessing the virtual memory used by other processes or by the operating system.
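The per-address-space translation described above can be illustrated with a minimal sketch. The page size, the dictionary-based page table, and the names `AddressSpace`, `map_page`, and `translate` are assumptions made for illustration; they are not taken from any particular operating system.

```python
PAGE_SIZE = 4096  # assumed page size, chosen only for illustration


class AddressSpace:
    """A private virtual address space with its own virtual-to-physical map."""

    def __init__(self):
        # Hypothetical look-up structure: virtual page number -> physical frame.
        self.page_table = {}

    def map_page(self, virtual_page, physical_frame):
        self.page_table[virtual_page] = physical_frame

    def translate(self, virtual_address):
        # Split the address into a page number and an offset within the page.
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in self.page_table:
            raise MemoryError(f"unmapped virtual address {virtual_address:#x}")
        return self.page_table[page] * PAGE_SIZE + offset


# Two processes use the same virtual page, but each mapping is unique
# to its own address space, so they reach different physical memory.
process_a = AddressSpace()
process_b = AddressSpace()
process_a.map_page(0x10, 0x200)
process_b.map_page(0x10, 0x350)
```

In this sketch the same virtual address resolves to a different physical address in each process, and an address with no mapping cannot be reached at all, which is the isolation property the paragraph describes.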
As mentioned above, a process has one or more execution threads. Generally, an execution thread comprises a sequence of processor instructions that execute in a single processor context. The particular elements of a thread's context vary depending on the microprocessor being used. For purposes of the discussion herein, however, a thread's context always includes its private memory stack or stacks. Therefore, by definition, a single thread always uses the same private memory stack. Any time the processor context changes to a different memory stack, the processor is said to be executing a different thread. In many cases, a thread's context also includes the volatile contents of a set of processor registers.
A processor is capable of executing only one thread at a time. However, a multitasking operating system allows users to run multiple programs and processes, while appearing to execute all of them at the same time. It achieves this in the following way:
1. It runs a thread until the thread's execution is interrupted or until the thread must wait for a resource to become available.
2. It saves the thread's context.
3. It loads another thread's context.
4. It repeats this sequence as long as there are threads waiting to execute.
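The four steps above can be sketched as a simple round-robin loop. This is an illustrative model only: the `Thread` class, the `time_slice` parameter, and the use of a single program counter as the whole context are assumptions, and a fixed time slice stands in for an interrupt.

```python
from collections import deque


class Thread:
    def __init__(self, name, work_units):
        self.name = name
        self.work_units = work_units
        # Stand-in for the saved context (registers, stack pointer, etc.).
        self.context = {"pc": 0}


def run_all(threads, time_slice=2):
    """Run every thread to completion, returning a trace of (name, pc)."""
    ready = deque(threads)
    trace = []
    while ready:  # step 4: repeat while threads are waiting to execute
        thread = ready.popleft()
        cpu = dict(thread.context)  # step 3: load the thread's context
        # Step 1: run until "interrupted" (the time slice ends) or finished.
        steps = min(time_slice, thread.work_units - cpu["pc"])
        cpu["pc"] += steps
        trace.append((thread.name, cpu["pc"]))
        thread.context = cpu  # step 2: save the thread's context
        if cpu["pc"] < thread.work_units:
            ready.append(thread)  # not finished, so it waits for another turn
    return trace
```

For example, `run_all([Thread("A", 3), Thread("B", 2)])` interleaves the two threads, so that both make progress although only one runs at any instant.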
It is common for a computer processor and associated operating system to allow two different levels of resource availability. One level, referred to as a non-privileged mode or user mode, is used by application programs and other so-called "user" processes or programs. At this level, an execution thread is prevented by the operating system and by the computer processor from performing certain security-critical operations. The thread is also prevented from directly accessing many system resources. The purpose of the non-privileged execution mode is to isolate a user process as much as possible so that it cannot interfere with other user processes or with operating system functions. While a user process may itself crash, it should not be able to crash other programs or the operating system.
The other level of execution is referred to as privileged mode, system mode, or kernel mode. Critical operating system components are implemented in kernel mode. Kernel-mode components are responsible for things like virtual memory management, responding to interrupts and exceptions, scheduling execution threads, synchronizing the activities of multiple processors, and other critical or sensitive functions. Such components, which execute in system mode, are generally referred to collectively as "the kernel." The kernel is responsible for supervising the virtual memory system in most computer systems.
A user-mode execution thread gains access to the operating system by calling system services or functions. However, a conventional program call or jump instruction to system code is not allowed because the operating system is located in a protected virtual address space that cannot be directly accessed by user-mode threads. Rather, the calling user-mode thread utilizes a special processor instruction, typically called a trap instruction. A trap instruction causes an automatic jump to a privileged-mode trap handler. One or more arguments are usually provided by the calling process, indicating which system function is desired. These arguments are passed in processor registers or on a memory stack. The trap handler suspends the calling thread and schedules a new thread for execution of the system function. When the system function has completed, the operating system reschedules the calling thread.
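The difference between a conventional call and a trap can be modeled with a short sketch. The `Kernel` class, the service numbers, and the two sample system functions are hypothetical. The sketch models only the mode switch and the dispatch on the caller's argument; it does not model suspending the calling thread or scheduling a new one.

```python
class ProtectionFault(Exception):
    """Raised when user-mode code reaches protected system code directly."""


class Kernel:
    def __init__(self):
        self.mode = "user"
        # Hypothetical service table: service number -> system function.
        self.services = {0: self.sys_version, 1: self.sys_add}

    def _check_privilege(self):
        if self.mode != "kernel":
            raise ProtectionFault("system code called from user mode")

    def sys_version(self):
        self._check_privilege()
        return 1

    def sys_add(self, a, b):
        self._check_privilege()
        return a + b

    def trap(self, service_number, *args):
        """Stand-in for a trap instruction plus the privileged trap handler."""
        self.mode = "kernel"  # automatic switch to privileged mode
        try:
            if service_number not in self.services:
                raise ValueError(f"unknown service {service_number}")
            return self.services[service_number](*args)
        finally:
            self.mode = "user"  # return to the caller in user mode
```

Calling `Kernel().sys_add(2, 3)` directly fails with `ProtectionFault`, while `Kernel().trap(1, 2, 3)` reaches the same function through the trap path.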
The Windows.RTM. NT operating system is an example of an operating system that implements multiple threads. The Windows.RTM. NT operating system is described in Inside Windows NT, by Helen Custer, published by Microsoft Press, 1993, which is hereby incorporated by reference. As described in this book at page 230, a trap instruction results in a switch to privileged-mode execution and a jump to the kernel. The kernel copies the caller's arguments from the thread's user-mode stack to another stack that is referred to as the thread's kernel-mode stack. The kernel then executes the system service. Thus, a context change is made (changing stacks) in response to a trap instruction; in accordance with the terminology adopted herein, a different thread (using a different stack) is initiated within the kernel and is used for executing the kernel itself and the called system function.
As noted above, arguments can be passed to a system function on the memory stack of the calling thread. In addition, it is desirable to be able to pass pointer arguments to a system function so that data can be specified by reference. Accordingly, the kernel and system functions must be able to access the virtual address spaces of calling processes.
In the Windows.RTM. NT operating system, this is accomplished as illustrated in FIG. 1. The operating system utilizes a large virtual address space 20 of 4 GB. Each application has its own user address space 22, which is limited to the lower 2 GB of the overall 4 GB space. This lower 2 GB is accessible to both user-mode and kernel-mode threads. The upper 2 GB of the overall space forms a system address space 24. It is accessible only to kernel-mode threads. Components of the operating system are located in system address space 24. Although the mapping of user address space 22 changes according to which user-mode process is currently executing, the mapping of system address space 24 remains constant.
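The access rule implied by this layout can be written as a small check. The function name `can_access` and the string mode labels are assumptions for illustration; only the 2 GB boundaries come from the description above.

```python
GB = 1 << 30

USER_SPACE = range(0, 2 * GB)         # user address space 22 (lower 2 GB)
SYSTEM_SPACE = range(2 * GB, 4 * GB)  # system address space 24 (upper 2 GB)


def can_access(address, mode):
    """Return True if a thread running in `mode` may touch `address`."""
    if address in USER_SPACE:
        return True              # open to user-mode and kernel-mode threads
    if address in SYSTEM_SPACE:
        return mode == "kernel"  # open to kernel-mode threads only
    raise ValueError("address outside the 4 GB virtual address space")
```

Because kernel-mode threads can reach the lower 2 GB, a system function can follow a pointer argument supplied by the calling process.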
The Windows 95.RTM. operating system is described in Inside Windows 95, by Adrian King, published by Microsoft Press, 1994. This book is also incorporated by reference. The Windows 95.RTM. operating system utilizes a slightly different memory arrangement, as shown in FIG. 2. The lowest 1 MB of the virtual address space, referenced by numeral 26, is used for the currently executing MS-DOS process. Each such process also has a valid memory map within the 2 GB to 3 GB region, referenced by numeral 28. When a particular process is active, its memory map in region 28 is identical to the memory map of region 26. This mapping allows the operating system to address the memory of any MS-DOS process (by accessing an address in the 2 GB to 3 GB virtual memory region), regardless of whether the MS-DOS process is active. Pointers supplied to the operating system by the MS-DOS process, which reference region 26, are translated by the operating system so that they refer to the corresponding location in the 2 GB to 3 GB range 28.
The upper GB 30 of virtual memory is reserved for privileged-mode system components. The 4 MB to 2 GB region 32 is used as private or user virtual address spaces for 32-bit Windows applications. This region is used similarly to the user region of FIG. 1. Accordingly, it is accessible from the system components that execute from the upper GB region 30. However, the system region 30 is not accessible from any user virtual address space 32.
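The regions of FIG. 2 and the pointer translation for MS-DOS processes can be sketched as follows. The region boundaries come from the description above. The per-process `slot_base` parameter is an assumption: the text does not say how a process's map is placed within region 28, so the sketch simply treats the placement as a given base address. The 1 MB to 4 MB range is not described in the text and is left unclassified.

```python
MB = 1 << 20
GB = 1 << 30

# (label, start, end) with end exclusive, following the numerals of FIG. 2.
REGIONS = [
    ("26: current MS-DOS process", 0, 1 * MB),
    ("32: private space of 32-bit applications", 4 * MB, 2 * GB),
    ("28: memory maps of all MS-DOS processes", 2 * GB, 3 * GB),
    ("30: privileged-mode system components", 3 * GB, 4 * GB),
]


def region_of(address):
    """Return the label of the region holding `address`, or None."""
    for label, start, end in REGIONS:
        if start <= address < end:
            return label
    return None


def translate_dos_pointer(pointer, slot_base):
    """Map a region 26 pointer to the matching location in region 28.

    `slot_base` is a hypothetical base address of the process's map
    inside region 28.
    """
    if not 0 <= pointer < 1 * MB:
        raise ValueError("pointer does not reference region 26")
    if slot_base < 2 * GB or slot_base + 1 * MB > 3 * GB:
        raise ValueError("slot does not lie within region 28")
    return slot_base + pointer
```

A translated pointer stays valid whether or not the MS-DOS process that supplied it is currently active, because it no longer depends on the contents of region 26.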
While the techniques described above are effective in many situations, there are some disadvantages. One disadvantage, addressed by the invention described below, is that the practice of using different threads to execute kernel and system functions imposes significant processor and resource overhead. When switching to a different thread, that thread must either be pre-allocated or must be created on demand. Pre-allocating threads wastes memory, since a potentially large memory stack is associated with each thread. Creating threads on demand is more efficient from a memory standpoint, but wastes the processor time required to create the new thread and associated data structures.
Another disadvantage relates to using trap instructions for invoking system functions. While conventional high-level program compilers use a recognized format for function calls, this same format cannot be used for system function calls because of the requirement for issuing a trap instruction. Thus, a compiler must perform a call to a special library routine, implemented in assembly language, to issue a trap instruction. Special arguments must be pushed onto the memory stack (on top of function arguments) specifying which system function is desired.
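The extra step imposed on the compiler can be illustrated with a model of such a library routine. Everything named here is hypothetical: `SYS_WRITE`, `write_stub`, and `trap_handler` are invented for the sketch, a Python list stands in for the memory stack, and an ordinary function call stands in for the trap instruction.

```python
SYS_WRITE = 4  # hypothetical service number


def trap_handler(stack):
    """Privileged side: the top of the stack selects the system function."""
    service_number = stack.pop()
    if service_number == SYS_WRITE:
        length = stack.pop()
        text = stack.pop()
        return min(length, len(text))  # number of characters "written"
    raise ValueError(f"unknown service {service_number}")


def write_stub(text, length):
    """Library routine a compiler would call instead of the system function."""
    stack = []
    stack.append(text)       # ordinary function arguments
    stack.append(length)
    stack.append(SYS_WRITE)  # special argument pushed on top of them
    return trap_handler(stack)  # stands in for issuing the trap instruction
```

The caller sees an ordinary function, `write_stub("hello", 3)`, but every system call pays for the additional push and the indirection through the stub.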