1. Field of the Invention
The present invention relates generally to microprocessors and operating systems.
2. Description of the Background Art
It is common for a computer processor and associated operating system (OS) to have two different levels of resources and protection. One level is referred to as a non-privileged mode or user mode. This mode is typically used by various operating system components, application programs, and other so-called “user” processes or programs. At this level, an execution thread is prevented by the operating system and by the computer processor from performing certain security-critical operations. The thread is also prevented from directly accessing many system resources. The purpose of the non-privileged execution mode is to isolate a user process as much as possible so that it cannot interfere with other user processes or with operating system functions. While a user process may itself crash, it should not be able to crash other programs or the operating system.
The other level of execution is referred to as privileged mode, system mode, or kernel mode. Critical operating system components are implemented in kernel mode. Kernel-mode components are responsible for virtual memory management, responding to interrupts and exceptions, scheduling execution threads, synchronizing the activities of multiple processors, and other critical or sensitive functions. Such components, which execute in system mode, are sometimes referred to collectively as “the kernel.”
The kernel is responsible for supervising the virtual memory system in most computer systems. The virtual memory system is largely responsible for isolating processes from each other. With virtual memory, a process is assigned its own virtual address space, which is not available to other processes. Through its virtual memory, a process has a logical view of memory that does not correspond to the actual layout of physical memory. Each time a process uses a virtual memory address, the virtual memory system translates it into a physical address using a virtual-to-physical address mapping contained in some type of look-up structure and address mapping database.
FIG. 1 shows certain components of a conventional computer system. The illustrated components of the computer system 40 include a microprocessor 41 and a computer-readable storage medium such as memory 42. Although FIG. 1 shows only a single processor, the system might include multiple processors. The multiple processors may be used by multiple different processes or tasks, each having one or more execution threads. The terms task and process as used in this description refer to executing entities within a computer system having their own virtual address spaces. A thread is characterized as having its own context (including registers and memory stack) and as being independently subject to the kernel's scheduling algorithms. The computer system 40, of course, also includes other components that are not shown.
In accordance with conventional computer systems, the computer system 40 includes an operating system and one or more application or user programs that execute in conjunction with the operating system. FIG. 1 shows a portion 43 of the operating system referred to as the kernel, and a single application or user process 44. Although only one user process is shown, a plurality of user processes typically execute from memory 42. As described above, the processor 41 may operate using privileged and non-privileged execution modes. User processes and threads typically run in the non-privileged execution mode, and make calls to system or kernel functions that execute in the privileged execution mode. Additional kernel functions and threads also run in the privileged execution mode to deal with memory faults and other interrupt-based events in accordance with conventional operating system characteristics.
Memory in a computer typically comprises a linear array of bytes. Each byte has a unique address known as its physical address. However, many microprocessors do not address memory by its physical address. Instead, memory is addressed using virtual memory addresses. A virtual memory address, commonly known as a virtual address, is the address of a location in virtual memory.
Virtual memory addressing is a technique used to provide the illusion of having a memory space that is much larger than the physical memory available in a computer system. This illusion allows a computer program to be written without regard to the exact size of physical memory. One benefit of virtual memory addressing is that a computer program can easily run on a computer with a wide range of memory configurations and with radically different physical memory sizes. Another benefit is that a computer program may be written that uses a virtual memory size that is much larger than the physical memory available on a particular computer system.
Virtual memory may be thought of as a collection of blocks. These blocks are often of fixed size and aligned, in which case they are known as pages. A virtual address may often be broken down into two parts, a virtual page number and an offset. The virtual page number specifies the virtual page to be accessed. The offset indicates the number of memory bytes from the first memory byte in the virtual page to the addressed memory byte. Physical addresses, which represent where data actually resides in physical memory, may also be broken down into two parts, a physical page number and an offset. The physical page number specifies the physical page to be accessed. The offset indicates the number of memory bytes from the first memory byte in the physical page to the addressed memory byte.
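The decomposition of an address into a page number and an offset may be illustrated by the following minimal sketch. The 4096-byte page size and the names PAGE_SIZE, split_address, and join_address are assumptions chosen for illustration only and are not taken from any particular microprocessor.

```python
from typing import Tuple

PAGE_SIZE = 4096  # assumed page size in bytes (must be a power of two)
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # 12 offset bits for 4096-byte pages


def split_address(address: int) -> Tuple[int, int]:
    """Split an address into (page number, offset within the page)."""
    page_number = address >> OFFSET_BITS   # high-order bits select the page
    offset = address & (PAGE_SIZE - 1)     # low-order bits select the byte
    return page_number, offset


def join_address(page_number: int, offset: int) -> int:
    """Rebuild an address from a page number and an offset."""
    return (page_number << OFFSET_BITS) | offset
```

The same split applies to virtual and physical addresses alike. Translation replaces only the page number, and the offset is carried over unchanged.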
A virtual address must be mapped into a physical address before physical memory may be accessed. The mapping is often maintained through a table, known as a page table. The page table contains virtual to physical memory translations. A virtual to physical memory translation consists of a virtual page number and a corresponding physical page number. Because virtual addresses are typically mapped to physical addresses at the level of pages, the page table may be indexed by virtual page numbers. In addition to virtual to physical memory translations, the page table may often contain other information such as the disk locations where pages are stored when not present in main memory and an indication of whether pages are present in memory or residing on a disk. Typically, the operating system inserts and deletes the virtual to physical memory translations that are stored in the page table. In other words, the page table is managed by the operating system.
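A page table of the kind described above may be sketched as follows. The sketch is hypothetical: the dictionary-based table, the names PageTableEntry, PageFault, and translate, and the 4096-byte page size are illustrative assumptions rather than the layout used by any particular operating system.

```python
from dataclasses import dataclass
from typing import Dict, Optional

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class PageTableEntry:
    physical_page: Optional[int]         # physical page number, None if not resident
    present: bool                        # True if the page is in main memory
    disk_location: Optional[int] = None  # where the page lives when paged out


class PageFault(Exception):
    """Raised when a virtual page has no resident physical page."""


def translate(page_table: Dict[int, PageTableEntry], virtual_address: int) -> int:
    """Map a virtual address to a physical address using the page table."""
    virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table.get(virtual_page)  # the table is indexed by virtual page number
    if entry is None or not entry.present:
        # In a real system the operating system would handle this fault,
        # for example by reading the page in from entry.disk_location.
        raise PageFault(virtual_page)
    return entry.physical_page * PAGE_SIZE + offset
```

For example, with virtual page 2 mapped to physical page 7, virtual address 2 * 4096 + 5 translates to physical address 7 * 4096 + 5.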
Without further measures, virtual memory addressing requires two memory accesses to fetch a single data item from memory. The first access is into the page table. This access is used to map the virtual address into the physical address. After the physical address is known, a second access is required to fetch the data. In an effort to speed up memory accesses, conventional microprocessors use a special-purpose cache memory to store certain virtual to physical memory translations. This special-purpose cache memory is often called a translation lookaside buffer (TLB). The number of virtual to physical memory translations in a TLB is typically smaller than the total number of translations in the page table.
When a microprocessor addresses memory through a TLB, the virtual page number that is included in the virtual address is used to interrogate the TLB. If the virtual page number is stored in the TLB, then the TLB outputs the physical page number that maps to the virtual page number. Sometimes the TLB does not contain the virtual page number. This is known as a TLB miss. When a TLB miss occurs, the microprocessor typically requests the operating system to supply the physical page number from the page table. After the operating system supplies the physical page number, the physical memory is addressed. When the operating system supplies the physical page number, an undesirable delay occurs.
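The hit and miss behavior described above may be illustrated by the following minimal sketch. The class name TLB, the use of a plain dictionary to stand in for the page table managed by the operating system, and the least-recently-used replacement policy are all assumptions made for illustration. The description above does not specify a replacement policy.

```python
from collections import OrderedDict
from typing import Dict


class TLB:
    """A small cache of virtual-to-physical page number translations."""

    def __init__(self, capacity: int, page_table: Dict[int, int]) -> None:
        self.capacity = capacity
        self.page_table = page_table  # stands in for the OS-managed page table
        self.entries: "OrderedDict[int, int]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, virtual_page: int) -> int:
        """Return the physical page number for a virtual page number."""
        if virtual_page in self.entries:
            self.hits += 1
            self.entries.move_to_end(virtual_page)  # mark as most recently used
            return self.entries[virtual_page]
        # TLB miss: fall back to the (slow) page table lookup.
        self.misses += 1
        physical_page = self.page_table[virtual_page]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used entry
        self.entries[virtual_page] = physical_page
        return physical_page
```

The miss counter corresponds to the occasions on which the slower page table path is taken, which is the source of the delay noted above.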
A memory stack is a region of reserved memory in which programs store status data such as procedure and function call return addresses, passed parameters, and local variables. The microprocessor, the program, and the operating system can all maintain one or more separate memory stacks.
Logically, a stack may comprise a memory structure organized, for example, as a LIFO (last in, first out) list such that the last data item added to the structure is the first item used. A program can put data onto the stack by executing microprocessor instructions. For example, a “push” instruction typically writes a specified microprocessor register to the stack. A “pop” instruction reads data from the stack. The microprocessor often writes to and reads from the stack automatically in response to certain program flow instructions and other events such as memory faults or interrupts.
As shown in FIG. 2, a memory stack (whether a user memory stack or a kernel memory stack) is typically implemented as a region of virtual memory beginning at a stack base or stack base address. FIG. 2 shows a portion of virtual memory from top to bottom in order of increasing virtual memory addresses. The memory stack is indexed by a pointer referred to as the “stack pointer.” When writing to the stack, the microprocessor decrements the stack pointer to the next available address, and then writes the specified data to that address. When reading from the stack, the microprocessor reads from the virtual memory location currently referenced by the stack pointer, and then increments the stack pointer.
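The write and read sequences described above may be sketched as follows. The sketch assumes, for simplicity, a word-addressed memory in which each stack slot occupies one address. The class name MemoryStack and its size limit are hypothetical.

```python
from typing import Dict


class MemoryStack:
    """A stack that grows toward lower addresses from a stack base."""

    def __init__(self, stack_base: int, size_words: int) -> None:
        self.stack_base = stack_base
        self.limit = stack_base - size_words  # lowest usable address
        self.stack_pointer = stack_base       # an empty stack points at the base
        self.memory: Dict[int, int] = {}      # address -> stored word

    def push(self, value: int) -> None:
        """Decrement the stack pointer, then write at the new address."""
        if self.stack_pointer - 1 < self.limit:
            raise OverflowError("stack overflow")
        self.stack_pointer -= 1
        self.memory[self.stack_pointer] = value

    def pop(self) -> int:
        """Read at the stack pointer, then increment the stack pointer."""
        if self.stack_pointer == self.stack_base:
            raise IndexError("pop from empty stack")
        value = self.memory.pop(self.stack_pointer)
        self.stack_pointer += 1
        return value
```

Because the most recently written item is always the one at the stack pointer, items are read back in last in, first out order.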
A register stack may comprise a number of general registers of a microprocessor, which have been designated for the storage of data required by, and pertaining to, procedures of a stored program being executed by the microprocessor. Specifically, upon execution of a particular procedure, a stack frame associated with that procedure is created within the register stack. The stack frame includes saved registers (containing variables local to the relevant procedure), an activation record, and a frame marker. When a procedure calls a further procedure, the called procedure in turn creates a further stack frame on top of the stack frame for the calling procedure. Accordingly, for a program in which a number of nested procedures are being executed, the register stack may include a corresponding number of stack frames.
A register stack accordingly allows multiple procedures to effectively share a large register file by stacking associated stack frames in both on-chip registers and off-chip memory. The call/return patterns of typical programs exhibit high call/return frequencies with small amplitudes. A register stack significantly reduces the number of stores (i.e., register saves) at procedure calls, and reduces the number of loads (i.e., register restores) at procedure returns.
In accordance with some microprocessors, a register stack engine (RSE) may be used to manage the register stack, saving and restoring the physical registers to and from memory as needed. The memory allocated for the dynamic reading and writing of the registers by the RSE is sometimes called the backing store.
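The saving and restoring of registers to and from a backing store may be illustrated by the following highly simplified sketch. It is not a model of any particular microprocessor: it assumes that whole stack frames, rather than individual registers, are spilled and filled, and the names RegisterStack, call, and ret are hypothetical.

```python
from typing import List


class RegisterStack:
    """Toy register stack that spills whole frames to a backing store."""

    def __init__(self, num_physical: int) -> None:
        self.num_physical = num_physical
        self.resident: List[List[int]] = []       # frames in registers, oldest first
        self.backing_store: List[List[int]] = []  # frames saved to memory, oldest first
        self.spills = 0
        self.fills = 0

    def _registers_in_use(self) -> int:
        return sum(len(frame) for frame in self.resident)

    def call(self, frame_size: int) -> List[int]:
        """Allocate a frame for a called procedure, spilling if needed."""
        if frame_size > self.num_physical:
            raise ValueError("frame larger than the physical register file")
        # Spill the oldest resident frames until the new frame fits.
        while self._registers_in_use() + frame_size > self.num_physical:
            self.backing_store.append(self.resident.pop(0))
            self.spills += 1
        frame = [0] * frame_size
        self.resident.append(frame)
        return frame

    def ret(self) -> None:
        """Discard the current frame and restore the caller's frame if spilled."""
        if not self.resident:
            raise IndexError("return with no active frame")
        self.resident.pop()
        if not self.resident and self.backing_store:
            # The caller's frame was spilled earlier: fill it back in.
            self.resident.append(self.backing_store.pop())
            self.fills += 1
```

In this sketch, shallow call and return sequences that fit within the physical registers cause no spills or fills at all, which reflects the reduction in stores and loads attributed to a register stack above.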