The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Physical memory of a computing system is divided into allocation units called “pages.” These pages are distributed amongst the several processes executing on a given system. Some pages are allocated to the kernel, and therefore such pages are referred to as kernel pages, while other pages are allocated to one or more user processes (e.g., applications), and therefore such pages are referred to as user pages. Each physical page is the same size in bytes. For example, in some computing systems, each physical page is 8 KB long. Each physical page has a unique page frame number (PFN). A physical page's PFN may be determined by dividing the starting physical memory address of that physical page by the page size. Thus, in a system in which each physical page contains 8192 bytes, the PFN of the physical page that contains physical memory addresses 0 through 8191 is 0, the PFN of the physical page that contains physical memory addresses 8192 through 16383 is 1, and the PFN of the physical page that contains physical memory addresses 16384 through 24575 is 2. When a process requests memory, the memory is allocated in multiples of whole physical pages.
In addition to physical memory, computing systems may use virtual memory (VM) and a Virtual Memory Subsystem (hereinafter “VMS”) as part of managing allocation of the system's physical memory. VM uses slower storage media (usually disk) to store data that does not fit within the physical memory of the system. This enables programs larger than the size of physical memory to execute. Ideally, the VMS keeps frequently used portions of memory within physical memory and less frequently used portions on the slower secondary storage.
The VMS provides a virtual view of memory, known as an address space, while the VMS transparently manages the virtual storage between RAM and secondary storage. In many computing systems that employ a virtual memory management scheme, virtual memory address space is segregated into “user” virtual memory address space and “kernel” virtual memory address space. Each executing user process has its own virtual memory address space allocated within the user virtual memory address space. The system kernel has its own kernel virtual memory address space. Physical pages of memory are mapped onto these address spaces. Some physical pages are mapped into the user virtual memory address space, and some physical pages are mapped into the kernel virtual memory address space. Inasmuch as multiple user processes may share the same data, some of the virtual memory address space of each of two or more user processes may be mapped to the same physical pages. In fact, a physical page that is mapped to user virtual memory address space may be concurrently mapped to kernel virtual memory address space, at least temporarily.
Each virtual-to-physical page mapping may have a corresponding entry in a Translation Lookaside Buffer (TLB), which is typically implemented in hardware. Usually, when a process attempts to access data at a particular virtual address, it invokes the VMS. The VMS first attempts to find the relevant virtual-to-physical page mapping in the TLB, using the virtual address as a key. If the VMS cannot find a relevant, valid mapping in the TLB (a circumstance called a “TLB miss”), then the VMS attempts to find a relevant, valid mapping in a Translation Storage Buffer (TSB), which is similar in structure to the TLB, but larger and slower, and typically implemented in software. If the VMS cannot find a relevant, valid mapping in the TSB (a circumstance called a “TSB miss”), then the VMS attempts to find a relevant, valid mapping in “page tables,” which are implemented as hash tables. If the VMS cannot find a relevant, valid mapping in the page tables (a circumstance called a “page fault”), then the VMS invokes a mechanism called the “page fault handler.” The page fault handler locates a relevant, valid mapping using information within kernel internal tables, which may refer to persistent storage. Significantly, the kernel internal tables are stored in physical pages that are mapped to the kernel virtual memory address space.
A computing system may comprise multiple system boards. Each system board may comprise one or more CPUs and some physical memory. Each system board has a different range of physical memory addresses that do not overlap with any other system board's range of physical memory addresses.
Sometimes, a particular system board may be experiencing errors. Under such circumstances, it may be desirable to remove that system board from the computing system.
A large computing system may be logically divided into multiple separate domains. Each domain may be allocated one or more system boards. Each domain may be used by a different group of users for different purposes. For example, one domain might be used to run a web server. Another domain might be used to run a database.
At some point in time, it may become desirable to change the allocation of system boards to domains. Under some circumstances, it might be desirable to change the allocation on a regular basis (e.g., daily), automatically and dynamically. It is better for such reallocation to be performed with minimum disruption to the computing system and the processes executing thereon. For example, it is better for such reallocation to be performed without shutting down and rebooting the entire computing system, because rebooting the entire computing system can be a relatively time-consuming process. Usually, user processes cannot execute during much of the time that a computing system is rebooting.
Whenever a system board is going to be removed from a computing system, or whenever a system board is going to be allocated to a different domain, the data stored in that system board's physical pages needs to be relocated to the physical pages of another system board. Relocation involves moving the data that is stored in one set of physical pages to another set of physical pages.
In the case of user physical pages and user virtual address space, this relocation may be readily accomplished. The virtual-to-physical mapping for the physical page is unloaded, the contents of that physical page are copied to another physical page, and the mapping is revised to point to the new physical page's address; when a process next accesses the user virtual address, it page faults and the new mapping is loaded. The virtual address remains the same while the physical address changes. In the case of kernel pages and kernel virtual address space, however, special care must be taken.
According to current approaches, a page fault handler is not invoked in response to a page fault that involves a mapping of a physical page to the kernel virtual memory address space. This is because the kernel internal tables that contain the mapping for which the page fault handler would be searching are stored in a physical page that is, itself, mapped to the kernel virtual memory address space. If the contents of that physical page were currently being relocated, then the virtual memory subsystem would not be able to locate a valid virtual-to-physical page mapping for that physical page in the TLB, the TSB, or the page tables; all of the entries containing that mapping would have been invalidated due to the relocation. An unending recursive cascade of page faults and page fault handler invocations would likely result, overflowing the kernel stack and causing the entire computing system to fail.
Consequently, under current approaches, all of the kernel pages are confined to a limited subset of all of the system boards in a computing system, to accommodate the possibility that one or more of the system boards in that subset might be replaced at some point in time. Kernel physical pages are non-relocatable under current approaches. Optimally, they are allocated contiguously on the limited subset of boards, because doing so makes the process of removing other system boards easier. If kernel allocations existed on all boards, those allocations could not be moved, and none of the boards could be removed.
This confinement of kernel pages to a limited subset of all of the system boards has some negative consequences. Thousands of user processes might be concurrently executing on various system boards. At any given moment, many of these user processes may cause accesses to the kernel pages (e.g., as a result of page faults). Because all of the kernel pages are located on the same limited subset of system boards under current approaches, the input/output resources of the system boards in the limited subset are often subject to heavy system bus contention. The overall performance of the entire computing system may be degraded as a result of system bus contention.
Under approaches such as those disclosed in the Related Applications, kernel pages may be relocated in a manner similar to techniques used for relocating the contents of user pages. This relocation can cause negative consequences in certain circumstances. For example, certain device drivers may seek to access memory directly, a technique known as Direct Memory Access (DMA). DMA is a technique for transferring data between main memory and a device without passing it through the CPU. Computers that have DMA channels can transfer data to and from devices much more quickly than computers without a DMA channel. This is useful for making quick backups and for real-time applications. Some expansion boards, such as CD-ROM cards, are capable of accessing the computer's DMA channel.
These drivers may allocate kernel pages, for example to perform data transfer. Kernel pages may be either solely accessed by virtual addresses or accessed by both virtual addresses and physical addresses. If the kernel pages are accessed solely by virtual addresses, the VMS will intercept DMA requests to those pages and redirect each request to the appropriate location. If the kernel pages are accessed by physical addresses, and those pages have been relocated or freed, a DMA request will access the wrong physical address, resulting in data corruption or the cascade of page faults described earlier.
In order to more accurately and efficiently relocate and free memory, and thereby enhance overall computing system performance, techniques are needed for allowing the identification of memory as relocatable or non-relocatable and acting on the resulting identification.