1. Field of the Invention
The present invention relates to software-based, distributed caching of memory object pages.
2. Related Art
Conventional computer systems include uni-processor computer systems, shared memory, symmetric multi-processing (SMP) systems and multi-processor, non-uniform memory access (NUMA) systems. NUMA systems include distributed shared memory (DSM) multi-processor systems. SMP systems are also known as uniform memory access systems. Uni-processor systems and SMP systems generally employ a single main memory. In SMP systems, the single memory is shared by the multiple processors. In DSMs, main memory is physically distributed among a plurality of processing nodes so that each node has some portion of main memory physically located adjacent to, or within, the processing node.
Conventional computer operating systems typically divide main memory into pages of physical memory. Conventional computer operating systems also typically generate a separate page frame data structure (PFDAT) to represent each page of physical memory. Each page frame data structure stores identification and state information for the page of memory it represents.
Conventional operating systems generate data structures to represent memory objects. Memory objects are objects that can be mapped into an address space. Memory objects can include regular files as well as anonymous memory objects such as, for example, stacks, heaps, UNIX system V shared memory and /dev/zero mappings. A memory object can be backed by a disk file and, hence, can be larger than physical memory. The operating system manages which portion of the memory object occupies memory at any given time.
Computer operating systems need to know which portion of a memory object is contained in a particular page of memory. Thus, when a page of a memory object is stored in a page of physical or main memory, the PFDAT that is associated with the page of physical memory stores a memory object identification and logical offset. The memory object identification can be a pointer to the data structure that represents the memory object. The logical offset identifies the portion of the memory object that is stored, relative to the beginning of the memory object. This information is used by the operating system to identify which PFDATs are associated with which memory objects and vice versa.
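As a concrete illustration only, a minimal PFDAT of the kind described above might be sketched in C as follows. The structure and field names here are hypothetical and are not drawn from any particular operating system:

```c
#include <stddef.h>

/* Hypothetical data structure representing a memory object
 * (e.g., a regular file or an anonymous memory object). */
struct memory_object {
    int id; /* illustrative identifier */
};

/* Hypothetical PFDAT: one per page of physical memory. */
struct pfdat {
    struct memory_object *object; /* memory object whose page is stored here */
    size_t offset;  /* logical offset of the page within the object */
    unsigned flags; /* page state information (e.g., dirty, locked) */
};

/* Returns nonzero if this PFDAT holds the given page of the given object. */
int pfdat_matches(const struct pfdat *p,
                  const struct memory_object *obj, size_t offset)
{
    return p->object == obj && p->offset == offset;
}
```

A search of the kind discussed below then reduces to testing each candidate PFDAT with a predicate such as `pfdat_matches`.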
Operating systems can check or update state information in PFDATs. Operating systems can also search PFDATs for memory object identifications and logical offsets to determine, for example, whether a given page of a memory object is in memory. When a user directs the operating system to delete a memory object, for example, any pages in memory that are associated with the memory object must be found and de-allocated. Thus, a search is performed to find all PFDATs that have a pointer to the memory object. Similarly, if an operation is directed to a specific page of a memory object, a search is performed on PFDATs to determine which, if any, include the corresponding logical offset.
In order to permit searching of PFDATs, conventional operating systems organize PFDATs into a global structure, generally called a page cache or global page cache. Global page caches can be implemented through any of a variety of techniques such as, for example, a linked list of PFDATs, a hash table of PFDATs, a tree structure of PFDATs, etc. The operating system can perform a variety of page cache operations on the global page cache. For example, when an action is to be performed on a page of a memory object, the global page cache is searched to determine whether any PFDATs correspond to the particular page of the memory object.
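One of the implementation techniques mentioned above, a hash table of PFDATs, could be sketched as follows. This is an illustrative sketch only; the bucket count, hash function, and all names are hypothetical:

```c
#include <stddef.h>

#define PCACHE_BUCKETS 256

struct memory_object { int id; }; /* hypothetical memory object */

/* Hypothetical PFDAT, here carrying a hash-chain link. */
struct pfdat {
    struct pfdat *next;           /* next PFDAT in the same bucket */
    struct memory_object *object; /* owning memory object */
    size_t offset;                /* logical offset within the object */
};

/* The global page cache: one hash table shared by the whole system. */
static struct pfdat *page_cache[PCACHE_BUCKETS];

/* Hash a (memory object, logical offset) pair to a bucket index. */
static unsigned pcache_hash(const struct memory_object *obj, size_t offset)
{
    return (unsigned)((((size_t)obj >> 4) ^ (offset >> 12)) % PCACHE_BUCKETS);
}

/* Enter a PFDAT into the global page cache. */
void pcache_insert(struct pfdat *p)
{
    unsigned b = pcache_hash(p->object, p->offset);
    p->next = page_cache[b];
    page_cache[b] = p;
}

/* Find the PFDAT for a given page of a memory object, or NULL if the
 * page is not in memory. */
struct pfdat *pcache_lookup(const struct memory_object *obj, size_t offset)
{
    unsigned b = pcache_hash(obj, offset);
    for (struct pfdat *p = page_cache[b]; p != NULL; p = p->next)
        if (p->object == obj && p->offset == offset)
            return p;
    return NULL;
}
```

Note that deleting a memory object still requires visiting every bucket to find all of its PFDATs, which is one source of the search costs discussed below.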
In multi-processor systems, some method of mutual exclusion is provided for controlling access to the global page cache. One such mechanism is a global lock. Many other suitable mechanisms are known to those skilled in the art and the exact method employed does not affect the present invention.
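A global lock of the kind just described might be sketched as follows, using POSIX threads purely for illustration; the text above does not specify any particular locking primitive, and the names here are hypothetical:

```c
#include <pthread.h>

/* One lock serializing every operation on the global page cache.
 * Every processor must acquire this same lock, so page cache
 * operations from different processes queue behind one another. */
static pthread_mutex_t page_cache_lock = PTHREAD_MUTEX_INITIALIZER;

static int cached_pages; /* stand-in for the real page cache state */

/* A page cache operation: acquire the global lock, perform the
 * search or update, release the lock. */
void pcache_op_add_page(void)
{
    pthread_mutex_lock(&page_cache_lock);
    cached_pages++; /* the actual search/update would happen here */
    pthread_mutex_unlock(&page_cache_lock);
}

int pcache_page_count(void)
{
    pthread_mutex_lock(&page_cache_lock);
    int n = cached_pages;
    pthread_mutex_unlock(&page_cache_lock);
    return n;
}
```

The single shared lock is what produces the waiting described next: while one process holds it, all other page cache operations stall.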
As a result of the mutual exclusion mechanism, however, each process which needs to perform a global page cache operation must wait until the global page cache is free or unlocked. Moreover, as the size of main memory grows, the number of PFDATs which must be searched increases. Thus, as the number of PFDATs increases, the length of time that processors spend waiting for searches to complete also increases. Increased latency of page cache operations limits the scalability of multi-processor systems.
In DSMs, there are additional drawbacks to global page caching. For example, the global page cache is stored in a portion of physical memory on one of the processing nodes of the DSM system. When a distant processor needs to perform a page cache operation, access must first be made to the node that stores the global page cache. This increases latency of page cache operations.
Global page caches in DSMs also cause hot spots to occur at the processing node that stores the global page cache. These hot spots are due to the frequency of accesses by many different processes. Latency of page cache operations thus increases due to contention on limited-bandwidth interfaces. As more processors are added, contention and, therefore, memory latency increase. Conventional global page caching techniques and global mutual exclusion mechanisms thus limit the number of processors and processes which can be effectively handled by conventional DSMs and, ultimately, limit scalability.
What is needed is a system, method and computer program product for reducing memory access latency that results from global page cache operations and for reducing contention caused by global mutual exclusion mechanisms.