1. Field of the Invention
The present invention relates generally to allocation of memory in a computer system and more particularly to a system, method, and computer program product for dynamically allocating large memory pages.
2. Related Art
As the speed and capacity of computer hardware systems continue to increase, so have the demands of application programs that run on them. Recently, a new breed of application programs have emerged that require huge amounts of main memory during execution. For example, it is not unusual for database and data mining applications to demand thousands of megabytes of memory allocation during program execution.
Generally, most operating systems manage memory in terms of individual memory pages. More specifically, memory in the computer system is divided into a number of memory pages. Such memory pages typically have a fixed size such as 4 or 16 Kilobytes (KB) per page. Accordingly, when an application program requests a specific amount of memory during program execution, a particular number of memory pages are reserved for that purpose.
For example, in a system with 16KB memory pages, when an application program requests 256 KB of memory, the operating system reserves 16 pages of physical memory for the application program making the request.
Once a memory page is allocated for a particular process, an entry is added to a page table. Generally, each process has an associated page table, although this may not be the case in all systems. The page table entry maps a physical memory page to the virtual address space associated with the process. Accordingly, when the process needs to access the allocated memory page, the page table entry is consulted to find the corresponding physical address for the requested memory page.
In order to increase overall system performance, most computer systems employ a cache system for frequently and or most recently used page table entries (PTEs). One example of such a cache system is a translation look aside buffer (TLB). TLBs are generally implemented in hardware and comprise information similar to that in the page table. That is, like the page table, TLBs are used to map the virtual address of memory pages to the physical address of memory pages. Because the TLBs are implemented in hardware, address translations are generally much faster than translations using page tables, which are generally implemented in software.
Accordingly, many computer systems maintain translations associated with the most recent and/or the most frequently accessed memory pages in TLBs.
As such, whenever a process needs to access a memory page, the system-wide TLB is searched first. The much slower process of searching (or "walking through") the page table is performed only when the desired translation is not found within the TLB.
When an address translation is found within the TLB, a TLB cache hit or more simply, a TLB hit occurs. When an address translation is not found within the TLB, and the system must resort to a page table walk-through, a TLB miss occurs. For increased system performance it is clearly desirable to minimize TLB misses and maximize TLB hits.
However, because the TLB is implemented in hardware, it generally has a limited capacity that cannot be changed. For example, a MIPS R10000 RISC microprocessor, manufactured by MIPS Technology, Inc., has a capacity of 64 TLB entries, wherein each TLB entry comprises 2 sub-entries. The sub-entries are related to one another in that they map adjacent memory pages. Thus, at most, such a TLB has a capacity to store 128 page translations. For a system that defines fixed sized 16 KB memory pages, this translates into a TLB range of 16 KB * 128, or 2 megabytes (2 MB).
For applications requiring a large amount of memory pages, the TLB can quickly become exhausted thereby causing an inordinate amount of TLB misses. For example, suppose during program execution, a large database program allocates a total of 10 MB of memory. This translates into 640 16 KB memory pages. In a system with a TLB capacity of 128 entries, at most, accesses to 128 pages would generate a TLB hit and access to the remaining 512 pages would generate TLB misses. Thus, for such applications serious performance problems occur.
In order to alleviate such performance problems it is desirable to increase the page size for certain applications. An example of such an application is a large database program. These programs typically have large working data sets comprising multiple database tables, indexes and the like.
Having a large page size defined for such applications can cause a significant increase to overall system performance. First, large size pages significantly reduces the number of page faults an application has to go through to initially get its working data set in memory. For example, if a system or virtual system (VM) supports a 1 MB page size, then an application having a 10 MB data size would just need 10 page faults instead of 640 (for a system with a 16 KB page size) to get its working set into memory.
Second, large size pages significantly reduces the number of TLB misses during program execution. For example, in a virtual machine that supports a 1 MB page size, an application having a 10 MB data size would require only 10 TLB entries for the entire 10 MB of memory. In contrast, for a VM that only supports a 16 KB page size, the entire TLB would be quickly exhausted (i.e. with the first 128 entries), thereby causing TLB misses for the remaining 512 pages.
Therefore, what is needed is a system, method and computer program product for dynamically allocating large memory pages of different sizes for processes that require them. Such large pages need to be allocated dynamically and as quickly as possible to avoid decreases in system performance.