This invention relates in general to the management of pages for improved performance of an application program module during hard page fault intensive scenarios. More particularly, the present invention relates to the reduction of hard page faults by pre-fetching pages into memory prior to the occurrence of a hard page fault sequence.
In a computer system, physical memory refers to a hardware device that is capable of storing information. In common usage, physical memory refers to semiconductor storage (RAM) that is connected to the processing unit of the computer system. Many modern processing units and operating systems also support virtual memory. Virtual memory is a technique that allows smaller and/or partially simulated memory devices to be represented as a large uniform primary memory source. In operation, application program modules access memory through virtual addresses, which are then mapped by the operating system in conjunction with a memory management unit (MMU) onto physical memory addresses.
In the context of a paging memory system, a “page” is defined as a fixed-size block of bytes whose physical address can be changed via the MMU, working in conjunction with a Virtual Memory Manager. A page is either mapped onto a physical address or is not present in RAM, in which case it is stored on a disk storage in a page file. A “hard page fault” is an exception that occurs when an application program module attempts to access a virtual memory page that is marked as being not present in RAM. When a hard page fault occurs, the Virtual Memory Manager must access disk storage to retrieve the data for the requested page.
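By way of illustration only, the behavior described above can be modeled with a toy simulation. The sketch below is not an operating system interface; the class name `ToyMemory` and its members are hypothetical, and the "page file" is simply a dictionary standing in for disk storage.

```python
class ToyMemory:
    """Toy model: pages are either present in RAM or only in the page file."""

    def __init__(self, page_file):
        # Hypothetical stand-in for disk storage: page number -> page data.
        self.page_file = dict(page_file)
        self.ram = {}            # pages currently present in RAM
        self.hard_faults = 0     # number of simulated disk retrievals

    def access(self, page):
        """Return the data for a page, simulating a hard page fault if needed."""
        if page not in self.ram:
            # Page is marked not present: retrieve it from "disk".
            self.hard_faults += 1
            self.ram[page] = self.page_file[page]
        return self.ram[page]


memory = ToyMemory({0: b"code", 1: b"data"})
memory.access(0)   # not present: counted as a hard page fault
memory.access(0)   # already present: no fault
memory.access(1)   # not present: counted as a hard page fault
```

In this model only the first access to each page touches the simulated disk, which is why the cost of a hard page fault intensive scenario is dominated by disk access time.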
Application program modules are typically disk-bound. In other words, disk access and transfer times are limiting factors of the performance speed of an application program module. Disk access time refers to the time required by a disk drive to access disk storage and respond to a request for a data read or write operation. Therefore, the performance of an application program module is significantly limited during hard page fault intensive scenarios.
There are various potential solutions to the performance bottleneck caused by disk access time during hard page fault scenarios. An obvious potential solution is to reduce disk access time. The reduction of disk access time is primarily a hardware consideration and is not easily accomplished. However, other potential solutions involve the manipulation of memory storage through software program modules.
For example, one prior solution involves manipulating pages such that related blocks of memory are stored together on the same or an adjacent page. More specifically, application program module code is typically stored in pages in the order in which a compiler processed the source code, not in the order in which it will be executed. Therefore, when a page is accessed by an application program module, it is likely that only a portion of the requested code is stored thereon and one or more hard page faults will occur to retrieve additional requested code from other pages. Manipulating the pages so that related code is stored on the same or adjacent pages reduces the number of pages required to execute the code and thus reduces hard page faults. Implementing this approach requires an extra per-application effort. Also, it is not always possible to manipulate code in pages in an efficient manner.
Another prior solution involves strategically ordering pages in disk storage. According to this prior solution, the order in which pages will likely be accessed during typical usage of an application program is determined based on the assumption that disk access patterns are similar from run to run. Then, pages are stored in disk storage in the determined order. A strategic ordering of pages will result in a reduction of hard page fault times. However, this approach is somewhat limited by the fact that pages may be accessed more than once by an application program. Therefore, additional hard page faults may occur when a particular page must be re-retrieved from disk storage. Strategically ordering pages in disk storage tends to work best when it is employed to reduce hard page faults in a single hard page fault scenario, typically boot.
Another prior technique to reduce the performance bottleneck caused by disk access time during hard page fault scenarios involves decreasing the number of pages associated with an application program module. Reducing the number of pages containing code executed by an application program module necessarily reduces the number of hard page faults that may possibly occur during execution of the application program module. However, the reduction of memory associated with an application program module requires significant effort on the part of the programmer, or improvements in compiler technologies, to streamline the application program module. Also, end-users demand application program modules having extremely robust functionality and complex graphics capabilities. Thus, it is becoming increasingly difficult to streamline application program modules while meeting market demands.
Thus, there remains a need for a method and system for improving the performance of an application program module by reducing disk access time without burdening the programmer.
There further remains a need in the art for a method and system for reducing hard page faults during execution of an application program module without detracting from the robustness of the application program module.
The present invention meets the needs described above by providing a system and method for improving the performance of an application program module by reducing the occurrence of hard page faults during the operation of an application program module. The present invention may be embodied in an add-on software program module that operates in conjunction with the application program module. In this manner, no effort is required on the part of the application programmer to manipulate or modify the application program module in order to improve performance. Furthermore, the add-on software program module does not detract from the intended operation of the application program module.
In one aspect, the present invention is a method for avoiding hard page faults during the booting of an operating system of a computer system. Prior to booting the operating system, it is determined which pages will need to be retrieved from disk. When the operating system needs to be booted, the determined pages are loaded into a RAM of the computer system, whereby the determined pages will be available in the RAM and hard page faults will not occur during the booting of the operating system. The step of determining which pages will be retrieved from disk may include creating a log of hard page faults that occur during the booting of the operating system, analyzing the log to find a common hard page fault scenario for booting the operating system, and determining from the log which pages were retrieved from disk during the common hard page fault scenario. A copy of each of the determined pages may be stored in a scenario file. Alternatively, a reference for each of the determined pages may be stored in a referenced scenario file.
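The determining step may be sketched as follows. The summary does not specify how a "common" hard page fault scenario is identified from the log, so the sketch assumes one plausible reading: a page belongs to the common scenario if it was faulted in during at least a chosen number of logged boots. The function name `common_pages`, the `min_boots` parameter, and the representation of a page as a (file name, file offset) pair are all assumptions made for illustration.

```python
from collections import Counter


def common_pages(boot_logs, min_boots):
    """Return pages that hard-faulted in at least `min_boots` logged boots.

    boot_logs: one list per boot, each holding (file_name, file_offset) pairs
               (a hypothetical log format, not one defined by the source).
    """
    counts = Counter()
    for log in boot_logs:
        # Count each page at most once per boot, even if it faulted repeatedly.
        counts.update(set(log))
    return sorted(page for page, n in counts.items() if n >= min_boots)


logs = [
    [("a.sys", 0), ("b.sys", 0), ("a.sys", 0)],
    [("a.sys", 0), ("c.sys", 0)],
    [("a.sys", 0), ("b.sys", 0)],
]
pages_to_prefetch = common_pages(logs, min_boots=2)
```

The resulting list is the set of pages that would be copied into, or referenced from, the scenario file and loaded into RAM before the next boot.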
As described above, the scenario file may be a referenced scenario file including a number of page references wherein each page reference includes a reference to section information (file name and whether the file is mapped as data or as an image) and a file offset for the referenced page. Alternatively, each page reference may include a physical disk sector for the page. The section information table to which the page references refer is also stored in the scenario file.
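One possible in-memory layout for such a referenced scenario file is sketched below. The field names, the use of a list index as the section reference, and the `add_page` helper are assumptions made for illustration; the on-disk format is not specified here, and the alternative that stores a physical disk sector is not shown.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SectionInfo:
    file_name: str
    mapped_as_image: bool   # True: mapped as an image; False: mapped as data


@dataclass(frozen=True)
class PageReference:
    section_id: int         # index into the section information table
    file_offset: int        # byte offset of the page within the file


@dataclass
class ReferencedScenarioFile:
    sections: list = field(default_factory=list)   # section information table
    pages: list = field(default_factory=list)      # page references

    def add_page(self, file_name, mapped_as_image, file_offset):
        """Record a page reference, adding its section to the table if new."""
        info = SectionInfo(file_name, mapped_as_image)
        if info not in self.sections:
            self.sections.append(info)
        ref = PageReference(self.sections.index(info), file_offset)
        self.pages.append(ref)
        return ref


scenario = ReferencedScenarioFile()
scenario.add_page("app.exe", True, 0)
scenario.add_page("app.exe", True, 4096)
scenario.add_page("data.bin", False, 0)
```

Storing the section information once and referring to it by index keeps each page reference small, which matters when a scenario covers many pages of the same file.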
In yet another aspect, the invention is a method for automatically detecting a hard page fault scenario. The start-up of an application program module is detected and the hard page fault scenario begins. It is determined if a scenario file exists. If not, then the application program module is run and a scenario file is created. If a scenario file already exists, then the pages in the scenario file are fetched into RAM and the application program module is run. When the application begins to run, an end scenario timer is started and soft page faults and hard page faults are logged. Each time a page fault is logged, the end scenario timer is reset. If the time period between two page faults is such that the end scenario timer reaches a predetermined threshold, then the hard page fault scenario is ended.
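The end scenario timer logic may be illustrated with a small simulation over recorded fault times. This is a sketch only: the function name `run_scenario`, the use of timestamps in seconds, and the choice to treat a gap equal to the threshold as ending the scenario are assumptions, and a real implementation would react to live page faults rather than a list.

```python
def run_scenario(fault_times, start_time, idle_threshold):
    """Simulate the end scenario timer.

    Returns (logged_faults, end_time), where logged_faults are the page fault
    times that fall inside the scenario and end_time is the moment the timer
    reaches idle_threshold without having been reset.
    """
    logged = []
    timer_started = start_time          # timer starts when the application runs
    for t in sorted(fault_times):
        if t - timer_started >= idle_threshold:
            break                       # timer expired before this fault arrived
        logged.append(t)                # fault is logged ...
        timer_started = t               # ... and the timer is reset
    return logged, timer_started + idle_threshold


faults, ended_at = run_scenario([0.1, 0.2, 0.5, 3.0, 3.1],
                                start_time=0.0, idle_threshold=1.0)
```

With these example values, the faults at 0.1, 0.2 and 0.5 fall inside the scenario, and the faults at 3.0 and 3.1 arrive after the timer has expired and are not treated as part of it.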
A queue may generate a work item to post-process the scenario file and scenario log. During idle time, the scenario file and scenario log may be post-processed. A scenario file may then be written to the disk space.
As part of post-processing the scenario file and scenario log, it may be determined which pages are part of the scenario log and not already in the scenario file. These pages are added to the scenario file. Scenario file entries corresponding to pages that were used during the scenario are updated to indicate that the page was used by the scenario. Scenario file entries for pages that have not been used for a predetermined number of times are deleted from the scenario file. The scenario file entries may then be sorted according to the section ID and file offset of each page represented by each scenario file entry.
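The post-processing steps above can be sketched as a single pass over the scenario file entries. The sketch assumes that each entry is keyed by a (section ID, file offset) pair and carries a count of consecutive scenario runs in which the page was not used; that counter, the function name `post_process`, and the parameter `max_unused_runs` are illustrative assumptions rather than details taken from the description.

```python
def post_process(scenario, logged_pages, max_unused_runs):
    """Merge a scenario log into a scenario file.

    scenario:     dict mapping (section_id, file_offset) -> unused-run count
    logged_pages: iterable of (section_id, file_offset) pairs seen in the log
    Returns a new dict ordered by section ID and then file offset.
    """
    logged = set(logged_pages)
    updated = {}
    for page, unused in scenario.items():
        if page in logged:
            updated[page] = 0                 # page was used by the scenario
        elif unused + 1 < max_unused_runs:
            updated[page] = unused + 1        # keep, but note one more unused run
        # otherwise the entry is dropped as stale
    for page in logged:
        updated.setdefault(page, 0)           # add pages not already in the file
    # Tuples sort by section ID first and file offset second.
    return dict(sorted(updated.items()))


scenario_file = {(2, 0): 0, (1, 8192): 1, (1, 0): 0}
scenario_log = [(1, 0), (1, 4096)]
scenario_file = post_process(scenario_file, scenario_log, max_unused_runs=2)
```

Sorting by section ID and file offset groups pages of the same file in ascending order, so that a later pre-fetch can read them in fewer, more sequential disk requests.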
In another aspect, the invention is a method for detecting a hard page fault scenario. The start-up of an application program module is detected and it is determined whether a scenario file exists. If a scenario file exists, then the pages in the scenario file are pre-fetched into RAM and the application program module is run. Any soft page faults or hard page faults are logged into memory. The hard page fault scenario may end when a Win32 hourglass cursor is replaced with a regular cursor.
In still another aspect, the invention is a method for building a plurality of memory descriptor lists (MDLs) for mapping to physical memory a plurality of pages referenced in a scenario file. It is determined whether each page referenced in the scenario file is already resident in physical memory and, if so, then these pages are discarded from consideration. For all pages not resident in physical memory, it is determined whether the gap between the file offsets for each pair of consecutive pages is below a predetermined level and, if so, then the pages are put into the same MDL. If the pages are not contiguous, the gap between the pages is plugged by inserting a required number of dummy pages into the MDL.
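The MDL-building procedure can be sketched for the pages of a single file as follows. The sketch makes several assumptions that are not stated above: offsets are page-aligned, the page size is 4096 bytes, the "predetermined level" is expressed as a maximum byte gap (`max_gap`) that is allowed inclusively, and an MDL is represented as a plain list of (file offset, is_dummy) pairs rather than an operating system structure.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def build_mdls(offsets, resident, max_gap):
    """Group non-resident pages of one file into MDL-like lists.

    offsets:  page-aligned file offsets referenced by the scenario file
    resident: set of offsets already present in physical memory
    max_gap:  largest offset difference allowed within a single list
    Returns a list of lists of (file_offset, is_dummy) pairs.
    """
    # Discard pages that are already resident in physical memory.
    needed = sorted(o for o in set(offsets) if o not in resident)
    mdls = []
    for off in needed:
        if mdls and off - mdls[-1][-1][0] <= max_gap:
            current = mdls[-1]
            # Plug any hole between the previous page and this one.
            filler = current[-1][0] + PAGE_SIZE
            while filler < off:
                current.append((filler, True))    # dummy page
                filler += PAGE_SIZE
            current.append((off, False))
        else:
            mdls.append([(off, False)])           # gap too large: start a new list
    return mdls


mdls = build_mdls([0, 4096, 12288, 40960, 65536],
                  resident={40960}, max_gap=2 * PAGE_SIZE)
```

In this example the pages at offsets 0, 4096 and 12288 share one list, with a dummy page at offset 8192 filling the hole, while the page at offset 65536 starts a second list because it lies too far from the others. Padding small gaps with dummy pages lets each list be satisfied by one contiguous read at the cost of transferring a few unneeded pages.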