1. Technical Field
The present invention relates generally to an improved data processing system. More specifically, the present invention is directed to a method, apparatus, and computer program product for the page-out and page-in of stale objects stored in memory in a data processing system.
2. Description of Related Art
Memory is a data processing system's workspace. Physically, memory is a collection of Random Access Memory (RAM) computer chips. Memory is an important resource for the data processing system, since memory determines the size and number of programs that can be run on the data processing system at the same time, as well as the amount of data that can be processed.
All program execution and data processing takes place in the data processing system's memory. A program's instructions are copied into memory from a disk, a tape, or from a network. Then, the program instructions are extracted from memory by a processing unit for analysis and execution. Memory is such an important resource to the data processing system that it cannot be wasted. Memory must be allocated by the operating system (OS), as well as by the programs, and then released when not needed.
The OS is the master control program that runs the data processing system and is the first program loaded when the data processing system is turned on. Common operating systems include IBM's mainframe OS/390 and the AS/400's OS/400, the many versions of Windows (95, 98, NT, ME, 2000, and XP), versions of Unix and Unix-like systems (such as Solaris and Linux), and the Macintosh OS. The OS, which resides in memory at all times, starts and communicates with all programs that run in the data processing system.
Also, the OS controls program address space allocation and extends the physical RAM by implementing a virtual memory on a physical hard disk. The virtual memory created by the OS is known as an OS page file. The virtual memory page file temporarily stores objects of a program on the hard disk when there is not enough physical memory to hold all the programs. Paging is the movement of an object between physical and virtual memory in the data processing system to optimize performance without the user being aware that the transfer has taken place.
Program address space is the portion of memory used by a program when running. The program address space may refer to physical memory, virtual memory, or a combination of both. A program running on the OS comprises a set of instructions. The program's set of instructions needs to allocate memory from within the program address space the OS has assigned to the program. If the program continues to allocate memory for data buffers and eventually exceeds the physical memory capacity, the OS then has to place parts of the program in virtual memory on the hard disk in order to continue, which slows down processing.
Objects are the basic software building blocks of object-oriented programming. An object is a collection of variables, data structures, and procedures stored as a self-contained entity. Java is an example of an object-oriented programming language designed to generate programs that can run on all hardware platforms, small, medium, and large, without modification. Developed by Sun Microsystems, Inc., Java has been promoted and geared heavily for the Web, both for public Web sites and intranets. Java is not compiled into machine language for a specific hardware platform; instead, it is compiled into an intermediate language called “bytecode.” The bytecode program may be run on any hardware that has a Java Virtual Machine (JVM) runtime program available for it.
The JVM is a Java interpreter. The JVM is software that converts the Java intermediate language (bytecode) into machine language and executes it. This means Java programs are not dependent on any specific hardware and will run on any computer with the JVM software. Java was designed to run in small amounts of memory and provides enhanced features for the programmer, including the ability to release memory automatically when it is no longer required. This automatic “garbage collection” feature has been lacking in many earlier programming languages, and its absence has been the bane of programmers for years. Garbage collection is a software routine that searches memory for areas of inactive data and instructions in order to reclaim that space for the general memory pool (the heap).
The C programming language, for example, does not perform automatic garbage collection, so the programmer must specifically deallocate memory in order to release it. Deallocating memory after a routine no longer needs it is a tedious task, and programmers often forget to do it or do not do it properly. Java performs automatic garbage collection without programmer intervention, which eliminates this coding problem.
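The contrast with manual deallocation can be illustrated with a minimal sketch. The class name GcAllocationDemo, the method processBatches, and the 1 MiB buffer size are hypothetical choices made only for this illustration. The loop allocates one buffer per iteration and never frees it explicitly; each buffer simply becomes unreachable when the iteration ends, after which the garbage collector may reclaim it.

```java
public class GcAllocationDemo {
    // Size of each temporary buffer; 1 MiB is an arbitrary illustrative value.
    static final int BUFFER_BYTES = 1024 * 1024;

    /**
     * Allocates 'batches' buffers one after another and returns the total
     * number of bytes allocated. No buffer is ever explicitly deallocated.
     */
    static long processBatches(int batches) {
        long totalAllocated = 0;
        for (int i = 0; i < batches; i++) {
            byte[] buffer = new byte[BUFFER_BYTES];
            buffer[0] = (byte) i;              // touch the buffer so it is used
            totalAllocated += buffer.length;
            // 'buffer' goes out of scope at the end of this iteration. The
            // array is then unreachable and eligible for garbage collection;
            // there is no counterpart to C's free() to call or to forget.
        }
        return totalAllocated;
    }

    public static void main(String[] args) {
        System.out.println(processBatches(512) + " bytes allocated in total");
    }
}
```

In a C program, an equivalent loop without a matching deallocation in each iteration would leak every buffer. Here the total allocated can exceed the heap size without error, because unreachable buffers are reclaimed along the way.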
A Java heap refers to a common pool of physical memory that is available to the Java program. The management of the heap is done by the Java program in allocating objects as required and by the garbage collector in deallocating objects which are no longer needed. Currently, Java operates in such a manner that some types of data loaded during program execution will remain in physical memory for the life of the JVM.
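As an illustration, a running Java program can query the size of its heap through the standard java.lang.Runtime class. The sketch below is only an example; the class name HeapStats and its helper methods are hypothetical, and the figures reported are the JVM's own view of the heap, which says nothing about whether the OS currently keeps those pages in physical memory.

```java
public class HeapStats {
    /** Bytes of heap currently occupied by objects (live or not yet collected). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper limit, in bytes, that the JVM will attempt to use for the heap. */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("heap currently reserved: " + rt.totalMemory() + " bytes");
        System.out.println("heap currently free:     " + rt.freeMemory() + " bytes");
        System.out.println("heap currently used:     " + usedBytes() + " bytes");
        System.out.println("heap maximum:            " + maxBytes() + " bytes");
    }
}
```

The values vary from run to run and between JVM configurations, so no particular output should be expected.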
Static data and cache data, which are seldom referenced by program instructions, are examples of stale data that remains in physical memory for the life of the JVM. Static data and cache data will remain in the Java heap and will be continually referenced by the garbage collector. This continual referencing of the static and cache data ensures that they will remain in physical memory, because the OS will not page-out data that is continually referenced. Also, the static and cache data will increase the CPU overhead (amount of processing time used) during garbage collection cycles, because the entire physical memory heap has to be traversed to determine if the data can be garbage collected. The increased overhead decreases processor performance and productivity in executing program instructions.
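The retention of stale data can be sketched as follows. The class StaleCacheDemo, its static CACHE field, and the "report" key are hypothetical names used only for illustration. The entry is written once and never read again by the program, yet it stays in the heap because the static map still refers to it; the weak reference serves only as a probe to observe that the collector has not reclaimed it.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

public class StaleCacheDemo {
    // Hypothetical static cache: lives as long as the class is loaded,
    // which in a simple program means for the life of the JVM.
    static final Map<String, byte[]> CACHE = new HashMap<>();

    /** Returns true if a cached entry is still in the heap after a GC request. */
    static boolean cachedEntrySurvivesGc() {
        CACHE.put("report", new byte[64 * 1024]); // loaded once, seldom used again

        // A weak reference does not keep its target alive; it is cleared
        // only if the target stops being strongly reachable.
        WeakReference<byte[]> probe = new WeakReference<>(CACHE.get("report"));

        System.gc(); // a request only; the JVM decides whether to collect

        // The static map still holds a strong reference, so the entry
        // cannot have been reclaimed, however stale it is.
        return probe.get() != null;
    }

    public static void main(String[] args) {
        System.out.println("cached entry still in heap: " + cachedEntrySurvivesGc());
    }
}
```

Because the entry remains strongly reachable through the static field, the collector must visit it on every cycle and can never free it, which is the situation the paragraph above describes.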
Another problem with retaining the static and cache data in the Java heap is that the amount of free physical memory space available for program instruction execution is decreased. Decreased heap space in physical memory requires that the garbage collector run more frequently to free unused memory. As a result, more CPU cycles are “burned” in memory management rather than program instruction execution.
Therefore, it would be advantageous to have an improved method and system for the page-out and page-in of seldom referenced objects (stale objects) stored in the memory of a data processing system.