1. Technical Field
The present invention relates generally to a method, system, and computer program product for memory management. More particularly, the present invention relates to a method, system, and computer program product for improving memory utilization of sparse pages.
2. Description of the Related Art
Data processing systems include memory devices for storing, processing, and moving data. A memory device, or physical memory, is generally a physical component of a data processing system configured to store data. Overall memory in a data processing system may also include logical components, such as a space on a hard disk designated to be used as a part of the system's memory.
A data processing system includes a set amount of space in the physical memory. An operating system allows applications, processes, and threads (collectively, a process) to access a portion of that physical memory for performing their functions.
Physical memory is addressed using physical addresses that point at locations in the physical memory. The physical addresses belong to a physical address space configured in the data processing system. A virtual address is an address that has to be mapped to a physical address to access the data stored in the location corresponding to the physical address.
A process executing in the data processing system does not reference the physical memory using physical addresses. The process can only use virtual addresses from a virtual address space that is specified and configured for use by the process. Other processes similarly use virtual addresses from other virtual address spaces to access physical memory.
The virtual address to physical address mapping allows an operating system, or a memory management subsystem thereof, to offer more memory in virtual form to the processes that execute in the data processing system than is physically available in the data processing system. Furthermore, the virtual address to physical address mapping allows an operating system, or a memory management subsystem thereof, to share some memory space amongst processes where the processes share common data, and to keep each process's individual data separate from that of other processes.
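As an illustration of the sharing and separation described above, the following minimal sketch models each process's mapping as a dictionary from virtual page number to physical frame number. The names `process_a`, `process_b`, `frames_shared`, and `frames_private`, and all page and frame numbers, are hypothetical and chosen only for this example; an actual memory management subsystem would use hardware-defined page table structures.

```python
# Hypothetical per-process mappings: virtual page number -> physical frame number.
# The numbers are illustrative assumptions, not taken from any real system.
process_a = {0: 7, 1: 3}  # virtual page 1 maps to frame 3 (common data)
process_b = {0: 9, 1: 3}  # virtual page 1 also maps to frame 3

def frames_shared(map_a, map_b):
    """Return the physical frames that both mappings refer to."""
    return set(map_a.values()) & set(map_b.values())

def frames_private(map_a, map_b):
    """Return the physical frames referred to by map_a only."""
    return set(map_a.values()) - set(map_b.values())

shared = frames_shared(process_a, process_b)       # frames holding common data
private_a = frames_private(process_a, process_b)   # frames only process A can reach
private_b = frames_private(process_b, process_a)   # frames only process B can reach
```

In this sketch both processes use the same virtual page number 0, yet they reach different physical frames, which is how the individual data of each process stays separate.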
A page-size is the size of a unit of data that is read into or written to memory together. When a process changes even a bit in a page, the entire page is deemed to have changed. When a process requests even a byte of data within a page, the entire page has to be read from memory. If the page of the requested data is not available in memory, the memory management subsystem brings the entire page into memory from a secondary data storage unit, such as a hard disk drive, via a mechanism called a page fault.
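The page-granular behavior described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not an operating system implementation: the class `DemandPager`, its attributes, and the dictionary used as a stand-in for the secondary data storage unit are all hypothetical.

```python
PAGE_SIZE = 4 * 1024  # 4 KB page-size, as discussed in the text

class DemandPager:
    """Toy model: whole pages are loaded on first access and marked changed on any write."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # page number -> bytes (stand-in for a disk)
        self.resident = {}                  # pages currently in memory
        self.dirty = set()                  # pages deemed changed
        self.page_faults = 0

    def _load(self, page_no):
        # A page that is not resident triggers a page fault,
        # and the entire page is brought into memory.
        if page_no not in self.resident:
            self.page_faults += 1
            self.resident[page_no] = bytearray(self.backing_store[page_no])
        return self.resident[page_no]

    def read_byte(self, address):
        page_no, offset = divmod(address, PAGE_SIZE)
        return self._load(page_no)[offset]

    def write_byte(self, address, value):
        page_no, offset = divmod(address, PAGE_SIZE)
        self._load(page_no)[offset] = value
        self.dirty.add(page_no)  # changing one byte marks the whole page as changed

store = {0: bytes(PAGE_SIZE), 1: bytes([0xAB]) * PAGE_SIZE}
pager = DemandPager(store)
first = pager.read_byte(PAGE_SIZE + 5)   # page 1 is not resident: one page fault
second = pager.read_byte(PAGE_SIZE + 6)  # page 1 is already resident: no fault
pager.write_byte(10, 1)                  # page 0 faults in, then is marked changed
```

Reading a single byte makes all 4 KB of its page resident, and writing a single byte marks the whole page as changed.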
A commonly used page-size is 4 kilobytes (KB), which was established in the early days of computers, when physical memory available in computers was of the order of KB or megabytes (MB), significantly smaller than physical memories being configured in presently available computing systems. For example, presently, data processing systems having gigabytes (GB) of physical memory are commonplace, and systems with terabytes (TB) of physical memory are not uncommon. Modern operating systems allow addressing using addresses that are 64 bits long, allowing for pages that can be larger than 4 GB.
Pages or page frames of up to 4 KB are called small frames. Pages of size larger than 4 KB are called large frames. For example, some presently available data processing systems allow frames of 16 MB, which are 4,096 times the size of the 4 KB small frames.
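The size relationship between the two frame sizes can be checked with a few lines of arithmetic. The 1 GB region used below is an assumption introduced only to make the comparison concrete; the variable names are likewise illustrative.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

small_frame = 4 * KB   # small frame size from the text
large_frame = 16 * MB  # large frame size from the text

# How many small frames fit in one large frame.
ratio = large_frame // small_frame

# Frames needed to cover an assumed 1 GB region with each frame size.
small_frames_per_gb = GB // small_frame
large_frames_per_gb = GB // large_frame
```

Here `ratio` evaluates to 4096, and the assumed 1 GB region needs 262144 small frames but only 64 large frames.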
A process requests a page from a heap when the process needs memory to read or write data. A page in the heap is a virtual page. The size of virtual pages is determined by a configuration in the kernel, such as by a frame size parameter in the kernel. The virtual page maps to a physical page in physical memory via a page table. A process reads or writes data in the virtual page. The data is actually read or written in a physical page via the virtual page-physical page mapping in the page table.
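The virtual page to physical page mapping described above can be sketched as a lookup that splits a virtual address into a page number and an offset within the page. This is a minimal illustration assuming a single-level table held in a dictionary; the names `page_table` and `translate`, and the frame numbers, are hypothetical.

```python
PAGE_SIZE = 4 * 1024  # assumed frame size parameter of 4 KB

# Hypothetical page table: virtual page number -> physical page number.
page_table = {0: 5, 1: 2}

def translate(virtual_address, table, page_size=PAGE_SIZE):
    """Map a virtual address to a physical address through the page table."""
    virtual_page, offset = divmod(virtual_address, page_size)
    if virtual_page not in table:
        # No mapping exists for this virtual page in the table.
        raise KeyError(f"virtual page {virtual_page} is not mapped")
    # The offset within the page is the same on both sides of the mapping.
    return table[virtual_page] * page_size + offset

# Byte 100 of virtual page 1 lands at byte 100 of physical page 2.
physical = translate(1 * PAGE_SIZE + 100, page_table)
```

With these assumed values, `physical` is 2 * 4096 + 100 = 8292. The process only ever works with the virtual address, while the data is read or written at the translated physical location.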