1. Field of the Invention
The present invention relates generally to data compression, such as implemented in memory systems of computing devices, and more particularly, to an improved memory controller and associated memory management system for effectively increasing the amount of data that can be stored in the main memory of a computer.
2. Description of the Prior Art
Traditionally, as shown in FIG. 1, a computer system 10 including a processor device (CPU) executes an Operating System (O/S) 12 natively on the computer hardware that is adapted for executing basic computer system functions in addition to controlling execution of one or more programs or processes. The computing system further includes two types of memory: a ‘real’ memory 15 that comprises the available memory that a processor, O/S and memory management hardware actually sees and addresses, and an actual physical memory 20 (e.g., IC chips plugged into the computer) that is of a fixed size. It is understood that real memory is backed by (mapped onto) part of the physical memory and, although not shown, may be partly backed by non-volatile storage media (e.g., one or more hard disk drives), allowing virtual memory size to exceed real (physical) memory size. A process executing in the computer system 10 will thus have an associated real address space 15 that is the logical view of how a process is stored in memory.
As further shown in FIG. 1, the O/S and memory management hardware includes a memory controller 30 providing an indirection mechanism comprising a translation table or like device implementing an address mapping function. For example, a translation table 25 is used to map one or more real memory blocks “Block A” and “Block B” as shown in FIG. 1, to respective physical memory blocks 21A and 21B in a one-to-one correspondence. Thus, in the embodiment depicted in FIG. 1, the two real memory blocks are mapped to two physical memory blocks.
Thus, in view of FIG. 1, for purposes of discussion herein, it is apparent that the unit of access to memory is a block. The addresses provided to the memory controller, including translation table 25, are called real addresses. A real address (or a real block number) is used to access a real memory block. The memory controller uses the indirection mechanism to store data in physical memory, and it is noted that a real memory block may get mapped to one or more physical memory blocks. Physical memory blocks are accessed by using physical memory addresses (or physical block numbers). As implied, a physical memory block can be the same size as or smaller than a real memory block.
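By way of illustration only, the indirection described above may be sketched in software as follows. This is a minimal sketch under stated assumptions, not the memory controller of FIG. 1: the class name TranslationTable, the 4-byte physical block size, and the free-list allocation policy are all hypothetical choices made for the example.

```python
# Hypothetical sketch of a translation table that maps each real block
# number to one or more physical block numbers. All names and sizes here
# are illustrative assumptions, not taken from the described system.
PHYS_BLOCK_SIZE = 4  # bytes per physical block (assumed for the example)


class TranslationTable:
    def __init__(self, num_physical_blocks):
        # Fixed-size physical memory, initially zero-filled.
        self.physical = [bytes(PHYS_BLOCK_SIZE)] * num_physical_blocks
        # Physical block numbers not currently backing any real block.
        self.free = list(range(num_physical_blocks))
        # Real block number -> list of physical block numbers.
        self.table = {}

    def write(self, real_block, data):
        # Split the real block into physical-block-sized chunks.
        chunks = [data[i:i + PHYS_BLOCK_SIZE]
                  for i in range(0, len(data), PHYS_BLOCK_SIZE)]
        old = self.table.get(real_block, [])
        if len(chunks) > len(self.free) + len(old):
            raise MemoryError("out of physical blocks")
        # Release the blocks previously backing this real block, then
        # allocate one physical block per chunk.
        self.free.extend(old)
        blocks = [self.free.pop(0) for _ in chunks]
        for pb, chunk in zip(blocks, chunks):
            self.physical[pb] = chunk
        self.table[real_block] = blocks

    def read(self, real_block):
        # Follow the indirection and reassemble the real block.
        return b"".join(self.physical[pb] for pb in self.table[real_block])


tt = TranslationTable(num_physical_blocks=4)
tt.write(0, b"AAAABBBB")  # real block 0 spans two physical blocks
tt.write(1, b"CCCC")      # real block 1 fits in one physical block
print(tt.table, tt.read(0))
```

In this sketch a real block twice the size of a physical block occupies two table entries, which corresponds to the case noted above in which a real memory block is mapped to more than one physical memory block.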
The use of physical memory thus requires implementation of the memory management hardware implementing a memory mapping unit or like translation table 25 that maps program addresses or pages to corresponding physical memory addresses or pages in physical memory. It is a function of the O/S 12 to ensure that the data and process a program is currently using are resident in real physical memory.
In prior art literature, particularly, in U.S. Patent Pub. No. 2007/0038837, there is disclosed a system for identifying at least two virtual pages that store identical data; causing each of the at least two virtual pages to correspond to one shared physical page, where said shared physical page stores the identical data; and servicing a memory request comprising an access of one of said virtual pages by accessing the shared physical page. This system enables multiple virtual addresses to map to the same physical location in memory if it has been determined that they are all intended to access the same data. Virtual addresses are identified and correspondence information (such as from a translation table) is changed in order to ensure that they all correspond to the same physical location, thus freeing up memory. The identification process may examine the most commonly used pages, and may use hash functions to create signatures for pages and compare those signatures. A count is maintained of how many virtual addresses currently correspond to the physical location and, if that count is greater than one, the location is set to be read-only.
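The general technique summarized above (content signatures, a shared physical page, and a reference count that controls the read-only state) may be sketched as follows. This is an illustrative sketch only and is not the implementation of the cited publication: the class SharedPageStore, its method names, and the choice of SHA-256 as the signature function are assumptions made for the example, and each virtual page is assumed to be mapped only once.

```python
import hashlib


class SharedPageStore:
    """Hypothetical sketch of content-based page sharing."""

    def __init__(self):
        self.pages = {}     # physical page id -> page contents
        self.refcount = {}  # physical page id -> number of virtual pages mapped
        self.by_hash = {}   # content signature -> physical page id
        self.mapping = {}   # virtual page number -> physical page id
        self.next_id = 0

    def map_page(self, vpage, data):
        # Assumption: vpage has not been mapped before.
        sig = hashlib.sha256(data).digest()
        ppage = self.by_hash.get(sig)
        # A matching signature is confirmed by a full comparison so that a
        # hash collision cannot merge pages with different contents.
        if ppage is None or self.pages[ppage] != data:
            ppage = self.next_id
            self.next_id += 1
            self.pages[ppage] = data
            self.refcount[ppage] = 0
            self.by_hash[sig] = ppage
        self.mapping[vpage] = ppage
        self.refcount[ppage] += 1

    def is_read_only(self, vpage):
        # A physical page shared by more than one virtual page is read-only.
        return self.refcount[self.mapping[vpage]] > 1


store = SharedPageStore()
store.map_page(0, b"x" * 16)
store.map_page(1, b"x" * 16)  # identical contents share one physical page
store.map_page(2, b"y" * 16)
print(len(store.pages), store.is_read_only(0), store.is_read_only(2))
```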
Further, U.S. Patent Pub. No. 2007/0050423 discloses an intelligent duplicate management system for identifying duplicate electronic files that implements a hash-sieve process that expresses many existing approaches to duplicate detection. The first hash function could be, for example, the size of the block, and the resulting buckets are, hence, the groups of same-size blocks. The next hash function could be the identity, in which case a byte-to-byte comparison is performed, and the resulting buckets are then the groups of identical blocks, hence, indicating the groups of duplicate files.
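The two-stage sieve described above (size first, then identity) may be sketched as follows. This is a minimal illustration under stated assumptions and not the system of the cited publication: the function names sieve and find_duplicates and the dictionary-of-blocks input format are hypothetical.

```python
from collections import defaultdict


def sieve(groups, key):
    """Split each group into buckets by key(block); keep buckets of 2 or more."""
    result = []
    for group in groups:
        buckets = defaultdict(list)
        for name, block in group:
            buckets[key(block)].append((name, block))
        # A bucket with a single member cannot contain a duplicate.
        result.extend(b for b in buckets.values() if len(b) > 1)
    return result


def find_duplicates(blocks):
    """blocks: dict mapping a name to bytes. Returns lists of duplicate names."""
    groups = [list(blocks.items())]
    # First stage: cheap hash function (the block size).
    groups = sieve(groups, len)
    # Final stage: the identity function, so that bucket membership is
    # decided by a byte-for-byte comparison of the contents.
    groups = sieve(groups, lambda b: b)
    return [sorted(name for name, _ in g) for g in groups]


blocks = {"a": b"hello", "b": b"hello", "c": b"world", "d": b"hi"}
print(find_duplicates(blocks))
```

The inexpensive size stage discards block "d" before any contents are compared, so the costlier identity stage only examines blocks that could still be duplicates.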
A further prior art solution, found in the reference to Kulkarni, et al., entitled “Redundancy Elimination Within Large Collections of Files”, Proceedings of the USENIX 2004 Annual Technical Conference, June 2004, discloses Redundancy Elimination at the Block Level (REBL) which leverages the benefits of compression, duplicate block suppression, and delta-encoding to eliminate a broad spectrum of redundant data in a scalable and efficient manner. REBL uses super-fingerprints, a technique that reduces the data needed to identify similar blocks while dramatically reducing the computational requirements of matching the blocks: it turns O(n^2) comparisons into hash table lookups.
A further solution, found in the reference to Carl A. Waldspurger entitled “Memory Resource Management in VMware ESX Server”, in Proceedings of the 5th Symposium on Operating Systems Design and Implementation, Boston, Mass., USA, Dec. 9-11, 2002, describes a complete software solution for identifying page copies by their contents. That is, in the “VMware” product described, pages with identical contents can be shared regardless of when, where, or how those contents were generated, by making use of a hash value that summarizes a page's contents and is used as a lookup key into a hash table containing entries for other pages that have already been marked copy-on-write (COW). VMware studies show a sharing (duplication) percentage of up to 67% among Virtual Machines.
It would be highly desirable to provide a system and method for detecting duplications of main memory content and eliminating them in order to be able to store a larger amount of data in main (physical) memory. Moreover, it would be highly desirable to provide a system and method that performs the duplication detection and elimination solely in hardware and without imposing any penalty on the overall performance of the computing system.