1. Field of the Invention
The present invention relates to computer systems implementing main memory compression and, particularly, to a virtual uncompressed cache for reducing the latency that may be incurred due to the delays associated with decompressing each line (or set of lines) in compressed main memory systems.
2. Discussion of the Prior Art
In computer systems with relatively large amounts of main memory, the memory cost can be a significant fraction of the total system cost. This has led to the consideration of computer system architectures in which the contents of main memory are maintained in a compressed form. One approach to the design of such an architecture is to compress memory using cache lines, or sets of cache lines, as the unit of compression: cache lines are maintained in an uncompressed form in the cache; compressed before storage from the cache to main memory; and decompressed on cache misses which result in reads from main memory. A number of problems must be solved in order to make such an approach effective. For example, cache lines stored in main memory without compression are of a uniform fixed size; however, using memory compression, they now occupy varying amounts of space. A number of techniques for efficiently storing and accessing variable size cache lines in main memory are now known, for example, as described in commonly-owned, U.S. Pat. No. 5,761,536 to P. Franaszek entitled System and Method for Reducing Memory Fragmentation by Assigning Remainders to Share Memory Blocks on a Best Fit Basis; and, as described in commonly-owned, U.S. Pat. No. 5,864,859, to P. Franaszek entitled System and Method of Compression and Decompression using Store Addressing.
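The observation that equal-sized cache lines occupy varying amounts of space once compressed can be illustrated with a minimal sketch. The 64-byte line size and the use of Python's `zlib` as the compressor are assumptions made purely for illustration; the text above does not specify either.

```python
import zlib

LINE_SIZE = 64  # assumed cache line size in bytes (not given in the text)


def compressed_sizes(lines):
    """Return the compressed size, in bytes, of each fixed-size line."""
    return [len(zlib.compress(line)) for line in lines]


# Three lines of identical uncompressed size but different content.
lines = [
    bytes(LINE_SIZE),             # all zeros: highly compressible
    bytes(range(LINE_SIZE)),      # 64 distinct byte values: poorly compressible
    b"abcd" * (LINE_SIZE // 4),   # short repeating pattern
]

sizes = compressed_sizes(lines)
for line, size in zip(lines, sizes):
    print(f"uncompressed {len(line)} bytes, compressed {size} bytes")
```

Because the compressed sizes differ from line to line, storage can no longer be addressed by a simple fixed stride, which is why the techniques cited above are needed.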
Another problem is that of additional memory access latency, as compared to systems that do not use compression, which may be attributed to two sources. First, since compressed lines occupy varying amounts of space, the space used to store compressed lines must be allocated dynamically, and the location of compressed lines must be found using a directory. Although such directory access may introduce additional latency, this additional latency may be reduced by maintaining a cache of directory entries in fast memory, as described in commonly-owned, co-pending U.S. patent application Ser. No. 09/256,572, entitled “Directory Cache for Indirectly Addressed Main Memory”, to C. Benveniste, P. Franaszek, J. Robinson, C. Schultz, the contents and disclosure of which are incorporated by reference as if fully set forth herein. The second source of additional memory access latency is compression and decompression. That is, due to compression, additional time may be required (as compared to conventional systems in which main memory contents are not compressed) for reading data from main memory, since in general each such read requires decompressing a cache line or set of cache lines. Similarly, storing data to main memory may require additional time due to compressing the data. If sets of cache lines are used as the unit of compression, stores to main memory may also involve decompression: in order to store one cache line, it may be necessary to first decompress the remaining cache lines (read from main memory) associated with the line being stored, and only then compress and store the set of lines as a unit. Techniques for speeding up compression and decompression by using parallelism together with shared dictionaries may be found in commonly-owned U.S. Pat. No. 5,729,228, to P. Franaszek, J. Robinson, J. Thomas entitled PARALLEL COMPRESSION AND DECOMPRESSION USING A COOPERATIVE DICTIONARY, and may alleviate this problem.
Nevertheless, there may still be a performance degradation due to increased memory access latency as compared to systems that do not use main memory compression.
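The store path described above, in which a set of cache lines is the unit of compression, can be sketched as a read-modify-write sequence. This is only an illustrative model: the line size, the number of lines per set, the function names, and the use of `zlib` are all assumptions and are not taken from the cited references.

```python
import zlib

LINE_SIZE = 64      # assumed cache line size in bytes
LINES_PER_SET = 4   # assumed number of lines compressed together


def compress_set(lines):
    """Compress a full set of cache lines as a single unit."""
    if len(lines) != LINES_PER_SET or any(len(l) != LINE_SIZE for l in lines):
        raise ValueError("expected a full set of fixed-size lines")
    return zlib.compress(b"".join(lines))


def load_line(compressed_set, index):
    """Read one line: the whole set must be decompressed first."""
    if not 0 <= index < LINES_PER_SET:
        raise IndexError("line index out of range")
    raw = zlib.decompress(compressed_set)
    return raw[index * LINE_SIZE:(index + 1) * LINE_SIZE]


def store_line(compressed_set, index, new_line):
    """Store one line: decompress the set, replace the line, recompress."""
    if not 0 <= index < LINES_PER_SET:
        raise IndexError("line index out of range")
    if len(new_line) != LINE_SIZE:
        raise ValueError("line has the wrong size")
    raw = bytearray(zlib.decompress(compressed_set))  # decompression on a store
    raw[index * LINE_SIZE:(index + 1) * LINE_SIZE] = new_line
    return zlib.compress(bytes(raw))                  # compression of the whole set
```

In this model a single store pays for one decompression and one compression of the entire set, which is the extra latency the passage attributes to using sets of lines as the unit of compression.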
One approach to solving the problem of increased memory access latency due to compression and decompression is to set aside a partition of main memory in which recently accessed cache lines (or sets of cache lines) are stored in uncompressed form. An example of such an approach, in which the partition is referred to as the “Uncompressed Cache”, is described in commonly-owned U.S. Pat. No. 5,812,817, to Hovis, Haselhorst, Kerchberger, Brown, and Luick, entitled COMPRESSION ARCHITECTURE FOR SYSTEM MEMORY APPLICATION.
The approach of partitioning main memory has, however, a potentially serious drawback: each cache line (or set of cache lines) stored in the uncompressed partition is stored twice in main memory: once in compressed form, and a second time in uncompressed form in the partition. Thus, this approach can reduce the overall effectiveness of memory compression due to storing a number of lines twice.
In most current computer operating systems that use virtual memory, the unit of main memory storage allocation, as well as the basis for translating virtual addresses to real addresses, is a page: virtual addresses are translated to real addresses using page tables (also stored in main memory). Using a page as the unit of compression, another approach in which main memory is partitioned into an uncompressed region and a compressed region, but without duplication of pages (i.e., each page is either stored compressed or uncompressed but not both), is described in the reference “Performance Evaluation of Computer Architectures with Main Memory Data Compression”, by M. Kjelso, M. Gooch, S. Jones, Journal of Systems Architecture 45 (1999), pages 571-590.
Such an approach has the disadvantage that typical page sizes are relatively large (e.g. 4096 bytes) with respect to compression and decompression times; thus, a design of this type introduces excessively large memory access latencies when there is a cache miss that maps to a page stored in the compressed partition.
It would thus be highly desirable to avoid this problem by providing a computer system implementing compressed main memory that employs a smaller unit of compression, i.e., one in which each page is divided into a number of segments, each of which can be compressed and decompressed independently, in order to reduce the excessive memory access latency (as compared to conventional non-compressed memories) that may be incurred due to the delays associated with decompressing a page of main memory whenever it is read from memory, and with compressing a page whenever it is stored to main memory.
It would also be desirable, for reduced processing time and for more efficient use of memory, not to partition memory, that is, to manage the storage used for all segments, both compressed and uncompressed, in a uniform way.
It is an object of the present invention to provide a computer system architecture implementing compressed main memory that employs a smaller unit of compression, e.g., a page is divided into a number of segments, each of which may be compressed and decompressed independently, in order to reduce excessive memory access latency. For example, for a typical page size of 4096 bytes, each page may be divided into four (4) segments of 1024 bytes each, effectively reducing memory latency associated with decompression in the case of a cache miss that maps to compressed main memory data by an approximate factor of four (4) as compared to using entire pages as the unit of compression.
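The arithmetic in the example above can be sketched as follows. The page and segment sizes come from the text; the function name `locate` and the assumption that decompression time scales linearly with the number of uncompressed bytes are illustrative only.

```python
PAGE_SIZE = 4096                                # bytes, from the example above
SEGMENTS_PER_PAGE = 4                           # from the example above
SEGMENT_SIZE = PAGE_SIZE // SEGMENTS_PER_PAGE   # 1024 bytes


def locate(real_address):
    """Split a real address into (page number, segment within page, offset within segment)."""
    page = real_address // PAGE_SIZE
    segment = (real_address % PAGE_SIZE) // SEGMENT_SIZE
    offset = real_address % SEGMENT_SIZE
    return page, segment, offset


# Assumption: decompression latency is proportional to the number of
# uncompressed bytes produced, so the latency ratio equals the size ratio.
latency_ratio = PAGE_SIZE // SEGMENT_SIZE

print(locate(3 * PAGE_SIZE + 2 * SEGMENT_SIZE + 5))  # (3, 2, 5)
print(latency_ratio)                                 # 4
```

Under that linear-time assumption, a cache miss that maps to compressed data decompresses 1024 bytes rather than 4096, which gives the approximate factor of four stated above.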
It is a further object of the invention to provide a computer system architecture implementing compressed main memory for storing both compressed and uncompressed data segments, each of which may be processed independently and in a uniform manner in order to reduce excessive memory access latency.
It is another object of the invention to provide a computer system architecture as in the previously described objects of the invention that includes a virtual uncompressed cache for storing uncompressed data segments and a virtual uncompressed cache management system for tracking previously accessed uncompressed data segment items from main memory. That is, rather than a memory partitioning scheme where data segments are stored in compressed or uncompressed regions, all data segments are handled uniformly, with the blocks used to store the set of uncompressed segments comprising the virtual uncompressed cache randomly located in memory.
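One way to picture the tracking performed by such a management system is a bounded list of the identifiers of segments currently held uncompressed. The following is a minimal sketch under stated assumptions: the class name, the capacity parameter, and the first-in first-out replacement policy are all hypothetical, since the passage above does not name a data structure or a replacement policy.

```python
from collections import deque


class VirtualUncompressedCache:
    """Tracks which segments are currently stored uncompressed.

    Only segment identifiers are tracked here; the segments themselves may
    live anywhere in memory, which is what makes the cache "virtual".
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity   # assumed limit on uncompressed segments
        self.fifo = deque()        # assumed FIFO ordering of segment ids

    def note_uncompressed(self, seg_id):
        """Record that seg_id is now uncompressed.

        Returns the id of a segment that should be compressed to stay within
        capacity, or None if no segment needs to be compressed.
        """
        if seg_id in self.fifo:
            return None            # already tracked; nothing to do
        self.fifo.append(seg_id)
        if len(self.fifo) > self.capacity:
            return self.fifo.popleft()
        return None
```

For example, with a capacity of two, noting segments `a`, `b`, and then `c` returns `a` on the third call, signalling that `a` is the candidate to be compressed again.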
According to the preferred embodiment of the invention, there is provided a system for reducing data access time in a computer system with main memory compression, in which the unit of compression is a memory segment, the system comprising a common memory area for storing uncompressed data and compressed data segments; directory means stored in main memory having entries for locating both uncompressed data segments and compressed data segments for cache miss operations, each directory entry further indicating status of the data segment; control means for accessing said directory entries and checking status indication of a data segment to be accessed for the cache miss event, and enabling processing of the data segment from the common memory area according to the status indication, whereby latency of data retrieval is reduced by provision of uncompressed data segment storage in the common memory area.
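The cooperation between the common memory area, the directory entries with their status indication, and the control logic on a cache miss can be modeled in software as follows. This is a sketch under stated assumptions, not the claimed implementation: the class and method names are hypothetical, `zlib` stands in for the compressor, and reclamation of freed storage is omitted.

```python
import zlib
from dataclasses import dataclass


@dataclass
class DirectoryEntry:
    location: int      # where the segment's block sits in the common area
    compressed: bool   # status indication for the segment


class CompressedMemory:
    """Toy model of a common memory area with a directory (names are hypothetical)."""

    def __init__(self):
        self.area = []        # common area: compressed and uncompressed blocks together
        self.directory = {}   # segment id -> DirectoryEntry

    def store(self, seg_id, data, compress):
        """Store a segment either compressed or uncompressed in the same area."""
        payload = zlib.compress(data) if compress else data
        self.area.append(payload)  # storage reclamation is omitted in this sketch
        self.directory[seg_id] = DirectoryEntry(len(self.area) - 1, compress)

    def read_on_miss(self, seg_id):
        """Serve a cache miss; returns (data, whether decompression was needed)."""
        entry = self.directory[seg_id]         # directory access
        payload = self.area[entry.location]
        if entry.compressed:                   # check the status indication
            return zlib.decompress(payload), True
        return payload, False                  # uncompressed: no decompression delay
```

A miss on a segment whose entry is marked uncompressed returns the stored bytes directly, which is the source of the latency reduction described above; a miss on a compressed segment pays for decompression of that one segment only.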