1. Field of the Invention
The present invention relates generally to increasing the processing speed of a microprocessor, and more particularly to a method and apparatus for increasing memory bandwidth by compression.
2. Description of Related Art
Improvements in microprocessor design have resulted in microprocessors having high clock speeds and the ability to execute multiple instructions per clock cycle. As the processing speed of a microprocessor increases, program instructions and data must also be supplied at a higher rate to make full use of the microprocessor's execution resources. If the rate at which program instructions are supplied (i.e., the instruction bandwidth) or the rate at which data is supplied (i.e., the data bandwidth) is less than the processing speed (i.e., the processing bandwidth) of the microprocessor, the microprocessor must wait for the information. Such idle time degrades overall system performance because the resources of the microprocessor are not optimally utilized.
When a microprocessor requests data, it first checks its internal cache memory. Program instructions are typically stored in an instruction cache (Icache) and data is typically stored in a data cache (Dcache). Hereinafter, the term data is intended to include both program instructions stored in the Icache and data stored in the Dcache. If the requested data is present in the cache, the microprocessor can retrieve it quickly (on the order of a few clock cycles). If the data is not present in the cache, the main memory is checked. If the data is present in main memory, it is retrieved, but a penalty is paid for missing the cache (retrieval from the main memory takes on the order of hundreds of clock cycles). If the data is not present in main memory, it must be loaded from the hard disk or other storage device, and an even greater penalty is paid (on the order of thousands of clock cycles).
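The lookup order described above can be sketched as a short Python function. This is only an illustrative model: the names `LEVELS` and `lookup` are hypothetical, and the cycle counts are assumed values chosen to match the orders of magnitude given in the text, not measurements.

```python
# Illustrative latencies in clock cycles. The orders of magnitude follow the
# text above; the exact values are assumptions.
LEVELS = [
    ("cache", 3),
    ("main memory", 100),
    ("hard disk", 1000),
]


def lookup(address, contents):
    """Return (level, cycles) for the first level that holds `address`.

    `contents` maps a level name to the set of addresses resident there.
    The last level (hard disk) is treated as the backing store, so it is
    assumed to hold every address.
    """
    for name, cycles in LEVELS[:-1]:
        if address in contents.get(name, set()):
            return name, cycles
    return LEVELS[-1]


if __name__ == "__main__":
    resident = {"cache": {0x10}, "main memory": {0x10, 0x20}}
    for addr in (0x10, 0x20, 0x30):
        print(hex(addr), lookup(addr, resident))
```

Each miss pushes the request one level further from the microprocessor, and in this model the cost grows by roughly an order of magnitude at each step.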
Data compression algorithms have been used to try to reduce the penalties associated with accessing data stored on the hard disk and to reduce the space required to store files on electronic media, including compact disc read only memories (CD-ROMs), floppy disks, and hard disks.
A known compression program provides an interface between the main memory and the hard disk in an attempt to reduce the penalties associated with accessing the hard disk for data. Pages of data are read from the hard disk, and compression is attempted. A portion of the main memory is reserved for compressed data. This compression increases the apparent size of the main memory. If half of the main memory is reserved for compressed data and an average compression ratio of 3:1 is achieved, the apparent size of the memory is doubled, because the compressed half holds three times its physical size in data while the uncompressed half holds its own size. Pages of data are moved from the compressed portion to the uncompressed portion of the main memory when requested by the microprocessor. The decompression adds overhead because the data is not directly available in the main memory, but the overhead is less than the time that would be required to access the hard disk if the data had not been stored in the main memory at all. This compression system reduces the penalty associated with missing the main memory and having to access the hard disk, but does not address the penalty associated with missing the cache and having to access the main memory for the data.
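The apparent-size arithmetic in the preceding paragraph can be checked with a brief Python sketch. The function name `apparent_memory_size` and the 64-unit memory size are hypothetical and chosen only for illustration. The sketch assumes the average compression ratio applies uniformly to the whole compressed region.

```python
def apparent_memory_size(physical_size, compressed_fraction, compression_ratio):
    """Estimate how much data a partly compressed main memory can hold.

    physical_size       -- physical capacity of main memory (any unit)
    compressed_fraction -- fraction of main memory reserved for compressed data
    compression_ratio   -- average ratio of uncompressed to compressed size
    """
    # The uncompressed region holds exactly its own size in data.
    uncompressed_part = physical_size * (1 - compressed_fraction)
    # The compressed region holds its size multiplied by the compression ratio.
    compressed_part = physical_size * compressed_fraction * compression_ratio
    return uncompressed_part + compressed_part


if __name__ == "__main__":
    # Half of a 64-unit memory reserved for compressed data at 3:1.
    print(apparent_memory_size(64, 0.5, 3.0))  # 128.0, twice the physical size
```

With half of the memory compressed at 3:1, the result is 0.5 + 0.5 × 3 = 2 times the physical size, which matches the doubling stated in the text.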
The first order equation for central processing unit (CPU) performance is:

$$\text{Total CPU time} = \text{Execution Latency}_{\text{perfect caches}} + \text{Memory Access Latency}$$
Execution Latency is the minimum time required to execute a task if the memory subsystem can be made perfect (i.e., instructions and data are ready for the processor in the cache when execution resources are available). Memory Access Latency is the additional time required to access main memory when the data is not present in the cache. Because memory bandwidth and throughput have not improved at the same rate as microprocessor instruction execution rates, the Memory Access Latency term has become a larger percentage contributor to the first order CPU performance equation.
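The growing weight of the Memory Access Latency term can be illustrated with a small Python sketch of the first order equation. The function names and the latency figures are hypothetical, in arbitrary time units, and are not taken from any measurement.

```python
def total_cpu_time(execution_latency, memory_access_latency):
    """First order CPU time: execution with perfect caches plus memory stalls."""
    return execution_latency + memory_access_latency


def memory_share(execution_latency, memory_access_latency):
    """Fraction of total CPU time contributed by Memory Access Latency."""
    return memory_access_latency / total_cpu_time(
        execution_latency, memory_access_latency
    )


if __name__ == "__main__":
    # Assumed figures: execution latency halves while memory access latency
    # stays the same, so the memory term's share of the total grows.
    print(memory_share(100.0, 50.0))
    print(memory_share(50.0, 50.0))
```

In this assumed scenario, halving the Execution Latency while leaving the Memory Access Latency unchanged raises the memory term's share of total CPU time from one third to one half.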
It would be desirable to reduce the contribution of the Memory Access Latency term to the overall CPU performance equation, thus reducing the penalty paid for missing the cache.