Field of the Invention
Embodiments of the present invention relate generally to computer processing and, more specifically, to techniques for organizing memory to optimize memory accesses of compressed data.
Description of the Related Art
Some processing systems implement one or more data compression techniques to increase the effective memory bandwidth to attached memory devices, thereby improving overall performance. In such implementations, the processing system is configured to store certain blocks of data within the attached memory in one or more compressed formats that reduce the number of bytes used to represent each block of original data. Consequently, at any given time, the attached memory may include any number of compressed blocks of data and any number of non-compressed blocks of data.
In processing systems configured to store compressed data, the processing system typically allocates the number of bytes required to store a non-compressed block of data each time a request to write a block of data to attached memory is processed. If the processing system ends up writing a given block of data in a compressed format, then the processing system simply stores the compressed data in a portion of the allocated memory known as a “compression atom.” Notably, each compression atom includes the number of bytes required to store the compressed version of the data block based on the compression format implemented by the processing system.
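The allocation behavior described above can be illustrated with a minimal sketch. The block size of 256 bytes, the compression atom size of 32 bytes, and the names `Allocation` and `allocate_for_write` are assumptions chosen for illustration only and are not taken from any particular processing system.

```python
from dataclasses import dataclass

# Hypothetical sizes, chosen only for illustration.
BLOCK_BYTES = 256             # bytes in one non-compressed block of data
COMPRESSION_ATOM_BYTES = 32   # bytes in one compression atom


@dataclass
class Allocation:
    base: int   # starting address of the allocated region
    size: int   # bytes reserved for the block
    used: int   # bytes that actually hold meaningful data


def allocate_for_write(base: int, compressed: bool) -> Allocation:
    """Reserve space for one block of data.

    The full non-compressed size is always reserved. A compressed block
    occupies only the compression-atom-sized portion of that region.
    """
    used = COMPRESSION_ATOM_BYTES if compressed else BLOCK_BYTES
    return Allocation(base=base, size=BLOCK_BYTES, used=used)


compressed_block = allocate_for_write(base=0, compressed=True)
plain_block = allocate_for_write(base=BLOCK_BYTES, compressed=False)
```

In this sketch, both allocations reserve the same number of bytes, and only the number of bytes that hold meaningful data differs between the compressed and non-compressed cases.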
In many processing systems, the number of bytes in the compression atom is configured to match the number of bytes that the processing system transmits to or from the attached memory as part of performing, respectively, a write or read operation. As referred to herein, a “memory atom” associated with the attached memory is the unit of data that the processing system transmits to or from the attached memory as part of performing such an operation. Consequently, each memory atom includes the number of bytes that the processing system transmits to or from the attached memory. Such a set-up allows the processing system to fully utilize the memory bandwidth between the processing system and the attached memory when performing compressed data accesses. For example, a processing system could support a 32 byte compression atom and a 32 byte dynamic random-access memory (DRAM) atom. To read a compressed block of data, the processing system would retrieve a 32 byte DRAM atom that includes 32 bytes of compressed data from the DRAM. Accordingly, in such a scenario, the memory bandwidth between the processing system and the DRAM is fully utilized, and the overall performance of the processing system is optimized.
By contrast, in some processing systems, the size of the compression atom may not match the size of the memory atom associated with an attached memory, because the two sizes may each be individually optimized based on different technologies that evolved over different time frames. When the two sizes differ, the processing system typically cannot fully utilize the available memory bandwidth when performing compressed data accesses, which decreases the overall performance of the processing system. For example, a processing system could support a 32 byte compression atom and a 64 byte DRAM atom. To read a compressed block of data from the attached memory, the processing system would retrieve a 64 byte DRAM atom that includes the 32 bytes of compressed data as well as 32 bytes of meaningless data. Accordingly, half of the memory bandwidth between the processing system and the attached memory is wasted.
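The arithmetic behind the matched and mismatched examples can be sketched as follows. The function name `transfer_utilization` is hypothetical, and the sketch assumes that a read always transfers a whole number of memory atoms.

```python
import math


def transfer_utilization(compression_atom_bytes: int, memory_atom_bytes: int) -> float:
    """Return the fraction of transferred bytes that are compressed data.

    Assumes one compressed block is read and that the memory is accessed
    only in whole memory atoms.
    """
    atoms_needed = math.ceil(compression_atom_bytes / memory_atom_bytes)
    bytes_transferred = atoms_needed * memory_atom_bytes
    return compression_atom_bytes / bytes_transferred


matched = transfer_utilization(32, 32)      # 32 byte compression atom, 32 byte DRAM atom
mismatched = transfer_utilization(32, 64)   # 32 byte compression atom, 64 byte DRAM atom
print(matched, mismatched)  # prints "1.0 0.5"
```

Under these assumptions, the matched configuration uses every transferred byte, whereas the mismatched configuration uses only half of the bytes in each DRAM atom.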
As the foregoing illustrates, what is needed in the art is a more effective approach to managing accesses to compressed data in memory.