A processing speed of a processor (e.g., a central processing unit (CPU)) or a hardware engine (HWE) is generally greater than a data supply speed of a main memory, such as a dynamic random-access memory (DRAM). A cache memory may be used to make up a difference in performance.
A cache memory temporarily holds a copy of a piece of data stored in a main memory in a static random-access memory (SRAM) or the like, which is higher in speed than the main memory.
In the absence of a cache memory, a processor acquires a piece of data of per-access data size (e.g., 4 bytes) from a main memory. (The “per-access data size” is also referred to below as a “data size for access.”) In the presence of a cache memory, if a data array of the cache memory does not hold the demanded piece of data, the cache memory acquires data from the main memory in units of cache line size (e.g., 256 bytes), which is larger than the data size for access.
If a demanded piece of data is present in a cache memory, the cache memory can return the piece of data from the cache memory to a processor without acquiring the piece of data from a main memory. For this reason, a processor or a hardware engine can access the data at high speed.
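The hit and miss behavior described above can be illustrated with a minimal sketch. The class name SimpleCache, its fields, and the absence of any eviction policy are assumptions made purely for illustration; the sizes are the example values given in the text, and each access is assumed not to cross a cache line boundary.

```python
LINE_SIZE = 256   # cache line size in bytes (example value from the text)
ACCESS_SIZE = 4   # data size for access in bytes (example value from the text)


class SimpleCache:
    """Hypothetical toy model: unlimited capacity, no eviction, read-only."""

    def __init__(self, main_memory: bytes):
        self.main_memory = main_memory
        self.lines = {}        # line index -> contents of one cache line
        self.line_fills = 0    # number of line-sized fetches from main memory

    def read(self, addr: int, size: int = ACCESS_SIZE) -> bytes:
        line_index, offset = divmod(addr, LINE_SIZE)
        if line_index not in self.lines:
            # Miss: acquire the whole cache line from the main memory.
            base = line_index * LINE_SIZE
            self.lines[line_index] = self.main_memory[base:base + LINE_SIZE]
            self.line_fills += 1
        # Hit (or just-filled line): return only the demanded piece of data.
        return self.lines[line_index][offset:offset + size]


memory = bytes(range(256)) * 4   # 1024 bytes of example main memory
cache = SimpleCache(memory)
first = cache.read(8)    # miss: the whole 256-byte line is fetched
second = cache.read(12)  # hit: same line, no access to the main memory
```

In this sketch the second read is served from the cache line that the first read brought in, so only one line-sized fetch from the main memory occurs.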
The capacity of a cache memory which can be integrated is limited. Accordingly, storage of data in compressed form has been proposed. A cache memory is manipulated in units of cache line size. A cache memory which stores data in compressed form decompresses a compressed cache line including a demanded piece of data and returns the demanded piece of data in the cache line to a processor.
As described above, a data size for access by a processor is smaller than a cache line size of a cache memory. To modify the data of a whole cache line, writing is therefore performed in a plurality of operations. For example, if a data size for access by a processor is 8 bytes, and a cache line size of a cache memory is 256 bytes, writing of a piece of 8-byte data is performed 32 times (256 / 8 = 32).
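The number of writing operations in this example follows from dividing the cache line size by the data size for access. A small sketch, in which the function name writes_per_line is a hypothetical helper and the rounding up for sizes that do not divide evenly is an assumption not stated in the text:

```python
def writes_per_line(line_size: int, access_size: int) -> int:
    """Number of writes needed to cover one whole cache line.

    Uses ceiling division so that a final partial write is still counted
    (an assumption; the example sizes in the text divide evenly).
    """
    return -(-line_size // access_size)


count = writes_per_line(256, 8)   # example sizes from the text
```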
In the case of a cache memory which stores data in compressed form, to modify only a piece of data of part of an already-compressed cache line, the following decompression and compression processing of the whole cache line is necessary:
1) decompression of the whole target cache line;
2) writing of a piece of data in a target region in the target cache line; and
3) compression of the whole target cache line.
For this reason, to modify a piece of data of a cache line, decompression and compression processing of the whole cache line is performed for every data writing operation. In the example described earlier, decompression and compression processing of a cache line is performed for each of the 32 writing operations.
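The per-write cost described above can be sketched as follows. This is only an illustration under stated assumptions: the class CompressedLine and its counters are hypothetical, and zlib from the Python standard library stands in for whatever compression scheme an actual cache memory would use.

```python
import zlib

LINE_SIZE = 256   # cache line size in bytes (example value from the text)
ACCESS_SIZE = 8   # data size for access in bytes (example value from the text)


class CompressedLine:
    """Hypothetical model of one cache line that is kept in compressed form."""

    def __init__(self, data: bytes):
        assert len(data) == LINE_SIZE
        self.blob = zlib.compress(data)   # initial compressed contents
        self.decompressions = 0           # decompressions caused by writes
        self.compressions = 0             # recompressions caused by writes

    def write(self, offset: int, piece: bytes) -> None:
        # 1) decompression of the whole target cache line
        line = bytearray(zlib.decompress(self.blob))
        self.decompressions += 1
        # 2) writing of a piece of data in the target region
        line[offset:offset + len(piece)] = piece
        # 3) compression of the whole target cache line
        self.blob = zlib.compress(bytes(line))
        self.compressions += 1


line = CompressedLine(bytes(LINE_SIZE))   # line initially filled with zeros
new_data = bytes(range(256))              # data to be written over the whole line
for offset in range(0, LINE_SIZE, ACCESS_SIZE):
    line.write(offset, new_data[offset:offset + ACCESS_SIZE])
```

After the loop, both counters equal the number of writing operations, which reflects that every partial write to the line triggers one decompression and one compression of the whole line.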
Thus, in the case of such a related-art cache memory, latency involved in decompression and compression processing of a cache line and power consumption of the cache memory are problems.