The disclosed embodiments of the present invention relate to accessing buffered data (e.g., cached data), and more particularly, to a method for controlling access to a cache by using a programmable hashing address and a related cache controller.
In today's systems, the time it takes to bring data into a processor is very long when compared to the time to process the data. Therefore, a bottleneck forms at the input to the processor. The cache helps by decreasing the time it takes to move information to and from the processor. When the cache contains the information requested, the transaction is said to be a cache hit. When the cache does not contain the information requested, the transaction is said to be a cache miss. In general, the hit rate is a critical performance index of the cache. How to increase the hit rate has become an issue in the field.
In general, the cache may be a fully associative cache, a direct-mapped cache, or a set-associative cache. The set-associative cache is a hybrid between the fully associative cache and the direct-mapped cache, and may be considered a reasonable compromise between the two in terms of hardware complexity and latency. No matter which cache design is employed, there is a need to improve the hit rate.
For example, consider a cache with a size of 4 KB (kilobytes) that is used to preload a 32×32 image from a 1024×768 image with 32 bpp (bits per pixel). In a linear address surface (image), the address offset from a pixel (X, Y) in a current scan line to a pixel (X, Y+1) in a next scan line is equal to the byte count of the image pitch of the image. Since the image pitch of the 1024×768 image is 1024 pixels and each pixel occupies 4 bytes, the byte count of the image pitch is 4 KB. Consider a case where the 1024×768 image is divided into a plurality of bins, each being a 32×32 image, and is processed in a bin-by-bin manner. The byte count of the image pitch is equal to the cache size; that is, the pitch of a scan line is 4 KB, which is exactly the 4 KB capacity of the cache. If the mapping of image addresses to the banks and sets of the cache is not changed, all 32 scan lines of a bin will map to the same bank and the same set. For a direct-mapped scheme, only one scan line of the 32×32 image can be held in the cache at a time. The next scan line of the 32×32 image will map to the same cache line and replace the scan line currently in the cache. For a 4-way set-associative scheme, only 4 scan lines of the 32×32 image can be kept in the cache. The other 28 scan lines of the 32×32 image will map to the same set and evict the 4 scan lines currently held there. As a result, the miss rate will be high because only a small number of scan lines of the 32×32 image can be kept in the cache. Further, the data preloading of the 32×32 image will be ineffective because the scan lines of the 32×32 image cannot all be kept in the cache at the same time.
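The conflict described above can be reproduced with a short calculation. The following is a minimal sketch only, not the addressing used by any particular cache controller: it assumes a conventional modulo set index, a hypothetical cache line size of 128 bytes (so that one scan line of a bin fills exactly one cache line), and an image base address of 0. The names `set_index`, `bin_row_sets`, and `resident_rows` are illustrative.

```python
from collections import Counter

CACHE_SIZE = 4 * 1024       # 4 KB cache (from the example above)
BYTES_PER_PIXEL = 4         # 32 bpp
IMAGE_WIDTH = 1024          # pixels per scan line of the full image
BIN_SIZE = 32               # each bin is 32x32 pixels
LINE_SIZE = 128             # ASSUMPTION: cache line size in bytes

# Byte count of the image pitch: 1024 pixels * 4 bytes = 4096 bytes.
PITCH_BYTES = IMAGE_WIDTH * BYTES_PER_PIXEL


def set_index(addr, ways):
    """Conventional (unhashed) set index: line number modulo number of sets."""
    num_sets = CACHE_SIZE // (LINE_SIZE * ways)
    return (addr // LINE_SIZE) % num_sets


def bin_row_sets(bin_x, bin_y, ways):
    """Set index hit by the first byte of each scan line of one bin."""
    sets = []
    for row in range(BIN_SIZE):
        x = bin_x * BIN_SIZE
        y = bin_y * BIN_SIZE + row
        # Linear surface, ASSUMED base address 0.
        addr = y * PITCH_BYTES + x * BYTES_PER_PIXEL
        sets.append(set_index(addr, ways))
    return sets


def resident_rows(bin_x, bin_y, ways):
    """Upper bound on scan lines of the bin that can be cached together."""
    counts = Counter(bin_row_sets(bin_x, bin_y, ways))
    # Each set can hold at most `ways` lines.
    return sum(min(n, ways) for n in counts.values())


if __name__ == "__main__":
    for ways in (1, 4):
        print(ways, "way(s):",
              "distinct sets =", len(set(bin_row_sets(0, 0, ways))),
              "resident scan lines =", resident_rows(0, 0, ways))
```

Under these assumptions, every scan line of a bin lands in a single set because the pitch in bytes is a multiple of the cache size, so the number of scan lines that can be resident together is bounded by the number of ways.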
Thus, there is a need for an innovative cache addressing design that can preload most or all image data of a bin into a cache, thereby improving the hit rate and reducing the data processing latency.