1. Field of the Invention
The present invention relates to a cache, and more particularly to a cache line allocation system and method for a three-dimensional graphics shader.
2. Description of Related Art
In ordinary electronic systems, the access speed of the system memory is much slower than the clock speed of the CPU. The CPU therefore spends considerable time waiting whenever it accesses the system memory, which makes the whole system inefficient. To improve system efficiency, a cache architecture has been proposed. In this architecture, a small-capacity cache, such as a static random access memory (SRAM), stores the information most recently accessed by the CPU. When the requested information is already stored in the cache, the CPU can read it from the cache more quickly than from the system memory.
Two types of information are stored in the cache: data and instructions. In most applications, there is a large amount of data and a small amount of instructions. The cache benefits instructions more than data, because the instructions are far fewer than the data and are read more frequently. Further, in a graphics processing mechanism such as a three-dimensional graphics shader, different pixel data are usually processed with the same instruction group, so the cache architecture is particularly significant for this kind of graphics processing.
However, the capacity of the cache in an ordinary system is not large enough to store the whole instruction group in one write operation, so the whole instruction group cannot be read in one read operation; the cache must instead be read and written repeatedly. In that case the cache offers no advantage. For example, assume that the instruction group is one instruction longer than the cache can hold. When the first pixel is processed, the instruction group must be read from the system memory and written into the cache. As there is no room for the last instruction, the graphics processing unit must read the system memory again for that instruction and overwrite another instruction in the cache to make room for it. When the second pixel is processed, the cache has already been rewritten, so the hit determination finds that not all of the instructions are in the cache, and the whole instruction group must be read from the system memory again. In other words, the graphics processing unit cannot find all of the instructions in the cache each time it processes another pixel, so it must access the system memory more than once to read the whole instruction group for every pixel. The above-mentioned advantage thus becomes a disadvantage.
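The example above can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the mechanism of any particular shader. The text does not name a replacement policy, so least-recently-used (LRU) replacement is assumed here, as is one instruction per cache line. The function name `count_misses` and its parameters are hypothetical.

```python
from collections import OrderedDict


def count_misses(cache_lines, group_size, pixels):
    """Count system-memory reads when every pixel runs the same instruction group.

    Assumptions (not from the source): LRU replacement, one instruction
    per cache line, instructions fetched in order 0..group_size-1.
    """
    cache = OrderedDict()  # keys are instruction indices, oldest first
    misses = 0
    for _ in range(pixels):
        for instr in range(group_size):
            if instr in cache:
                # Hit: mark this instruction as most recently used.
                cache.move_to_end(instr)
            else:
                # Miss: the instruction must be read from system memory.
                misses += 1
                if len(cache) >= cache_lines:
                    # Cache is full: overwrite the least recently used line.
                    cache.popitem(last=False)
                cache[instr] = True
    return misses
```

Under these assumptions, `count_misses(4, 5, 3)` returns 15: with a four-line cache and a five-instruction group, every fetch for every pixel misses. By contrast, `count_misses(5, 5, 3)` returns 5, because once the whole group fits, only the first pixel reads the system memory.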
Therefore, there is a need for a novel cache line allocation system and method that improves the utilization efficiency of the cache and the system memory.