1. Field of the Invention
The present invention is related to the field of microprocessors and, more particularly, to data caches employed within microprocessors.
2. Description of the Related Art
Superscalar microprocessors attempt to achieve high performance by issuing/executing multiple instructions concurrently. To the extent that superscalar microprocessors are successful at issuing/executing multiple instructions concurrently, high performance may be realized. Several factors may influence the successful concurrent issue/execution of instructions. For example, a first instruction which is dependent upon a second instruction (e.g. for a source operand) generally does not issue/execute concurrently with the second instruction. Still further, the frequency of branch instructions (which determine which instructions will be fetched next from a variety of sources) may impact the number of instructions available for issue and hence the number of instructions issued concurrently.
In the continuing evolution of superscalar microprocessors, the maximum issue rate (i.e. the number of instructions which can be concurrently issued) has been increasing. In other words, a trend toward wider issue superscalar microprocessors has been occurring. While additional performance gains may be realized by allowing for larger numbers of instructions to concurrently issue, wider issue microprocessors may face additional design challenges as well.
Among the additional design challenges is providing sufficient data cache ports for the number of memory operations which may be concurrently issued. As used herein, the term "port", in connection with a cache, refers to a facility for accessing the cache in response to one memory operation. Other memory operations use other ports for accessing the cache concurrently. Superscalar microprocessors generally include data caches to decrease the latency of access to memory operands. Instruction sequences include a certain number of memory operations to access and/or update memory operands. Generally speaking, a memory operation specifies the transfer of data between the microprocessor and a memory external to the microprocessor (although the transfer may be completed via an internal cache). Load memory operations specify the transfer of data from a memory to the microprocessor, while store memory operations specify the transfer of data from the microprocessor to the memory. Memory operations may be explicit instructions, or an implicit part of another instruction specifying a memory operand, depending upon the instruction set architecture employed by the microprocessor.
As issue rates increase, the number of memory operations for which concurrent access to a cache is desired increases as well. If concurrent access is not provided (by providing sufficient data cache ports), then performance generally degrades. For example, many instructions are dependent upon load memory operations (either directly or indirectly) for source operands. Such dependent instructions typically cannot execute if the load memory operations are stalled due to a lack of available cache ports. Additionally, pipeline stalls may develop if subsequent memory operations attempt to issue before earlier memory operations have executed and the available resources for queuing memory operations become full.
Various methods for multiporting data caches have been employed in the past. For example, the cache arrays may be physically multiported (allowing for concurrent access to any storage location within the array from each port in parallel with access to any other storage location from the other ports). Unfortunately, physically multiporting the array typically leads to large increases in the microprocessor chip area occupied by the array. The size of the chip is important to chip yields and to the number of chips per semiconductor wafer, and hence to the cost of producing the microprocessor. Accordingly, an increase in the area occupied by a cache array is generally undesirable.
Another method employed to provide multiported cache access is to bank the cache. Each port may access one of the banks in parallel with a different port accessing a different bank. If two or more memory operations which would otherwise concurrently access the data cache actually access data within the same bank, one of the memory operations completes and the others are inhibited. Unfortunately, even with a large number of available ports, concurrent access to the data cache may not be achieved due to the occurrence of bank conflicts. Accordingly, a solution to multiporting a data cache which does not incur the disadvantages of physically multiporting the array or banking the cache is desired.
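By way of illustration only, the bank conflict behavior described above may be sketched as follows. The sketch assumes, hypothetically, four banks interleaved on 8-byte boundaries with bank selection by low-order address bits, and a simple first-come arbitration policy; none of these parameters or function names are taken from any particular cache design.

```python
def bank_index(address, num_banks=4, bank_width_bytes=8):
    """Select a bank from low-order address bits (hypothetical interleaving)."""
    return (address // bank_width_bytes) % num_banks


def arbitrate(addresses, num_banks=4, bank_width_bytes=8):
    """Split memory operations issued in one cycle into granted and inhibited.

    The first operation to claim a bank completes; any later operation that
    maps to the same bank is inhibited (assumed first-come policy).
    """
    busy_banks = set()
    granted, inhibited = [], []
    for address in addresses:
        bank = bank_index(address, num_banks, bank_width_bytes)
        if bank in busy_banks:
            inhibited.append(address)  # bank conflict: must retry later
        else:
            busy_banks.add(bank)
            granted.append(address)
    return granted, inhibited


# Three concurrent operations: 0x00 and 0x20 both map to bank 0.
granted, inhibited = arbitrate([0x00, 0x08, 0x20])
```

In this sketch, the operations at addresses 0x00 and 0x08 access different banks and complete concurrently, while the operation at 0x20 maps to the same bank as 0x00 and is inhibited, even though a port is otherwise available for it.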