1. Field of the Invention
The field of the invention relates to data storage apparatus and, in particular, to data storage apparatus that receive multiple access requests per clock cycle.
2. Description of the Prior Art
Data processors process ever larger amounts of data that require significant storage capacity. Large data stores such as memory take time to access. Thus, techniques have evolved to store a subset of the data that may be required by a processor in smaller data stores such as low level caches that are easy to access, with access to one or more larger stores such as higher level caches or memory being made if data not within the subset is required.
The smaller data stores are made easy to access in order to improve processor speed. However, they are costly to implement in power and area, and it is therefore important that they store items that the processor is likely to require. If they do not store the required data, they simply add area and drain power without adding benefit. In effect, the hit rate in these data stores is very important to processor power consumption and performance.
One example of such a data store is the translation lookaside buffer, or TLB. Most modern microprocessors with virtual memory have a virtually addressed, physically mapped data cache. Thus, every memory read and write needs its virtual address translated to a physical address before the addressed storage location can be accessed. This translation is typically done by a small translation cache called a TLB. If the TLB does not contain the requested translation, the translation information must be retrieved from a backing level 2 (L2) TLB or from memory management logic that accesses the page tables in memory. Hit rates in these TLBs are very important to processor performance. Even with a backing L2 TLB, the penalty for TLB misses has a significant effect on overall performance.
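The lookup path described above can be illustrated with a minimal software sketch. It is not a description of any particular hardware: the class name `TLB`, the 4 KB page size, the LRU replacement policy, the capacities and the contents of `PAGE_TABLE` are all assumptions chosen for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class TLB:
    """Hypothetical fully associative translation cache with LRU replacement."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing  # callable: virtual page number -> physical page number
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        """Return the physical page number for a virtual page number."""
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        # Miss: fetch the translation from the next level and cache it.
        self.misses += 1
        ppn = self.backing(vpn)
        self.entries[vpn] = ppn
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return ppn

    def translate(self, vaddr):
        """Translate a full virtual address to a physical address."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        return self.lookup(vpn) * PAGE_SIZE + offset


# Assumed page table contents: virtual page number -> physical page number.
PAGE_TABLE = {0: 7, 1: 3, 2: 9}


def walk_page_table(vpn):
    """Stand-in for memory management logic that reads the page tables."""
    return PAGE_TABLE[vpn]


l2_tlb = TLB(capacity=64, backing=walk_page_table)
l1_tlb = TLB(capacity=2, backing=l2_tlb.lookup)

first = l1_tlb.translate(0x1234)   # misses in both levels, then walks the table
second = l1_tlb.translate(0x1238)  # same page, so it hits in the small TLB
```

In this sketch the first access to a page pays for a miss at every level, while a later access to the same page is served by the small TLB alone, which is why the hit rate of the small TLB dominates the cost of translation.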
Many modern microprocessors can issue two memory accesses per cycle, one load and one store, and therefore need either two TLBs or a dual ported TLB to perform the translations. A dual ported TLB has approximately the same area as the two TLB solution and may actually be implemented as two TLBs. The dual ported TLB has the disadvantage that, in effect, each entry is stored twice. Two independent TLBs, each of which can store different entries, can together hold more entries than the dual ported TLB and can therefore provide better performance in situations where the two TLBs are accessing different regions of memory.
However, there are several instances where loads and stores will be to the same data items, and if these have not been accessed recently there will be a miss in both the load TLB and the store TLB in the two TLB implementation. Thus, there will be a time penalty for both of these accesses. In the dual ported TLB, clearly, the entry stored on behalf of the load would also be available to the store.
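The trade-off between the two arrangements can be made concrete with a small simulation. This is a sketch under stated assumptions rather than a model of real hardware: the names `LruTlb` and `count_misses`, the LRU policy, the capacity of four entries and the access patterns are hypothetical, and the load and the store of a cycle are handled one after the other for simplicity. A dual ported TLB is modelled as a single set of entries shared by both ports, and the two TLB solution as two independent sets of the same capacity, reflecting the roughly equal area noted above.

```python
from collections import OrderedDict


class LruTlb:
    """Hypothetical TLB model that only tracks which pages are cached."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.misses = 0

    def access(self, page):
        if page in self.entries:
            self.entries.move_to_end(page)  # mark as most recently used
            return
        self.misses += 1
        self.entries[page] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry


def count_misses(loads, stores, capacity, shared):
    """Count total misses for one load and one store per cycle.

    shared=True models a dual ported TLB (one set of entries, two ports);
    shared=False models independent load and store TLBs.
    """
    load_tlb = LruTlb(capacity)
    store_tlb = load_tlb if shared else LruTlb(capacity)
    for load_page, store_page in zip(loads, stores):
        load_tlb.access(load_page)
        store_tlb.access(store_page)
    if shared:
        return load_tlb.misses
    return load_tlb.misses + store_tlb.misses


# Case 1: loads and stores touch the same pages.
same = [0, 1, 2, 3]
same_dual = count_misses(same, same, capacity=4, shared=True)
same_split = count_misses(same, same, capacity=4, shared=False)

# Case 2: loads and stores cycle through different regions of memory.
loads = [0, 1, 2, 3] * 3
stores = [100, 101, 102, 103] * 3
diff_dual = count_misses(loads, stores, capacity=4, shared=True)
diff_split = count_misses(loads, stores, capacity=4, shared=False)

print(same_dual, same_split)  # 4 8
print(diff_dual, diff_split)  # 24 8
```

Under these assumptions the shared arrangement halves the misses when loads and stores touch the same pages, while the independent arrangement does far better when the two streams use different regions whose combined working set exceeds the shared capacity.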
Thus, these two implementations both have disadvantages.
It would be desirable to be able to provide a system with at least some of the advantages of both the dual ported and the independent storage mechanisms.