1. Technical Field
The present invention relates to the field of address translation, and in particular, to systems for translating between linear or logical addresses and physical addresses.
2. Background Art
The frequencies at which central processing units (CPUs) operate have increased significantly in the last few years. It is expected that processors operating in the gigahertz range of frequencies will be available in the near future. However, the computing power of a processor is not determined solely by the frequency at which its core components operate. CPUs operate on data according to instructions, and both the data and instructions (hereafter "data") must be provided to the CPU from various memory structures. The full computing power of a CPU is available only when data can be supplied to the processor at speeds sufficient to keep the CPU busy at its operating frequency.
A number of strategies have been developed to reduce or eliminate bottlenecks in the data paths between the CPU and its various memory devices. These strategies include the use of one or more caches to maintain data and instructions in close proximity to the processor core. High performance CPUs typically have at least one cache, e.g. an L0 cache, located on the same chip as the processor core. L0 caches tend to be relatively small to provide rapid access to the data they store and to limit the amount of the CPU chip devoted to data storage.
In addition to L0 caches, many processors include higher level caches (L1, L2 . . . ), one or more of which may be located on the CPU chip as well. Here, L1 identifies the next cache in the memory hierarchy after the L0 cache. The L1 cache is searched for requested data not found in the L0 cache. Similarly, the L2 cache follows the L1 cache in the memory hierarchy. It is searched for requested data not found in either L0 or L1. Higher level caches may be present in some computer systems.
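The search order described above can be illustrated with a minimal sketch. It assumes each cache level is modeled as a simple mapping from address to data; the function name lookup and the sample contents are hypothetical and serve only to show that each level is consulted in turn before main memory is accessed.

```python
def lookup(address, caches, main_memory):
    """Search cache levels in order (L0, L1, L2, ...), then main memory.

    Returns the data and the name of the level that supplied it.
    """
    for level, cache in enumerate(caches):
        if address in cache:
            return cache[address], f"L{level}"
    # Not found in any cache level: fall back to main memory.
    return main_memory[address], "memory"


# Hypothetical contents: each higher level holds the lower level's data
# plus additional data.
caches = [
    {0x10: "a"},                        # L0
    {0x10: "a", 0x20: "b"},             # L1
    {0x10: "a", 0x20: "b", 0x30: "c"},  # L2
]
main_memory = {0x10: "a", 0x20: "b", 0x30: "c", 0x40: "d"}
```

In this sketch, a request for address 0x20 misses in L0 and is satisfied by L1, while a request for address 0x40 misses in every cache level and is satisfied by main memory.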
Caches tend to be larger the further they are from the CPU core, i.e. the higher their level in the memory hierarchy. The larger size accommodates the data held in any lower level cache(s) as well as additional data not available in lower level caches. The larger size of higher level caches increases the time required to access the data they store, because larger amounts of data must be searched and more gates contribute to the capacitive loading of the cache circuitry.
Because the L0 cache is integrally coupled to the CPU core, data in the L0 cache is often addressed using the linear (logical) addressing scheme employed by the CPU core. However, data in higher level caches is typically addressed using a physical addressing scheme that reflects the structure of main memory. Consequently, data addresses must be translated from their linear to their physical address forms when the data is sought from higher level caches. Since address translation adds another step to the data retrieval process, high performance processors often include translation lookaside buffers (TLBs) for their higher level caches. A TLB stores recently translated physical addresses that are indexed according to portions (tags) of their corresponding linear addresses. If a linear address tag is present in the TLB, the data can be retrieved from the cache using the already-translated physical address tag associated with the linear address tag.
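The role of a TLB in address translation can be illustrated with a minimal sketch. It assumes 4 KB pages, so that the upper bits of a linear address serve as the tag and the lower 12 bits as the offset within the page. The class SimpleTLB, its page_table mapping, and the hit and miss counters are hypothetical and are not taken from any particular processor.

```python
PAGE_SHIFT = 12                      # assumption: 4 KB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1    # selects the offset within a page


class SimpleTLB:
    def __init__(self, page_table):
        # Hypothetical page table: linear page number -> physical page number.
        self.page_table = page_table
        self.entries = {}            # recently translated tags
        self.hits = 0
        self.misses = 0

    def translate(self, linear_address):
        tag = linear_address >> PAGE_SHIFT
        offset = linear_address & PAGE_MASK
        if tag in self.entries:
            # Hit: reuse the already-translated physical tag.
            self.hits += 1
            physical_tag = self.entries[tag]
        else:
            # Miss: perform the slower full translation, then record it.
            self.misses += 1
            physical_tag = self.page_table[tag]
            self.entries[tag] = physical_tag
        return (physical_tag << PAGE_SHIFT) | offset


tlb = SimpleTLB({0x1: 0x80})
first = tlb.translate(0x1234)    # miss: full translation is performed
second = tlb.translate(0x1ABC)   # hit: same page, translation is reused
```

In this sketch, the second access falls in the same page as the first, so its physical address is formed from the stored physical tag without repeating the full translation.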
In addition to bits representing a linear address and a corresponding physical address, each TLB entry typically includes bits that indicate status information for the physical address. This information includes, for example, an indication as to whether the data is still valid, the memory type of the data, paging bits, and the like.
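The contents of such an entry can be illustrated with a minimal sketch. The field names, the memory type label, and the choice of a single dirty bit as an example paging bit are assumptions made for illustration; actual entry layouts vary between processors.

```python
from dataclasses import dataclass


@dataclass
class TLBEntry:
    linear_tag: int                   # portion of the linear address
    physical_tag: int                 # corresponding translated address bits
    valid: bool = True                # whether the translation is still valid
    memory_type: str = "write-back"   # hypothetical memory type label
    dirty: bool = False               # hypothetical example of a paging bit


def matches(entry, linear_tag):
    """An entry is used only if it is valid and its linear tag matches."""
    return entry.valid and entry.linear_tag == linear_tag


entry = TLBEntry(linear_tag=0x1, physical_tag=0x80)
```

In this sketch, clearing the valid field causes the entry to be ignored even when its linear tag matches the requested address.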
TLBs tend to grow with the size of the cache they serve, since the TLB must hold correspondingly more entries. Larger TLBs have slower data access times for reasons similar to those for larger caches. In faster processors, valuable clock cycles may be consumed finding an entry in the TLB and retrieving the associated data. There is thus a need for a memory system that accommodates rapid address translation, especially for lower level caches.