1. Technical Field
The present invention relates in general to cache controllers in data processing systems and in particular to cache controllers which layer cache and architectural specific functions. Still more particularly, the present invention relates to layering cache and architectural specific functions within a cache controller to permit controller functions to be implemented as distinct, substantially autonomous functional units which may be efficiently replicated or removed to produce multiple designs with different cost/performance characteristics.
2. Description of the Related Art
Data processing systems which utilize a level two (L2) cache typically include a cache controller for managing transactions affecting the cache. Such cache controllers are conventionally implemented on a functional level, as depicted in FIG. 4. For example, a cache controller 402 may include logic 404 for maintaining the cache directory, logic 406 for implementing a least recently used (LRU) replacement policy, logic for managing reload buffers 408, and logic for managing store-back buffers 410. In traditional implementations, the cache is generally very visible to these and other architectural functions typically required for cache controllers, with the result that cache controller designs are specific to a particular processor or processor family, such as the PowerPC™, Alpha™, or x86 family of processors.
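By way of illustration only, the least recently used (LRU) replacement policy mentioned above can be sketched in software as follows. This is a minimal model of a single set-associative congruence class, not the logic 406 of FIG. 4; the class name `LruSet`, the `ways` parameter, and the `access` method are hypothetical names chosen for this sketch.

```python
from collections import OrderedDict


class LruSet:
    """Hypothetical model of one cache set with LRU replacement."""

    def __init__(self, ways):
        self.ways = ways            # number of lines the set can hold
        self.lines = OrderedDict()  # tags ordered from least to most recently used

    def access(self, tag):
        """Touch a tag; return (hit, evicted_tag)."""
        if tag in self.lines:
            # Hit: mark the line as most recently used.
            self.lines.move_to_end(tag)
            return True, None
        evicted = None
        if len(self.lines) >= self.ways:
            # Miss with a full set: evict the least recently used line.
            evicted, _ = self.lines.popitem(last=False)
        self.lines[tag] = None
        return False, evicted
```

In a hardware controller this bookkeeping is typically interleaved with the directory and buffer logic, which is one source of the tight coupling between functions in a conventional design.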
The basic cache controller design depicted in FIG. 4 is difficult to rework in order to produce new designs. The controller's functional nature gives rise to a complex set of interrelated logic which is difficult to reconfigure. Often it is simpler to start from scratch than to attempt to modify an existing design to alter performance. Resources are tightly coupled and may not be added to or removed from the design in a straightforward manner to alter the controller performance. In addition, the complex logic significantly limits the maximum frequency achievable for the design.
Until recently, little need has been perceived for cache controllers of similar designs but differing price/performance characteristics. In particular, two largely distinct classes of data processing systems have arisen in the field: servers and clients. Servers are typically systems intended to provide data and services within a larger overall network of computers and are used simultaneously by a number of users. In contrast, clients are typically desktop systems used by a single user.
For servers, performance is the driving concern, with price being a significantly less important consideration. For clients, the critical factor is more often price than performance. Even within this overall grouping of servers and clients, however, widely varying price/performance needs have arisen.
Cache controller performance is dominated by three distinct issues: clock speed, cache size, and the number of simultaneous operations supported. Clock speed determines the overall rate at which operations are serviced by the cache and the system as a whole. Cache size affects the hit/miss ratio for the cache, which determines the cache's overall effectiveness in providing data to the processor without resorting to retrieving data from the (typically slower) system memory or lower level cache. The number of outstanding operations supported by the cache controller affects cache performance by determining the number of operations which may be serviced, on average, per unit of time. Most modern data processing systems employ a well-known feature, the pipelined split transaction bus, in both the system and processor buses. This feature is specifically intended to allow multiple operations to be outstanding at any given time. By increasing the resources available in the cache to permit more outstanding operations to be maintained at a given time, a higher overall performance can be achieved.
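The interaction of these three factors can be illustrated with a rough first-order model. The sketch below is an assumption-laden approximation, not a characterization of any particular controller: it assumes that outstanding operations overlap perfectly with no bus or resource contention, and the function names, parameters, and numeric values are hypothetical.

```python
def average_cycles_per_access(hit_ratio, hit_cycles, miss_cycles):
    """Weighted average latency, in clock cycles, of one cache access."""
    return hit_ratio * hit_cycles + (1.0 - hit_ratio) * miss_cycles


def operations_per_second(clock_hz, outstanding_ops,
                          hit_ratio, hit_cycles, miss_cycles):
    """Idealized throughput: concurrency divided by average latency.

    Assumes every one of the outstanding operations proceeds in
    parallel without contention, which real buses only approximate.
    """
    cycles = average_cycles_per_access(hit_ratio, hit_cycles, miss_cycles)
    latency_seconds = cycles / clock_hz
    return outstanding_ops / latency_seconds


# Illustrative values only: 200 MHz clock, 75% hit ratio,
# 8-cycle hits, 40-cycle misses.
single = operations_per_second(200e6, 1, 0.75, 8, 40)
quad = operations_per_second(200e6, 4, 0.75, 8, 40)
```

Under these assumptions, throughput scales linearly with clock frequency and with the number of outstanding operations supported, while a larger cache helps indirectly by raising the hit ratio and so lowering the average latency.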
It would therefore be desirable to implement a cache controller permitting efficient removal or addition of resources to implement multiple cache controller designs having varying price and performance characteristics. It would further be advantageous to provide a resource structure supporting higher clock frequencies, such that even the least expensive design operates faster.