The present invention generally relates to the field of cache design and implementation. More particularly, the present invention relates to a method and apparatus for automatically selecting and implementing optimal parameters for a cache associated with a computer processor by analyzing the type of instruction code executed by the processor.
Problem Addressed
It is well known that computers (e.g., mainframes, PC""s, servers, etc.) generally include a Central Processing Unit (xe2x80x9cCPUxe2x80x9d) which functions to fetch and execute programs and manipulate data in an associated memory. Thus, a CPU""s associated memory typically contains executable instructions and data processed during program execution.
In general, the speed of the CPU (i.e., the time it takes to perform one instruction) is faster than the access speed of the CPU""s main memory. Therefore, the CPU may idly wait for a requested instruction or data while the CPU""s main memory cycles through a memory fetch operation.
In an effort to address this problem, hardware developers have, over time, employed memory configurations organized in hierarchical fashion. Specifically, memory hierarchies are organized as levels, with faster memories nearer the CPU and larger, slower memories away from the CPU. Using such a hierarchy, a developer can greatly improve access time by ensuring data most often accessed by the CPU is contained in the memories providing the fastest access. These memories are often referred to as xe2x80x9ccachexe2x80x9d memories.
While such xe2x80x9ccachexe2x80x9d memories necessarily improve processing efficiency and performance, developers desiring optimal performance must also properly determine various cache design parameters (e.g., cache size, number of cache sets, etc.)
For example, because cache memories contain relatively less data than main memory, the CPU may experience a xe2x80x9ccache missxe2x80x9d during processing (i.e., a request for particular data or instructions not currently stored in the cache). In that event, the requested data or instructions must be retrieved from main memory, stored in cache memory and supplied to the CPU. As a result, processing efficiency is diminished.
One way of minimizing the number of cache misses is to properly select a cache design parameter referred to as the xe2x80x9ccache replacement policy.xe2x80x9d The cache replacement policy is the strategy for selecting data to be removed form the cache when new data must be stored in the cache (e.g., during a xe2x80x9ccache missxe2x80x9d). Preferably, the replacement policy (e.g., random replacement, least recently used replacement, etc.) is optimized such that data most frequently used by the CPU going forward is stored in cache memory.
Unfortunately, optimizing the cache replacement policy is not an easy task, particularly insofar as the developer does not know in advance what types of applications, and therefore what types of data or instructions, the CPU will process during code execution.
The present invention addresses this problem. In particular, in accordance with the present invention, before a processor executes any program code, an algorithm is first executed, on the same processor or another, which evaluates the code, and based on that evaluation provides outputs representing optimal caching parameters. In evaluating the code, the algorithm uses a set of code characteristics the presence of which, or the degree of presence of which, impact the proper selection of cache parameters. Once the code characteristics have been selected, the algorithm uses, for example, boolean truth state tables or fuzzy logic membership set definitions to specify whether and when each code characteristic is true or false (i.e., present or absent) for a given set of code instructions. The algorithm also uses a set of implications or rules (e.g., fuzzy or boolean logic rules) which intelligently relate the presence or absence of the code characteristics to outputs representing various cache parameter choices. Once the code characteristic (membership) truth functions and the rule base have been defined, the algorithm evaluates the truth or falsity of the code characteristic truth functions and, via the logic rules, propagates those values to conclusions regarding the proper selection of cache parameters. If necessary, the conclusions for each rule are weighted and, possibly, discretized to obtain crisp, meaningful decisions regarding cache parameters (e.g., decisions regarding the level of set associativity of the cache, the cache replacement policy etc.)
Once the cache parameter decisions are obtained, they are implemented and the cache associated with the processor is accordingly altered prior to, or contemporaneous with, execution of the source code.
Accordingly, one advantage of the present invention is that it improves the efficiency and speed of memory retrieval operations in computers with a processor, memory and associated cache.
Another advantage of the present invention is that it simplifies the organization and design of cache memories by automating intelligent decision making regarding cache parameters.
Further advantages of the present invention will become apparent from the more detailed description below.