1. Field of the Invention
This invention relates to microprocessors and, more particularly, to emulation of complex instructions by microcode, and still more particularly, to caching of memory used during such emulation.
2. Description of the Related Art
While it is desirable for microprocessors to maintain compatibility with a complex instruction set computer (CISC) architecture, other architectures offer improved execution speed and performance. Microprocessor designers have attempted to achieve both CISC compatibility and high performance by emulating CISC instructions. For example, superscalar, reduced instruction set computer (RISC) architectures may include microcode that performs CISC instruction emulation. During the emulation process, microcode makes use of a scratchpad memory, referred to herein as microcode emulation memory, for saving intermediate values. To maintain high performance, it is desirable for a microprocessor's microcode to be able to access the microcode emulation memory as quickly as possible.
In addition, microprocessors commonly include multiple memory caches, arranged hierarchically and shared by multiple cores or execution units. A variety of caching architectures are used and include various combinations of on-chip cache and off-chip cache. Memory operations that read data from cache or memory may be referred to more succinctly herein as “loads”. Memory operations that write data to cache or memory may be referred to more succinctly herein as “stores”. A load or a store may target a particular cache line (or portion of a cache line) and may include an address identifying the targeted line as well as the data to be loaded from or stored within the cache line. Since cache accesses are faster than memory accesses, various caching techniques are used to increase the likelihood that data is located in a cache when a core or execution unit needs to access it, thereby improving execution speed. Consequently, caching the microcode emulation memory offers the performance advantage of the relatively faster access time of cache memory compared to system memory. The shortest access times are generally those associated with the lowest level of the cache hierarchy, commonly referred to as L1-cache, or simply L1. Therefore, it is desirable to cache the microcode emulation memory in L1. Such performance advantages have often been reinforced by the permanent allocation of a portion of L1 for microcode emulation memory.
Of course, the performance advantages of using the L1-cache would benefit other processes as well. Consequently, it is desirable to make the L1-cache as large as possible to increase the availability of L1-cache space for any process. However, increasing the size of L1 increases the cost and complexity of the microprocessor. Also, if the microcode emulation memory is permanently allocated in L1, that portion of L1 is not available to other processes. To address the above concerns, what is needed is a way to improve the availability of space in an L1-cache of a given size to all processes while maintaining the advantages of caching the microcode emulation memory.
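By way of illustration only, the trade-off described above may be pictured with a minimal sketch. The cache geometry used here (a 32 KB L1, 64-byte cache lines, and 4 KB permanently allocated to microcode emulation memory) and the names L1Config and split_address are assumptions chosen for the example and do not appear in, or describe, any particular microprocessor. The sketch counts how many L1 cache lines remain available to other processes under a permanent allocation, and shows how the address carried by a load or a store may identify a targeted cache line and a portion of that line.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class L1Config:
    """Hypothetical L1 geometry; every number used with it is an assumption."""
    total_bytes: int     # total L1 capacity
    line_bytes: int      # size of one cache line
    reserved_bytes: int  # bytes permanently allocated to microcode emulation memory

    def total_lines(self) -> int:
        return self.total_bytes // self.line_bytes

    def reserved_lines(self) -> int:
        # Round up: a partially reserved line is still unavailable to others.
        return -(-self.reserved_bytes // self.line_bytes)

    def available_lines(self) -> int:
        # Lines left over for all other processes.
        return self.total_lines() - self.reserved_lines()

    def available_fraction(self) -> float:
        return self.available_lines() / self.total_lines()


def split_address(address: int, line_bytes: int) -> Tuple[int, int]:
    """Split a load/store address into (line number, byte offset within the line)."""
    return address // line_bytes, address % line_bytes


# Illustrative values only (assumed, not taken from any real design).
config = L1Config(total_bytes=32 * 1024, line_bytes=64, reserved_bytes=4 * 1024)

if __name__ == "__main__":
    print("total lines:    ", config.total_lines())
    print("reserved lines: ", config.reserved_lines())
    print("available lines:", config.available_lines())
    print("line, offset:   ", split_address(0x1234, config.line_bytes))
```

Under these assumed values, one eighth of the L1 lines are unavailable to other processes whether or not emulation is in progress, which is the cost that a permanent allocation imposes on an L1-cache of fixed size.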