1. Field of the Invention
The present invention generally relates to executing instructions in a processor.
2. Description of the Related Art
Modern computer systems typically contain several integrated circuits (ICs), including a processor which may be used to process information in the computer system. The information processed by the processor may include computer instructions which are executed by the processor as well as data which is manipulated by the processor using the computer instructions. The computer instructions and data are typically stored in a main memory in the computer system.
Processors typically process instructions by executing each instruction in a series of small steps. In some cases, to increase the number of instructions being processed by the processor (and therefore increase the speed of the processor), the processor may be pipelined. Pipelining refers to providing separate stages in a processor where each stage performs one or more of the small steps necessary to execute an instruction. In some cases, the pipeline (in addition to other circuitry) may be placed in a portion of the processor referred to as the processor core.
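By way of illustration only, the throughput benefit of pipelining may be sketched as a simple cycle count. The sketch below assumes an idealized pipeline in which every stage takes one cycle and no stalls occur; the function names and the example values are hypothetical and are not taken from any particular processor.

```python
def cycles_unpipelined(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction completes every step
    before the next instruction begins."""
    return num_instructions * num_stages


def cycles_pipelined(num_instructions: int, num_stages: int) -> int:
    """Cycles needed in an idealized pipeline (assumption: one cycle per
    stage, no stalls). The first instruction takes num_stages cycles to
    pass through; each later instruction completes one cycle after the
    one before it."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


if __name__ == "__main__":
    print(cycles_unpipelined(100, 5))  # 500
    print(cycles_pipelined(100, 5))    # 104
```

Under these assumptions, 100 instructions on a five-stage pipeline complete in 104 cycles rather than 500. A real pipeline may fall short of this figure because of stalls and dependencies between instructions.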
To provide for faster access to data and instructions as well as better utilization of the processor, the processor may have several caches. A cache is a memory which is typically smaller than the main memory and is typically manufactured on the same die (i.e., chip) as the processor. Modern processors typically have several levels of caches. The fastest cache, which is located closest to the core of the processor, is referred to as the Level 1 cache (L1 cache). In addition to the L1 cache, the processor typically has a second, larger cache, referred to as the Level 2 cache (L2 cache). In some cases, the processor may have other, additional cache levels (e.g., an L3 cache and an L4 cache).
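The lookup order through such a cache hierarchy may be sketched as follows. This is a minimal model under stated assumptions: the latency values are hypothetical, each cache has unlimited capacity with no eviction, and the names `CacheLevel` and `load` are introduced here for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class CacheLevel:
    name: str
    latency_cycles: int                        # hypothetical access cost
    lines: dict = field(default_factory=dict)  # address -> value


def load(address, levels, main_memory, memory_latency_cycles=200):
    """Search the levels from fastest to slowest, then main memory.

    Returns (value, cycles_spent, where_found). On a hit in a slower
    level, or in main memory, the value is copied into every faster
    level so that a later access to the same address hits sooner.
    """
    cycles = 0
    for index, level in enumerate(levels):
        cycles += level.latency_cycles
        if address in level.lines:
            value = level.lines[address]
            for faster in levels[:index]:
                faster.lines[address] = value
            return value, cycles, level.name
    # Missed in every cache level: pay the main memory latency.
    cycles += memory_latency_cycles
    value = main_memory[address]
    for level in levels:
        level.lines[address] = value
    return value, cycles, "memory"


if __name__ == "__main__":
    levels = [CacheLevel("L1", 2), CacheLevel("L2", 10)]
    memory = {0x1000: 42}
    print(load(0x1000, levels, memory))  # (42, 212, 'memory')
    print(load(0x1000, levels, memory))  # (42, 2, 'L1')
```

In this sketch the first access misses in both caches and pays the full memory latency, while the repeated access is served by the L1 cache.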
Modern processors provide address translation which allows a software program to use a set of effective addresses to access a larger set of real addresses. During an access to a cache, an effective address provided by a load or a store instruction may be translated into a real address and used to access the L1 cache. Thus, the processor may include circuitry configured to perform the address translation before the L1 cache is accessed by the load or the store instruction. However, because of the address translation, access time to the L1 cache may be increased. Furthermore, where the processor includes multiple cores which each perform address translation, the overhead from providing address translation circuitry and performing address translation while executing multiple programs may become undesirable.
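The translation step that precedes the L1 cache access may be sketched as follows. The sketch assumes a single-level page table held in a dictionary and a 4096-byte page size; both are assumptions made for illustration, as are the names `translate` and `load_from_l1`. Actual translation circuitry is considerably more involved.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(effective_address: int, page_table: dict) -> int:
    """Map an effective address to a real address.

    page_table maps an effective page number to a real page number
    (hypothetical single-level layout). The offset within the page
    is carried over unchanged.
    """
    page_number, offset = divmod(effective_address, PAGE_SIZE)
    if page_number not in page_table:
        raise KeyError(f"no translation for effective page {page_number:#x}")
    return page_table[page_number] * PAGE_SIZE + offset


def load_from_l1(effective_address: int, page_table: dict, l1_cache: dict):
    """Translate first, then index the L1 cache by real address.

    Returns None on a cache miss. The translation sits on the access
    path, which is the source of the added access time.
    """
    real_address = translate(effective_address, page_table)
    return l1_cache.get(real_address)


if __name__ == "__main__":
    page_table = {0x2: 0x7A}
    l1_cache = {0x7AABC: "cached value"}
    print(hex(translate(0x2ABC, page_table)))          # 0x7aabc
    print(load_from_l1(0x2ABC, page_table, l1_cache))  # cached value
```

Because every load or store in this sketch must complete `translate` before the cache can be indexed, the cost of the translation is added to each L1 access.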
Accordingly, what is needed is an improved method and apparatus for accessing a processor cache.