During normal program execution, microprocessors access an external memory to retrieve instructions, or "code," and to perform operations involving data, or "operands." Memory reads of code, or "fetches," are performed automatically by the microprocessor in response to program execution. Operand accesses are performed in response to the microprocessor's execution of individual instructions.
Fetches are commonly performed in fixed-size blocks of code (e.g., 16 bytes, 32 bytes, etc.). These blocks of code are read from external memory (or cache) and buffered within the microprocessor for subsequent execution. Three types of fetch operations are commonly performed by microprocessors. The first type is a "pre-fetch," which involves the fetching of instructions immediately "downstream" in memory from the instruction being executed. The second type is a "conditional fetch," which is performed when a conditional branch or jump instruction is executed by the microprocessor, causing program execution to begin at a new address. The third type is an "unconditional fetch," which is performed when an unconditional branch or jump instruction is executed.
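The three fetch types described above can be sketched in C as a simple next-fetch-address calculation. This is an illustrative sketch only; the 16-byte block size, the names, and the policy of aligning a branch target to the start of its code block are assumptions for illustration, not features of any particular microprocessor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte code blocks, as in the "e.g., 16 bytes" example. */
#define FETCH_BLOCK_SIZE 16u

typedef enum {
    FETCH_PRE,            /* pre-fetch: next block downstream in memory */
    FETCH_CONDITIONAL,    /* conditional branch or jump instruction */
    FETCH_UNCONDITIONAL   /* unconditional branch or jump instruction */
} fetch_type;

/* Compute the starting address of the next block of code to fetch. */
static uint32_t next_fetch_address(fetch_type t, uint32_t current_block,
                                   uint32_t branch_target, int branch_taken)
{
    switch (t) {
    case FETCH_PRE:
        /* Pre-fetch: the block immediately following the current one. */
        return current_block + FETCH_BLOCK_SIZE;
    case FETCH_CONDITIONAL:
        /* Conditional fetch: redirect only when the branch is taken,
           aligning the target down to the start of its code block. */
        return branch_taken ? (branch_target & ~(FETCH_BLOCK_SIZE - 1u))
                            : current_block + FETCH_BLOCK_SIZE;
    case FETCH_UNCONDITIONAL:
        /* Unconditional fetch: always redirect to the target's block. */
        return branch_target & ~(FETCH_BLOCK_SIZE - 1u);
    }
    return current_block;
}
```

In this sketch a pre-fetch from the block at 0x1000 yields 0x1010, while a taken branch to 0x2004 redirects fetching to the block starting at 0x2000.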
To perform either a fetch or an operand access, a central processing unit (CPU) of the microprocessor issues a command or "request" to a bus/cache unit of the microprocessor, specifying the type of access to perform. These access requests are typically in the form of a predetermined bit-field of a micro-instruction generated by the CPU. The CPU also generates an address which specifies a memory location for performing the access. For fetches, the address ("fetch address") typically specifies the starting location of the block of code to be fetched. For operand accesses, the address ("operand address") specifies one or more byte locations for performing either a read or a write operation. In either case, the address generated by the CPU is passed to the bus/cache unit along with the corresponding request, and the bus/cache unit performs the specified operation.
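A request of the kind described above — a predetermined bit-field identifying the access type, passed along with the corresponding address — can be pictured as a small C structure. The field widths, type codes, and names below are invented for illustration and do not correspond to any actual micro-instruction encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical access-type codes for the request bit-field. */
typedef enum {
    REQ_FETCH = 0u,   /* code fetch */
    REQ_READ  = 1u,   /* operand read */
    REQ_WRITE = 2u    /* operand write */
} req_type;

/* Hypothetical request passed from the CPU to the bus/cache unit:
   a small bit-field naming the access, plus the associated address. */
typedef struct {
    uint32_t type : 2;   /* access type code (REQ_*) */
    uint32_t size : 6;   /* access size in bytes */
    uint32_t addr;       /* fetch address or operand address */
} access_request;

/* The CPU fills in a request and hands it to the bus/cache unit,
   which then performs the specified operation. */
static access_request make_request(req_type t, uint32_t size, uint32_t addr)
{
    access_request r = { (uint32_t)t, size, addr };
    return r;
}
```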
For microprocessors that support a virtual memory scheme, addresses generated by the CPU are "virtual addresses" which alone do not identify unique memory locations. Virtual addresses are translated into "physical addresses" (i.e., addresses which correspond to physical memory locations) before being passed to the bus/cache unit. Virtual addresses generated by the CPU may additionally be modified by an addressing unit of the microprocessor before being passed to the bus/cache unit.
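The virtual-to-physical translation step can be sketched as a single-level page-table lookup: the virtual page number is swapped for a physical page frame while the within-page offset passes through unchanged. The 4 KB page size, the tiny table, and the function name are assumptions for illustration; real translation hardware (multi-level tables, TLBs) is considerably more involved.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u               /* assumed 4 KB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  16u               /* toy-sized table */

/* page_table[v] gives the physical page frame for virtual page v. */
static uint32_t translate(const uint32_t page_table[NUM_PAGES],
                          uint32_t virtual_addr)
{
    uint32_t vpage  = (virtual_addr >> PAGE_SHIFT) % NUM_PAGES;
    uint32_t offset = virtual_addr & (PAGE_SIZE - 1u);
    /* Replace the virtual page number with its physical frame;
       the offset within the page is unchanged. */
    return (page_table[vpage] << PAGE_SHIFT) | offset;
}
```

With a table mapping virtual page 1 to physical frame 7, the virtual address 0x1034 translates to the physical address 0x7034 before being passed to the bus/cache unit.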
Microprocessors are typically designed for use with a single external memory which is shared by both code and data. For this type of design, known as a "pure Von Neumann Architecture" design, the use of the external memory bus by the CPU must be partitioned between a fetch unit of the CPU, which generates fetch requests, and an execution unit of the CPU, which generates operand access requests. Since the two types of accesses cannot both be performed to external memory simultaneously, the operation of either the fetch unit or the execution unit normally must be suspended when both units have pending access requests. This interference between fetches and operand accesses has the undesirable effect of slowing down program execution. Once the CPU issues a fetch request to the bus/cache unit, the execution unit normally must wait for the fetch access to finish before it can complete execution of an instruction requiring an operand access.
One solution to this problem has been to use two separate external memories, one for code and one for data, so that fetch accesses and operand accesses can be performed simultaneously. This type of design, known as a "pure Harvard Architecture" design, significantly increases the number of pins and the complexity of the microprocessor, since two separate sets of address, data and control lines must be provided by the microprocessor. Thus, pure Harvard Architecture microprocessors provide a solution to the interference problem, but at a very high cost.
The use of a cache memory (cache) internal to the microprocessor helps to reduce interference between fetches and operand accesses. Cache is a form of high-speed memory used by microprocessors to buffer a copy of data and instructions likely to be used by the CPU. When the CPU generates a fetch request or an operand read request, the bus/cache unit initially checks the cache to determine whether the requested instructions or data reside in the cache. If the requested instructions or data are found in cache (referred to as a "cache hit"), the requested instructions or data are returned to the CPU without accessing external memory. If the requested instructions or data are not found in cache (referred to as a "cache miss"), the microprocessor must perform a read from external memory before returning the requested instructions or data. Various algorithms exist for updating the instructions and data held by the cache in order to achieve a high hit rate and to thus reduce the frequency at which the microprocessor must access external memory.
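The hit/miss check performed by the bus/cache unit can be sketched as a toy direct-mapped cache lookup. The line count, 16-byte line size, and fill policy below are assumptions for illustration; a real cache also stores the cached instructions or data themselves and applies one of the replacement algorithms mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_LINES  8u    /* toy-sized cache */
#define LINE_SHIFT 4u    /* assumed 16-byte lines */

/* Tag store only; a real cache also holds the line data. */
typedef struct {
    uint32_t tag[NUM_LINES];
    int      valid[NUM_LINES];
} toy_cache;

/* Returns 1 on a cache hit (no external memory access needed).
   On a miss, simulates the external-memory fill by installing the
   new tag, then returns 0. */
static int cache_access(toy_cache *c, uint32_t addr)
{
    uint32_t line = (addr >> LINE_SHIFT) % NUM_LINES;
    uint32_t tag  = addr >> LINE_SHIFT;
    if (c->valid[line] && c->tag[line] == tag)
        return 1;                 /* cache hit */
    c->valid[line] = 1;           /* cache miss: read external memory */
    c->tag[line]   = tag;
    return 0;
}
```

In this sketch, a first access to address 0x100 misses and fills the line, a second access within the same 16-byte line (e.g., 0x104) hits, and an access to a different address that maps to the same line (e.g., 0x180) evicts it and misses again.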
Since only one cache access can be performed at a time, interference between fetch requests and operand access requests exists even when the microprocessor runs strictly out of cache. However, since cache access times are normally considerably lower than access times for external memory, the operation of the CPU is suspended for shorter durations of time when interference between fetch and operand requests occurs. Thus, the use of a single, unified cache for both instructions and data reduces the effect of interference between fetch requests and operand access requests.
To further reduce interference between fetch requests and operand access requests, some microprocessors employ separate caches for code and data. This microprocessor design, which may be referred to as a Harvard Architecture style design, enables the CPU to fetch instructions from an instruction cache while simultaneously accessing operands in a data cache.
Although the use of two separate caches can provide a significant increase in performance over microprocessors having a single cache, the addition of the second cache increases the complexity of the microprocessor. By way of example, a microprocessor having a 2k byte data cache and a 2k byte instruction cache requires considerably more logic than an identical microprocessor having a single, unified cache of 4k bytes for both code and data. As a result, a number of microprocessor manufacturers have opted to use a single unified cache.
As indicated by the foregoing, the design choice between the use of a single, unified cache (a Von Neumann Architecture style design) and the use of separate data and instruction caches (a Harvard Architecture style design) involves two competing design goals. Harvard Architecture style designs provide for reduced interference between fetches and operand accesses, but at the cost of increasing the amount of microprocessor circuitry.
The present invention is directed to a circuit and method for reducing interference between fetches and operand accesses to cache and external memory for Von Neumann style microprocessors. The general objective of the invention is to obtain some of the performance advantages offered by Harvard Architecture style microprocessors without adding a second cache.