The present invention relates generally to computer processor technology. In particular, the present invention relates to memory access transactions within a computer system having a main memory and one or more cache memories.
A popular multiprocessor computer architecture couples one or more processors to a shared main memory storing data, with each processor typically having a local cache to store its own private copy of a subset of the data from the main memory. An important operation to optimize in any computer system, including a multiprocessor system, is the memory access. The memory access time may create a bottleneck that impedes performance of a multiprocessor system, especially where the processor executes its instructions at a faster rate than the memory can supply data.
A processor executing a memory access seeks to locate the most recent copy of the data, which may reside in the main memory or in one of the caches. A typical memory reference to retrieve data in a multiprocessor system is resolved by fetching the data from the main memory while concurrently probing the caches of the other processors for any other copy of the data, and then determining from this information the most recent valid copy of the data.
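The resolution step described above can be sketched in software. The following is a minimal illustrative model, not the hardware itself: integer version numbers stand in for whatever recency ordering the cache protocol actually provides, and the dictionary layout is an assumption made for the example.

```python
# Illustrative sketch: resolve a memory reference by combining the
# main-memory fetch with probes of the other processors' caches.
# Entries are hypothetical (version, data) pairs; the higher version
# number models the "more recent copy" of the text above.

def resolve_reference(address, main_memory, caches):
    """Return the most recent copy of the data at `address`."""
    best_version, best_data = main_memory[address]   # fetch from main memory
    # Concurrently (modeled here sequentially) probe every cache for
    # another copy of the same address.
    for cache in caches:
        if address in cache:
            version, data = cache[address]
            if version > best_version:               # a more recent copy wins
                best_version, best_data = version, data
    return best_data

main_memory = {0x100: (1, "stale")}
caches = [{}, {0x100: (2, "fresh")}]                 # cache 1 holds a newer copy
print(resolve_reference(0x100, main_memory, caches))  # -> fresh
```

The point of the model is only the ordering problem: the answer cannot be declared final until every probe has returned, which is what motivates the timing discussion that follows.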
In order to achieve high performance in a multiprocessor system, the memory reference process to retrieve data is designed to be completed within a specified memory resolution time interval. However, two factors make this specification difficult to achieve. First, in a memory reference, the references to the various memories may not all take the same amount of time, with some references taking longer than others. Sometimes, the return of data from a reference to main memory is faster than the return of data from the probe to a cache. Thus, it is not known definitively which copy of the data is the valid copy until the slowest copy is retrieved.
Second, a high performance processor typically employs a highly pipelined design to aggressively reduce the cycle time of the processor. One ramification of highly pipelined designs is that before a fill to a processor's cache memory can occur, the pipeline must be primed to prepare it for data reception. This requires a lead-off notice time to schedule the internal resources of the pipeline. This lead-off time further reduces the amount of time available for the multiprocessor system to completely resolve the memory reference from all possible sources of the data.
Thus, a solution is needed to reduce the memory resolution time interval which would subsequently reduce the time for memory access.
According to the present invention, a cache within a multiprocessor system is speculatively filled. The multiprocessor system includes a main memory coupled to a plurality of processors having a plurality of caches. Each processor may have one or more internal or external caches. To speculatively fill a designated cache, the present invention first determines an address which identifies information located in the main memory. The address may also identify one or more other versions of the information located in one or more of the caches. The process of filling the designated cache with the information is started by locating the information in the main memory while also locating the other versions of the information identified by the address in the caches. The validity of the information located in the main memory is determined after locating the other versions of the information. The process of filling the designated cache with the information located in the main memory is initiated before determining the validity of the information located in the main memory. Thus, the memory reference is speculative.
Typically, the information located in the main memory is invalid if one of the other versions of the information is a more recent, e.g., more current, version than the information located in the main memory, and is valid if the information located in the main memory is more recent than the other versions of the information. More generally, the validity of the information located in the main memory may be determined to be invalid or valid based upon a cache protocol. The process of filling the designated cache is canceled after determining that the information located in the main memory is invalid.
Beneficially, the process of filling the designated cache has a primer stage, wherein a pipeline associated with the filling process is prepared for delivery of the information located in the main memory, followed by a delivery stage, wherein the information located in the main memory is delivered to the designated cache. The process of filling the designated cache is canceled upon determination that the information located in the main memory is invalid before the information located in the main memory is delivered to the designated cache in the delivery stage. The process of filling the designated cache is completed without interruption after determining that the information in the main memory is valid.
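The two-stage fill described above can be sketched as follows. This is a behavioral model under stated assumptions, not the claimed circuitry: the `validate` callable stands in for the validation result that only becomes available after the fill has already started, and the returned stage trace is purely illustrative.

```python
# Illustrative sketch of a speculative cache fill: the primer stage runs
# before the validation result is known, and the fill is canceled before
# the delivery stage when main memory turns out to hold an invalid copy.

def speculative_fill(cache, address, memory_data, validate):
    """Fill `cache` at `address` with `memory_data`.

    `validate` is a callable whose result only becomes available after
    the fill process has already started.
    """
    trace = ["primer"]                # primer stage: pipeline is prepared
    if not validate():                # validity resolves after fill begins
        trace.append("aborted")       # canceled before the delivery stage
        return trace
    cache[address] = memory_data      # delivery stage: data reaches cache
    trace.append("delivery")
    return trace

cache = {}
print(speculative_fill(cache, 0x200, "new data", lambda: True))   # -> ['primer', 'delivery']
print(speculative_fill(cache, 0x300, "old data", lambda: False))  # -> ['primer', 'aborted']
```

In the valid case the fill completes without interruption; in the invalid case the data never reaches the cache, matching the cancellation point described above.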
In accordance with other aspects of the present invention, a computer system includes a main memory and a computing apparatus connectable to a cache. The cache may be internal or external to the computing apparatus. The computing apparatus includes address circuitry for presenting an address to the main memory. The computing apparatus also includes cache fill circuitry for receiving information from the main memory corresponding to the address and for speculatively filling a section of the cache with the information. The computing apparatus additionally includes validation circuitry for receiving a validation signal having a value indicating whether the main memory information is valid or invalid. The cache fill circuitry initiates a speculative fill of the cache with the main memory information before the validation circuitry receives the validation signal.
The computing apparatus advantageously includes cache fill abort circuitry for canceling the filling of the section of the cache with the main memory information if the value of the validation signal indicates the main memory information is invalid, or for allowing the filling of the section of cache to continue if the value of the validation signal indicates that the main memory information is valid. The value of the validation signal is determined after the cache fill circuitry begins filling the cache with information. Thus, the cache fill is speculative.
The cache fill circuitry may be configured as a pipeline. Beneficially, the cache fill circuitry includes a data setup stage for preparing the main memory information for delivery to the cache, and a data delivery stage for receiving the main memory information from the data setup stage and delivering the main memory information to the cache. If the value of the validation signal indicates that the main memory information is invalid, the process of filling the cache is canceled by the cache fill abort circuitry before the main memory information is received by the data delivery stage.
In a further embodiment, the computing apparatus may also include a memory controller for receiving the address from the address circuitry, locating the main memory information corresponding to the address and providing the main memory information to the cache fill circuitry. Typically, the memory controller sets the value of the validation signal to indicate the main memory information is invalid if the main memory information is not the most recent version of the information. Generally, the memory controller sets the value of the validation signal based upon a cache protocol.
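The memory controller's role can likewise be sketched. The model below is a hedged illustration, not the patented interface: it assumes version numbers for recency and a simple "newest version wins" rule as the stand-in cache protocol, and it returns the validation signal as a boolean.

```python
# Illustrative sketch of the memory controller: locate the addressed
# data in main memory, probe the caches, and set the validation signal
# to invalid whenever a cache holds a more recent version. The version
# numbers and dictionary layout are assumptions made for the example.

def controller_lookup(address, main_memory, caches):
    """Return (data, validation_signal) for a memory reference."""
    mem_version, mem_data = main_memory[address]
    # Highest version found among the cached copies; -1 if none exists.
    newest_cached = max(
        (cache[address][0] for cache in caches if address in cache),
        default=-1,
    )
    valid = mem_version >= newest_cached      # memory copy is most recent
    return mem_data, valid

main_memory = {0x40: (5, "payload")}
print(controller_lookup(0x40, main_memory, [{}]))                    # -> ('payload', True)
print(controller_lookup(0x40, main_memory, [{0x40: (7, "newer")}]))  # -> ('payload', False)
```

The processor consumes the data immediately for its speculative fill, while the boolean arrives later as the validation signal that either lets the fill complete or triggers the abort circuitry.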
In accordance with still other aspects of the present invention, a multiprocessor system includes a main memory configured to store information, a memory controller coupled to the main memory, and a plurality of processors, each processor coupled to the memory controller. A processor includes a cache and a system port configured to receive main memory information from the memory controller. Each processor also includes cache fill circuitry for filling a section of the cache with the main memory information, validation circuitry for receiving a validation signal having a value indicating whether the main memory information is valid or invalid, and cache fill abort circuitry for canceling the filling of the section of the cache if the value of the validation signal indicates the main memory information is invalid. The value of the validation signal is determined after the point in time when the cache fill circuitry begins the process of filling the cache with the main memory information; the cache fill is thereby considered to be speculative.
The memory controller retrieves the main memory information from the main memory and generates the validation signal by setting the value of the validation signal to invalid if a more recent version, e.g., a more recent copy, of the information than that in the main memory is located in one of the caches, or by setting the value of the validation signal to valid if the copy of the information in the main memory is the most recent version.
According to yet other aspects of the present invention, a computer system includes a memory reference unit for selecting an address identifying a selected section of information in a memory, an address bus for supplying an external system, typically, a memory controller, with the address for the selected section of information, and an information bus for receiving the selected section of information from the external system in response to the address. The computer system further includes a cache for storing sections of information from the memory, a validation pin for receiving a validation signal indicating whether the selected section of information is valid or invalid according to a cache protocol, and cache fill circuitry for filling one of the sections of the cache with the selected section of information. The cache fill process is initiated before determining if the selected section of information is valid or invalid.
Further, the computer system includes cache fill abort circuitry for canceling the filling of the selected section of the cache if the validation signal indicates the information is invalid. The validity of the selected section of information is determined at a point in time after the point in time when the cache fill circuitry begins filling the cache with the information. Thus, the cache fill is speculative.
Objects, advantages, and novel features of the present invention will become apparent to those skilled in the art from this disclosure, including the following detailed description, as well as by practice of the invention. While the invention is described below with reference to preferred embodiment(s), it should be understood that the invention is not limited thereto. Those of ordinary skill in the art having access to the teachings herein will recognize additional implementations, modifications, and embodiments, as well as other fields of use, which are within the scope of the invention as disclosed and claimed herein and with respect to which the invention could be of significant utility.