Computer systems use memory devices, such as dynamic random access memory (“DRAM”) devices, to store data that are accessed by a processor. These memory devices are normally used as system memory in a computer system. In a typical computer system, the processor communicates with the system memory through a processor bus and a memory controller. The processor issues a memory request, which includes a memory command, such as a read command, and an address designating the location from which data or instructions are to be read. The memory controller uses the command and address to generate appropriate command signals as well as row and column addresses, which are applied to the system memory. In response to the commands and addresses, data are transferred between the system memory and the processor. The memory controller is often part of a system controller, which also includes bus bridge circuitry for coupling the processor bus to an expansion bus, such as a PCI bus.
Although the operating speed of memory devices has continuously increased, this increase in operating speed has not kept pace with increases in the operating speed of processors. Even slower has been the increase in operating speed of memory controllers coupling processors to memory devices. The relatively slow speed of memory controllers and memory devices limits the data bandwidth between the processor and the memory devices.
In addition to the limited bandwidth between processors and memory devices, the performance of computer systems is also limited by latency problems that increase the time required to read data from system memory devices. More specifically, when a memory device read command is coupled to a system memory device, such as a synchronous DRAM (“SDRAM”) device, the read data are output from the SDRAM device only after a delay of several clock periods. Therefore, although SDRAM devices can synchronously output burst data at a high data rate, the delay in initially providing the data can significantly slow the operating speed of a computer system using such SDRAM devices.
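The effect of the initial delay on a burst read can be illustrated with simple arithmetic. The following is a minimal sketch only: the function name and the example figures (a 3-clock delay and an 8-word burst) are assumptions chosen for illustration, since the text above says only that the delay is "several clock periods," and the model assumes one data word is output per clock once the burst begins.

```python
def effective_words_per_clock(latency_clocks: int, burst_length: int) -> float:
    """Average words delivered per clock for one burst read.

    Assumes (for illustration only) that the device outputs one word per
    clock once the burst starts, after an initial delay of latency_clocks.
    """
    if latency_clocks < 0 or burst_length <= 0:
        raise ValueError("latency must be >= 0 and burst length must be > 0")
    total_clocks = latency_clocks + burst_length
    return burst_length / total_clocks


# Hypothetical figures: 3 clocks of initial delay, 8-word burst.
rate = effective_words_per_clock(latency_clocks=3, burst_length=8)
print(f"{rate:.3f} words per clock versus a peak of 1.000")
```

Under these assumed figures the burst occupies 11 clock periods to deliver 8 words, so the average rate falls well below the peak burst rate, and shorter bursts are penalized more heavily by the same delay.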
One approach to alleviating the memory latency problem is to use multiple memory devices coupled to the processor through a memory hub. In a memory hub architecture, a memory hub controller is coupled over a high speed data link to several memory modules. Typically, the memory modules are coupled in a point-to-point or daisy chain architecture such that the memory modules are connected one to another in series. Thus, the memory hub controller is coupled to a first memory module over a first high speed data link, with the first memory module connected to a second memory module through a second high speed data link, and the second memory module coupled to a third memory module through a third high speed data link, and so on in a daisy chain fashion.
Each memory module includes a memory hub that is coupled to the corresponding high speed data links and a number of memory devices on the module, with the memory hubs efficiently routing memory requests and memory responses between the controller and the memory devices over the high speed data links. Each memory request typically includes a memory command specifying the type of memory access (e.g., a read or a write) called for by the request, a memory address specifying a memory location that is to be accessed, and, in the case of a write memory request, write data. The memory request also normally includes information identifying the memory module that is being accessed, but this can also be accomplished by mapping different addresses to different memory modules. A memory response is typically provided only for a read memory request, and typically includes read data as well as an identifying header that allows the memory hub controller to identify the memory request corresponding to the memory response. However, it should be understood that memory requests and memory responses having other characteristics may be used. In any case, in the following description, memory requests issued by the memory hub controller propagate downstream from one memory hub to another, while memory responses propagate upstream from one memory hub to another until reaching the memory hub controller. Computer systems employing this architecture can have a higher bandwidth because a processor can access one memory device while another memory device is responding to a prior memory access. For example, the processor can output write data to one of the memory devices in the system while another memory device in the system is preparing to provide read data to the processor. Moreover, this architecture also provides for easy expansion of the system memory without concern for degradation in signal quality as more memory modules are added, such as occurs in conventional multi-drop bus architectures.
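The request and response formats and the downstream and upstream propagation described above can be sketched in software. This is an illustrative model only, not a description of any actual hub circuitry: the names MemoryRequest, MemoryResponse, MemoryHub, and build_chain are hypothetical, the identifying header is reduced to a single integer tag, and the upstream path is modeled simply as the return value passing back through each hub.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MemoryRequest:
    tag: int                          # lets the controller match a response to its request
    module_id: int                    # identifies the memory module being accessed
    command: str                      # "read" or "write"
    address: int                      # memory location to be accessed
    write_data: Optional[int] = None  # present only for write requests


@dataclass
class MemoryResponse:
    tag: int        # identifying header, copied from the request
    read_data: int


class MemoryHub:
    """One hub in the daisy chain (hypothetical software model)."""

    def __init__(self, module_id: int, downstream: Optional["MemoryHub"] = None):
        self.module_id = module_id
        self.downstream = downstream
        self.cells: Dict[int, int] = {}  # stands in for the module's memory devices

    def handle(self, request: MemoryRequest) -> Optional[MemoryResponse]:
        # Requests for other modules propagate downstream to the next hub.
        if request.module_id != self.module_id:
            if self.downstream is None:
                raise ValueError(f"no module with id {request.module_id}")
            # The returned response propagates upstream through this hub.
            return self.downstream.handle(request)
        if request.command == "write":
            self.cells[request.address] = request.write_data
            return None  # no response is provided for a write
        if request.command == "read":
            return MemoryResponse(request.tag, self.cells.get(request.address, 0))
        raise ValueError(f"unknown command {request.command!r}")


def build_chain(module_count: int) -> Optional[MemoryHub]:
    """Build hubs 0..module_count-1 in series; return the hub nearest the controller."""
    hub = None
    for module_id in reversed(range(module_count)):
        hub = MemoryHub(module_id, downstream=hub)
    return hub


first_hub = build_chain(3)
first_hub.handle(MemoryRequest(tag=1, module_id=2, command="write", address=0x10, write_data=99))
response = first_hub.handle(MemoryRequest(tag=2, module_id=2, command="read", address=0x10))
print(response)
```

In this sketch a request addressed to the third module passes through the first two hubs before being serviced, and the read response carries the tag of the request that produced it so that the controller can pair the two.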
Although computer systems using memory hubs may provide superior performance, they nevertheless may often fail to operate at optimum speeds for a variety of reasons. For example, even though memory hubs can provide computer systems with a greater memory bandwidth, they still suffer from latency problems of the type described above. More specifically, although the processor may communicate with one memory device while another memory device is preparing to transfer data, it is sometimes necessary to receive data from one memory device before the data from another memory device can be used. In the event data must be received from one memory device before data received from another memory device can be used, the latency problem continues to slow the operating speed of such computer systems.
Another factor that can reduce the speed of memory transfers in a memory hub system is the transferring of read data upstream (i.e., back to the memory hub controller) over the high-speed links from one hub to another. Each hub must determine whether to send local responses first or to forward responses from downstream memory hubs first, and the way in which this is done affects the actual latency of a specific response and, more importantly, the overall latency of the system memory. This determination may be referred to as arbitration, with each hub arbitrating between its local responses and the responses being forwarded upstream from downstream memory hubs.
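The choice each hub faces can be illustrated by comparing two simple fixed-priority policies. This is a sketch under stated assumptions and is not an arbitration scheme taken from this description: the function name upstream_order and the policy names "local_first" and "downstream_first" are hypothetical, all responses are assumed to be already waiting at the hub, and each response is assumed to occupy one transfer slot on the upstream link.

```python
from collections import deque
from typing import Iterable, List


def upstream_order(local: Iterable[str], downstream: Iterable[str], policy: str) -> List[str]:
    """Return the order in which one hub sends pending responses upstream.

    policy is "local_first" or "downstream_first" (hypothetical names); the
    lower-priority queue is served only when the other queue is empty.
    """
    local_q, down_q = deque(local), deque(downstream)
    order = []
    while local_q or down_q:
        if policy == "local_first":
            source = local_q if local_q else down_q
        elif policy == "downstream_first":
            source = down_q if down_q else local_q
        else:
            raise ValueError(f"unknown policy {policy!r}")
        order.append(source.popleft())
    return order


pending_local = ["L0", "L1"]   # responses from this hub's own memory devices
pending_downstream = ["D0"]    # response forwarded from a hub farther down the chain

for policy in ("local_first", "downstream_first"):
    print(policy, upstream_order(pending_local, pending_downstream, policy))
```

The position of a response in the resulting list corresponds to the number of transfer slots it waits at this hub, so the same set of pending responses yields different individual latencies depending solely on which source the hub favors.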
There is a need for a system and method for arbitrating data transfers in a system memory having a memory hub architecture to lower the latency of the system memory.