Computer systems use memory devices, such as dynamic random access memory (“DRAM”) devices, to store data that are accessed by a processor. These memory devices are normally used as system memory in a computer system. In a typical computer system, the processor communicates with the system memory through a processor bus and a memory controller. The processor issues a memory request, which includes a memory command, such as a read command, and an address designating the location from which data or instructions are to be read. The memory controller uses the command and address to generate appropriate command signals as well as row and column addresses, which are applied to the system memory. In response to the commands and addresses, data are transferred between the system memory and the processor. The memory controller is often part of a system controller, which also includes bus bridge circuitry for coupling the processor bus to an expansion bus, such as a PCI bus.
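By way of illustration only, the translation of a memory request address into row and column addresses can be sketched as follows. The bit widths `ROW_BITS` and `COLUMN_BITS`, the `DramAddress` type, and the `decode_address` function are hypothetical and are not taken from any particular memory controller; an actual controller may also derive bank, rank, or channel selects and may interleave the bits differently.

```python
from typing import NamedTuple

# Hypothetical device geometry, assumed only for this sketch.
COLUMN_BITS = 10   # 1024 columns per row
ROW_BITS = 13      # 8192 rows


class DramAddress(NamedTuple):
    row: int
    column: int


def decode_address(address: int) -> DramAddress:
    """Split a flat request address into row and column addresses.

    Assumes the low-order bits select the column and the next
    higher-order bits select the row.
    """
    column = address & ((1 << COLUMN_BITS) - 1)
    row = (address >> COLUMN_BITS) & ((1 << ROW_BITS) - 1)
    return DramAddress(row=row, column=column)


if __name__ == "__main__":
    print(decode_address(0x12345))
```

Under these assumed widths, the address 0x12345 falls in row 72 at column 837.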
Although the operating speed of memory devices has continuously increased, this increase in operating speed has not kept pace with increases in the operating speed of processors. Even slower has been the increase in operating speed of memory controllers coupling processors to memory devices. The relatively slow speed of memory controllers and memory devices limits the data bandwidth between the processor and the memory devices.
In addition to the limited bandwidth between processors and memory devices, the performance of computer systems is also limited by latency problems that increase the time required to read data from system memory devices. More specifically, when a memory device read command is coupled to a system memory device, such as a synchronous DRAM (“SDRAM”) device, the read data are output from the SDRAM device only after a delay of several clock periods. Therefore, although SDRAM devices can synchronously output burst data at a high data rate, the delay in initially providing the data can significantly slow the operating speed of a computer system using such SDRAM devices.
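The effect of this initial delay can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, a read latency of three clock periods, a burst of eight data words, and one data word transferred per clock period; none of these figures comes from a specific device, and the function names are hypothetical.

```python
def read_time_cycles(latency_cycles: int, burst_length: int) -> int:
    """Clock periods from the read command to the last word of the burst.

    Assumes one data word is output per clock period once the burst starts.
    """
    return latency_cycles + burst_length


def data_bus_utilization(latency_cycles: int, burst_length: int) -> float:
    """Fraction of the read time during which data are actually transferred."""
    return burst_length / read_time_cycles(latency_cycles, burst_length)


if __name__ == "__main__":
    # Hypothetical figures: 3-cycle latency, 8-word burst.
    print(read_time_cycles(3, 8))
    print(round(data_bus_utilization(3, 8), 3))
```

With these assumed figures, an isolated read occupies 11 clock periods to deliver 8 words, so the data path is idle for roughly a quarter of the access. Shorter bursts make the latency a proportionally larger share of the total.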
One approach to alleviating the memory latency problem is to use multiple memory devices coupled to the processor through a memory hub. In a memory hub architecture, a system controller or memory controller is coupled over a high speed data link to several memory modules. Typically, the memory modules are coupled in a point-to-point or daisy chain architecture such that the memory modules are connected one to another in series. Thus, the memory controller is coupled to a first memory module over a first high speed data link, with the first memory module connected to a second memory module through a second high speed data link, and the second memory module coupled to a third memory module through a third high speed data link, and so on in a daisy chain fashion.
Each memory module includes a memory hub that is coupled to the corresponding high speed data links and a number of memory devices on the module, with the memory hubs efficiently routing memory requests and responses between the controller and the memory devices over the high speed data links. Computer systems employing this architecture can have a higher bandwidth because a processor can access one memory device while another memory device is responding to a prior memory access. For example, the processor can output write data to one of the memory devices in the system while another memory device in the system is preparing to provide read data to the processor. Moreover, this architecture also provides for easy expansion of the system memory without concern for degradation in signal quality as more memory modules are added, such as occurs in conventional multi-drop bus architectures.
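The daisy chain arrangement described above can be modeled, as a minimal sketch, by a linked series of modules in which each hub either services a request addressed to its own module or forwards it over the next link. The `MemoryModule` class, its fields, and the `build_chain` helper are hypothetical names introduced only for this illustration; the model ignores timing, responses traveling back upstream, and the individual memory devices on each module.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MemoryModule:
    module_id: int
    storage: Dict[int, int] = field(default_factory=dict)
    downstream: Optional["MemoryModule"] = None

    def handle_read(self, target_id: int, address: int, links: int = 1) -> Tuple[int, int]:
        """Return (data, number of links traversed to reach the target).

        The hub services the request if it is addressed to this module;
        otherwise it forwards the request over the next link in the chain.
        """
        if target_id == self.module_id:
            return self.storage.get(address, 0), links
        if self.downstream is None:
            raise LookupError(f"no module with id {target_id} in the chain")
        return self.downstream.handle_read(target_id, address, links + 1)


def build_chain(count: int) -> List[MemoryModule]:
    """Connect `count` modules in series; the controller attaches to the first."""
    modules = [MemoryModule(module_id=i) for i in range(count)]
    for upstream, downstream in zip(modules, modules[1:]):
        upstream.downstream = downstream
    return modules


if __name__ == "__main__":
    chain = build_chain(3)
    chain[2].storage[0x10] = 99
    print(chain[0].handle_read(target_id=2, address=0x10))
```

In this model, a request for the third module crosses three links: controller to first module, first to second, and second to third.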
Although computer systems using memory hubs may provide superior performance, they often fail to operate at optimum speed for a variety of reasons. For example, even though memory hubs can provide computer systems with greater memory bandwidth, they still suffer from latency problems of the type described above. More specifically, although the processor may communicate with one memory device while another memory device is preparing to transfer data, it is sometimes necessary to receive data from one memory device before the data from another memory device can be used. In that event, the intervention of the processor continues to slow the operating speed of such computer systems. Another reason such computer systems fail to operate at optimum speed is that conventional memory hubs are essentially single channel systems, since all control, address and data signals must pass through common memory hub circuitry. As a result, when the memory hub circuitry is busy communicating with one memory device, it is not free to communicate with another memory device.
One technique that has been used in computer systems to overcome the issues with processor intervention in moving data to and from memory, as well as the single channel bottleneck, is the use of direct memory access (“DMA”) operations. DMA operations are implemented through DMA controllers included in the computer system, which enable data to be moved into and out of memory without the intervention of the system processor. Such DMA operations and DMA controllers are well known in the art, and are often implemented in conventional computer systems. The DMA controller manages the required data transfers into and out of the system memory, removing the need for the processor to be involved. For example, when a DMA-supported entity transfers data to the system memory, the DMA controller obtains control of the bus and coordinates the transfer of the data from the DMA-supported entity to the system memory, without involvement by the processor. In this manner, latency issues resulting from processor intervention can be avoided during data transfers across the system bus. However, in many instances, even after data have been transferred to the system memory through a DMA operation, the processor nevertheless must move blocks of the data from one location to another within the system memory. For example, the operating system will direct a DMA operation to transfer data from a mass storage device into the system memory, only to have the processor then move the data again to another location in memory so the data can be used. As a result, the value of DMA operations is diminished to some degree, because the processor ultimately becomes involved in moving data around in memory despite the use of a DMA operation in the data transfer to and from the system memory.
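The shortcoming described above can be illustrated with a simple model that counts how many bytes the processor itself must handle. The `SystemMemoryModel` class and its `dma_write` and `cpu_move` methods are hypothetical and stand in for hardware behavior only; they do not represent any real DMA controller interface.

```python
class SystemMemoryModel:
    """Toy model of system memory that tracks processor-handled bytes."""

    def __init__(self, size: int) -> None:
        self.cells = bytearray(size)
        self.cpu_bytes_moved = 0  # bytes copied by the processor itself

    def _check(self, start: int, length: int) -> None:
        if start < 0 or length < 0 or start + length > len(self.cells):
            raise IndexError("transfer falls outside system memory")

    def dma_write(self, dest: int, payload: bytes) -> None:
        """Transfer into memory by the DMA controller; the processor is idle."""
        self._check(dest, len(payload))
        self.cells[dest:dest + len(payload)] = payload

    def cpu_move(self, src: int, dest: int, length: int) -> None:
        """Memory-to-memory move performed by the processor."""
        self._check(src, length)
        self._check(dest, length)
        self.cells[dest:dest + length] = self.cells[src:src + length]
        self.cpu_bytes_moved += length


if __name__ == "__main__":
    memory = SystemMemoryModel(64)
    memory.dma_write(0, b"disk")   # e.g. data arriving from mass storage
    memory.cpu_move(0, 32, 4)      # processor relocates the block for use
    print(memory.cpu_bytes_moved)
```

In this model the DMA transfer costs the processor nothing, but the subsequent relocation makes the processor handle every byte of the block, which is the residual involvement noted above.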
Therefore, there is a need for a computer architecture that provides the advantages of a memory hub architecture and also minimizes the latency problems common in such systems.