Many previous computers or computer systems employ a memory controller (MC) principally for receiving requests for memory and outputting signals, such as address, row address strobe (RAS), read/write (R/W) and similar signals, for controlling a dynamic random access memory (DRAM) or similar memory to read from or write to the memory. In many such systems, there is, effectively, a single memory requester, such as the microprocessor, to which the memory controller responds.
Among the many factors that affect system performance is the effective bandwidth available for communicating between system components. In a typical previous system, the requester and memory controller (as well as the memory itself) were provided as different components (on separate "chips") and communicated via one or more buses. Accordingly, the effective bandwidth available for sending requests from a requester to the memory controller and for sending control or other signals from the memory controller to the memory represented a potential limitation on overall system performance. In relatively older and slower systems, the requester/memory-controller/memory bandwidth has not typically been the limiting factor on system performance. However, as performance for even moderately priced computer systems has risen to include systems with clock speeds exceeding, e.g., 100 or 200 megahertz, the requester/memory-controller/memory bandwidth has more frequently become a concern. Accordingly, it would be useful to provide a configuration which effectively increases the bandwidth of the communications between the requester and the memory controller and/or between the memory controller and memory.
Recently, circuit integration has improved to the point that sufficient gate density is available to integrate multiple components on one integrated circuit (IC) or "chip", providing a so-called "system on a chip". Such system level integration (SLI) avoids at least some bus bandwidth issues, but increases the significance of bandwidth issues for the remaining buses, including, for example, memory controller/memory bandwidth issues. Accordingly, it would be useful to provide a system which takes advantage of the opportunities presented by high gate-density SLI systems, preferably in such a way as to address memory controller/memory bandwidth issues.
Although a certain linear increase in bandwidth can be achieved by increasing bus width (particularly in light of the increased pin count contemplated for "system on chip" devices), a mere increase in bus width, by itself, provides only limited increases in bandwidth and may be insufficient to keep up with bandwidth increases in other parts of the system, such as those arising from the use of SLI techniques. Accordingly, it would be useful to provide a system which achieves an effective increase in, for example, memory-controller/memory bandwidth, which takes advantage of and exceeds the bandwidth increases available from increases in bus width.
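The linear relationship between bus width and bandwidth noted above can be illustrated with a short calculation. The following is a minimal sketch only: the function name, the 32-bit and 64-bit widths, and the one-transfer-per-clock default are assumptions chosen for illustration, and the result is an idealized peak figure that ignores row/column strobe overhead, refresh and bus turnaround.

```python
def peak_bandwidth_bytes_per_s(bus_width_bits, clock_hz, transfers_per_clock=1):
    """Idealized peak bus bandwidth in bytes per second.

    Hypothetical helper for illustration; real buses deliver less
    because of protocol overhead.
    """
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * clock_hz * transfers_per_clock


# Assumed example figures: a 100 MHz bus at two different widths.
narrow = peak_bandwidth_bytes_per_s(32, 100_000_000)
wide = peak_bandwidth_bytes_per_s(64, 100_000_000)

# Doubling the width doubles the peak figure, and no more than that.
print(narrow, wide, wide / narrow)
```

Under these assumptions, widening the bus scales the peak figure in direct proportion to the width, which is why width alone may not keep pace with faster growth elsewhere in the system.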
Some previous computer systems have involved the use of more than one memory requester. One example is a multiprocessor system in which two or more processors (and/or other devices) may each generate memory requests. In at least some previous multiprocessor systems, all requests from the various memory requesters were directed to a single main memory array, and a memory controller would output a single memory stream to the main memory array. It is believed that outputting of multiple memory streams, e.g., in response to multiple requesters, has not been previously achieved, perhaps because the degree of complexity involved can make implementation infeasible, particularly when a memory controller is on a separate integrated circuit (such that requester/memory-controller communications are performed via a bus). Accordingly, it would be useful to provide a way of reducing the external memory bottleneck by providing multiple memory streams in a substantially parallel fashion, preferably without introducing an unacceptable level of complexity.
Multiple-requester systems can also introduce data coherency issues. For example, if one CPU is reading data from a given memory location at about the same time that another CPU is attempting to write data to that location, there is concern whether the first CPU will receive the most recent or valid data. There is, however, a performance "cost" in providing such coherency, and that cost typically applies to the entire memory system. Accordingly, it would be useful to provide a multi-stream memory system which can provide needed coherency while reducing the average performance cost of the coherency scheme, compared to certain previous coherency schemes.
Some coherency schemes involve delaying a read request if there is a pending write for the same data, until after the write is complete. While it is clear that such a scheme can achieve or contribute to coherency, the scheme will increase delay or latency in the system, in at least some situations. Accordingly, it would be useful to provide a multi-stream coherent memory system which can reduce or eliminate delay or latency as compared to the delay or latency imposed by previous coherency systems.
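The delay-based coherency scheme described above can be sketched as a toy model. This is an illustrative sketch under stated assumptions, not a description of any particular previous system: the class name `DelayingController`, its methods, and the one-step-per-retired-write cost model are all hypothetical.

```python
from collections import deque


class DelayingController:
    """Toy model (hypothetical): a read is delayed while any write
    to the same address is still pending."""

    def __init__(self):
        self.memory = {}               # address -> value
        self.pending_writes = deque()  # (address, value), oldest first

    def post_write(self, address, value):
        """Queue a write; it is not yet visible in memory."""
        self.pending_writes.append((address, value))

    def read(self, address):
        """Return (value, stall_steps).

        Pending writes are retired in order until none targets the
        requested address; each retired write counts as one stall step.
        """
        stall_steps = 0
        while any(a == address for a, _ in self.pending_writes):
            a, v = self.pending_writes.popleft()
            self.memory[a] = v
            stall_steps += 1
        return self.memory.get(address), stall_steps


mc = DelayingController()
mc.memory[0x10] = 1
mc.post_write(0x20, 5)
mc.post_write(0x10, 2)
value, stalled = mc.read(0x10)  # waits for both queued writes to retire
```

In this model the read returns the newly written value, so coherency is preserved, but only at the cost of the stall steps spent waiting, which is the latency penalty discussed above.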