1. Field of the Invention
The invention relates to memory controllers used in computer systems, and more particularly to memory controllers providing write posting capabilities from multiple buses.
2. Description of the Related Art
Computer systems are becoming ever more powerful by the day. Users are requiring more capabilities to run ever more complicated and sophisticated applications and computer system manufacturers are responding. Computer speeds have dramatically increased over the last number of years so that now desktop and file server computers can readily outperform mainframe computers of 10-15 years ago. But the quest for further performance is never ending. To this end, the microprocessor manufacturers have been developing ever faster microprocessors.
However, a computer system is far more than just a microprocessor. There are many other subsystems that must cooperate with the microprocessor to provide a complete computer system. It is desirable to optimize as many of these subsystems as possible and yet take into account cost and system flexibility to satisfy varied user desires.
Two of the subsystems which have not kept pace with the development of microprocessors are the main memory systems and the input/output buses. Main memory system shortcomings have been much alleviated by the use of cache memory systems, but in the end all memory operations must ultimately be satisfied by the main memory, so that its performance is still a key piece in the overall performance of the computer system. Many advanced memory architectures and techniques have been developed over the years. One of the most common techniques is the use of page mode memory devices or DRAMs, where the actual memory address location value is divided into rows and columns, and if the row address, i.e., the page, is the same for the subsequent operation, only column addresses need to be provided to the DRAM. Although there is a certain amount of overhead required, it easily pays for itself by the improved performance gained during a page hit. So basic page mode operation provides a major performance increase, but more performance is always desired.
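The row and column split described above can be illustrated with a minimal sketch. It is not taken from any particular memory controller: the 10-bit column width and the names COLUMN_BITS, split_address and PageModeTracker are assumptions chosen only for illustration.

```python
# Illustrative sketch only. The 10-bit column width is an assumption;
# real DRAM devices use various row/column widths.
COLUMN_BITS = 10


def split_address(addr):
    """Split a memory address into (row, column). The row is the 'page'."""
    row = addr >> COLUMN_BITS
    col = addr & ((1 << COLUMN_BITS) - 1)
    return row, col


class PageModeTracker:
    """Remembers the last row address and reports page hits."""

    def __init__(self):
        self.open_row = None  # no row has been provided yet

    def access(self, addr):
        """Return True on a page hit (only a column address is needed)."""
        row, _col = split_address(addr)
        hit = row == self.open_row
        self.open_row = row  # a miss supplies a new row address
        return hit
```

In this sketch, two consecutive accesses that share the upper address bits report a page hit on the second access, while an access to a different row reports a miss and replaces the remembered row.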
One further performance increase relates to an improvement for determining the level of the row address strobe or RAS* signal when the memory system is idle. As is well known, the RAS* signal must be negated or set high to allow a new page or row address to be provided and there is also a precharge time requirement. Thus, there is a performance penalty if the RAS* signal is raised when the next operation is actually a page hit. Similarly, there is a delay if the RAS* signal is kept low and the operation is a page miss, as the full precharge time must also be expended after the cycle has been issued. To address this concern, various techniques have been developed to predict whether the RAS* signal should be kept low or should be returned high to indicate a new page cycle. The prediction can be done several ways, as indicated in U.S. File Wrapper Continuation Application Ser. No. 08/544,109, filed Nov. 17, 1995 (now U.S. Pat. No. 5,651,130), which depends from a parent application Ser. No. 08/034,104 filed Mar. 22, 1993 (now abandoned), entitled "Memory Controller That Dynamically Predicts Page Misses." In that application several techniques are used. A first, simple technique bases the prediction on the type of the last cycle performed by the processor, with the choice always fixed. A second, more sophisticated technique samples the hits and misses for each cycle type and then sets the RAS* level based on this adaptive measurement. But these techniques have been based on processor cycles and have not taken I/O bus cycles into account. Therefore, I/O bus master operations have still performed at lesser levels.
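The adaptive idea of sampling hits and misses per cycle type can be sketched as follows. This is a hypothetical illustration, not the circuit of the cited application: the saturating counter, its range, the class name RasPredictor and the cycle type labels are all assumptions.

```python
class RasPredictor:
    """Per-cycle-type saturating counter (an assumed mechanism, chosen
    only to illustrate adaptive prediction from sampled hits and misses)."""

    def __init__(self, max_count=3):
        self.max_count = max_count
        self.threshold = max_count // 2
        self.counters = {}  # cycle type -> current counter value

    def _get(self, cycle_type):
        # Unseen cycle types start just above the threshold (predict hit).
        return self.counters.get(cycle_type, self.threshold + 1)

    def keep_ras_low(self, cycle_type):
        """Predict a page hit: True means leave RAS* asserted (low)."""
        return self._get(cycle_type) > self.threshold

    def record(self, cycle_type, was_hit):
        """Sample the outcome of the cycle that followed the prediction."""
        count = self._get(cycle_type)
        if was_hit:
            count = min(count + 1, self.max_count)
        else:
            count = max(count - 1, 0)
        self.counters[cycle_type] = count
```

With the default settings of this sketch, a single recorded miss after a given cycle type flips the prediction for that type to "raise RAS*", and a subsequent recorded hit flips it back, while other cycle types are unaffected.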
One high performance I/O bus is the PCI or Peripheral Component Interconnect bus developed by INTEL® Corp. and accepted by many computer manufacturers. PCI is a high performance bus and allows numerous bus masters to be present. The bus masters are essentially local processors which perform specific duties, not general processing duties. By having these bus masters, the main processor is able to offload various specialized processing tasks, so that more tasks can be performed in parallel, thereby increasing the performance of the computer system. This is but one example of how parallelism is being used in current computer systems.
It is desirable to have as many operations running in parallel or concurrently as possible to allow increased overall performance. One way this concurrent operation has been done in the past is by the use of write posting, where a single cycle from the processor is latched into a posting buffer and ready is returned to the processor prior to the write cycle actually being completed to the memory or I/O device. The entire data and address values are posted in a latch and then the cycle executed on the target bus when possible. However, write posting has been kept at a very simple level, such as one level per bus, because of complications in memory coherency and cycle ordering which result if deeper posting were to be performed. Therefore, it can be seen that there are numerous gains that could be obtained if one were able to write post more than a single operation to a given bus, if the complications could be simply solved.
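The simple, one-level write posting described above can be modeled with a short sketch. It is a behavioral illustration under stated assumptions, not hardware from any cited design: the class name PostingBuffer, its methods, and the use of a dictionary to stand in for the target memory or I/O device are all hypothetical.

```python
class PostingBuffer:
    """One-deep write posting latch (behavioral model, names assumed)."""

    def __init__(self):
        self.entry = None  # holds (address, data) while a write is posted

    def post(self, address, data):
        """Latch a write. Return True if ready can be returned at once,
        False if the latch is full and the source must wait."""
        if self.entry is not None:
            return False
        self.entry = (address, data)
        return True

    def drain(self, target):
        """Execute the posted cycle on the target when it is available.
        Return True if a write was performed."""
        if self.entry is None:
            return False
        address, data = self.entry
        target[address] = data  # the dict stands in for memory or I/O
        self.entry = None
        return True
```

In this model the first write is acknowledged before it reaches the target, and a second write is refused until the first has drained, which is the one-level limitation the paragraph above describes.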
The PCI bus provides opportunities to increase overall system performance, particularly that of the memory system. One of the read operations defined for the PCI bus is termed a Memory Read Multiple cycle, which is used to indicate a desire to read a number of cache or memory lines, not just a single line. As noted in the PCI bus cycle definition, decoding this cycle provides an opportunity for the memory controller to start doing read aheads or pipelining so that the data can be obtained prior to actually being required on the PCI bus. With this data then obtained, the memory controller can allow access by the processor, thus further increasing overall system concurrency. However, it is also common for PCI bus masters to abort cycles prior to their completion, and if such an aborted cycle were to occur shortly after a Memory Read Multiple cycle has commenced, then a read ahead operation would have been started and would conventionally complete, only to have the data then immediately discarded. This would reduce overall system performance because of the wasted operations needed to start and complete the full read ahead operation. Therefore, it would be desirable to reduce the wasted time when doing read aheads during Memory Read Multiple cycles when the cycle is aborted early by the bus master on the PCI bus.
Further, personal computer systems are becoming mass market products, and therefore need to be very flexible to meet the widely varying particular goals of users. For example, some users may desire the ultimate in performance with little regard for cost, whereas other users may be significantly more cost sensitive. One area where cost directly impacts performance is in the speed of the memory devices used in the main memory. Another area of impact is the economies of scale which could be obtained by using a single memory controller chip for many different microprocessor configurations and speeds. But using a single memory controller usually involves performance tradeoffs. The memory controller as disclosed in U.S. Pat. No. 5,333,293 addressed the multiple speed processor point, but could use only a single speed of memory devices, thus limiting user options and performance tradeoffs. In another memory controller as disclosed in Ser. No. 08/034,290 filed Mar. 22, 1993, the memory controller can handle different speed memory devices on a bank-by-bank or module-by-module basis, and yet allows optimal timing for each particular memory device. However, this memory controller was designed to be used with a single processor operating at a single speed, thus providing user flexibility but not economy of scale. It would be more desirable to allow numerous types and speeds of processors to be utilized with a single memory controller, and yet allow use of numerous types and speeds of memory devices without requiring great complexity.