The present invention relates generally to a method and apparatus for monitoring the response times of computer system components for the purpose of improving computer system reliability. The invention has particular utility in monitoring the data retrieval response time of memory circuits to enable the identification of memory circuits whose data retrieval response times are drifting away from a desired response time.
As computers are increasingly used in critical applications, their reliability becomes correspondingly more important. One approach to improving computer system reliability is to increase the reliability of the individual components of the system. However, this approach is not always possible or economical. What is needed is an approach that economically improves the overall reliability of a system without requiring the use of improved individual components.
The response time of an electronic component is the time required for the component to respond to a request, or command, to perform a task. The component may provide a dedicated signal, or some other direct or indirect indication, that the task has been completed. Any such indication is referred to herein as a response ready signal or a ready signal. The response time is also known as the latency, and the two terms are used interchangeably herein. In the context of memory circuits, the response time, or latency, of the memory circuit is the period from the time the memory circuit is commanded to retrieve stored data until the time the memory circuit signals that the data is available. For example, the response time may be measured from the time a data read command is asserted by a controller until the time a data strobe, or any other signal or combination of signals indicating that the data is available, is issued.
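By way of illustration only, the measurement described above can be sketched in Python as the difference between two timestamps, expressed both as a raw interval and as whole clock cycles plus a fractional remainder. The function names, the picosecond units, and the 5000 ps clock period are assumptions made for this sketch and are not taken from any memory circuit specification.

```python
from typing import Tuple


def response_time_ps(command_time_ps: int, ready_time_ps: int) -> int:
    """Latency from read command assertion to the ready signal (hypothetical helper)."""
    if ready_time_ps < command_time_ps:
        raise ValueError("ready signal cannot precede the command")
    return ready_time_ps - command_time_ps


def to_clock_cycles(latency_ps: int, clock_period_ps: int) -> Tuple[int, int]:
    """Split a latency into whole clock cycles and the leftover part of a cycle."""
    if clock_period_ps <= 0:
        raise ValueError("clock period must be positive")
    return divmod(latency_ps, clock_period_ps)


# Assumed example values: command at 1000 ps, data strobe at 23500 ps, 5000 ps clock.
latency = response_time_ps(1000, 23500)
whole_cycles, remainder_ps = to_clock_cycles(latency, 5000)
```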
Advanced memory circuits include provisions for adjusting their response times. For example, the response time of SDRAM (synchronous dynamic random access memory) circuits may be adjusted in whole clock cycle increments. A new type of DRAM circuit currently under development, the SLDRAM, can have its response time adjusted both by whole clock cycles and by a portion of a clock cycle. One task suggested for a controller of SLDRAM memory circuits is to measure the response time of every SLDRAM memory circuit in a system and to identify the slowest. The response time of each SLDRAM memory circuit would then be programmed to match the slowest measured response time, so that the response time is the same no matter which memory circuit is performing a data retrieval operation. This process is referred to as calibration. Detailed information pertaining to calibration of SLDRAMs is contained in the specification for the particular SLDRAM memory circuit being calibrated. One such example is the 4 M×18 SLDRAM specification, CONS400.P65, Rev. Sep. 22, 1997, the contents of which are hereby incorporated by reference. The actual method for calibrating the SLDRAM memory circuits is beyond the scope of this invention.
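The equalization step described above can be illustrated, purely as a sketch, by computing the additional delay each memory circuit would need in order to match the slowest measured response time. This is not the calibration procedure of any SLDRAM specification; the function name, the circuit labels, and the picosecond values are hypothetical.

```python
from typing import Dict


def equalization_delays(measured_ps: Dict[str, int]) -> Dict[str, int]:
    """Return the extra delay per circuit needed to match the slowest response time."""
    if not measured_ps:
        raise ValueError("at least one measured response time is required")
    slowest = max(measured_ps.values())
    # Each circuit is slowed by the gap between its own latency and the slowest one.
    return {name: slowest - latency for name, latency in measured_ps.items()}


# Assumed measurements for three hypothetical memory circuits.
delays = equalization_delays({"mem0": 22000, "mem1": 22500, "mem2": 21000})
```

In this sketch the slowest circuit receives no added delay, and every other circuit receives exactly the difference between its measured latency and the slowest one.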
Once all the SLDRAM component response times have been initially adjusted, or calibrated, the response times must be monitored to detect changes. Changes in response time are referred to as response time drifts or latency drifts. Latency drifts may be caused by many factors, including environmental conditions such as temperature and power supply fluctuations. Another possible cause for a change in the response time of an SLDRAM component is the onset of a failure. What is needed, therefore, is a controller that can monitor the response times of components, such as memory circuits, both to identify components that exhibit latency drift indications pointing to an approaching failure and to signal the need for recalibration when the components include programmable response delay capabilities.
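As a minimal sketch of the monitoring need stated above, a controller could compare each newly measured response time against the calibrated value and classify the drift. The two thresholds, the status labels, and the idea that a larger drift points to a possible failure are assumptions made for illustration only; the text above does not specify how the two cases are distinguished.

```python
def classify_drift(
    calibrated_ps: int,
    measured_ps: int,
    recalibrate_threshold_ps: int,
    failure_threshold_ps: int,
) -> str:
    """Classify latency drift relative to the calibrated response time (hypothetical policy)."""
    if not 0 <= recalibrate_threshold_ps <= failure_threshold_ps:
        raise ValueError("thresholds must satisfy 0 <= recalibrate <= failure")
    drift = abs(measured_ps - calibrated_ps)
    if drift >= failure_threshold_ps:
        return "possible_failure"  # large drift: flag the component for attention
    if drift >= recalibrate_threshold_ps:
        return "recalibrate"  # moderate drift: signal the need for recalibration
    return "ok"


# Assumed values: calibrated at 22500 ps, thresholds of 300 ps and 1000 ps.
status = classify_drift(22500, 22900, 300, 1000)
```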