In computer systems that need increased data storage capacity, such as search engine servers, or in systems dedicated to storage-intensive tasks such as video or audio editing, stock market exchange servers, or air traffic control systems, machines may be deployed with many gigabytes of main memory storage. Such systems may also include features that help them stay operational for long periods of time without crashing, and that detect, tolerate, or recover from faults and/or memory failures. Such features are often referred to as Reliability, Availability and Serviceability (RAS) features.
Often, such systems make use of improved main memory storage designs that incorporate industry-standard DIMMs. DIMM stands for Dual In-line Memory Module, a module that typically has a 64-bit data path accessed via an internal 64-bit memory bus. A DIMM comprises a series of random access memory (RAM) integrated circuits (ICs) mounted on a printed circuit board. One type of DIMM, known as a fully buffered DIMM (FB-DIMM), also has a device called an Advanced Memory Buffer (AMB). FB-DIMMs can be connected via high speed serial interfaces to a Memory Controller Hub (MCH). The AMB communicates with the MCH via the high speed serial interfaces and with the RAM ICs on the DIMM via the internal memory bus. The AMB reads from and writes to the RAM as instructed by the MCH and can also be used to configure the FB-DIMM.
Typically, when DIMMs are initialized in main memory storage systems, testing is performed to detect any errors. If errors are detected in a particular DIMM, that DIMM may be dynamically disabled. One drawback to such a scheme is that the detection of an error in one particular bank of memory on a DIMM may require disabling the entire DIMM, which can have a capacity of gigabytes of data and can cost thousands of dollars. The DIMM may also represent a significant fraction of the storage capacity of the entire main memory storage system.
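The capacity cost of disabling at whole-DIMM granularity can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the `Dimm` class, the bank and word counts, the stuck-at-zero fault model, and the two test patterns are hypothetical stand-ins chosen for illustration, and none of them is taken from any particular MCH, AMB, or firmware implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Dimm:
    """Toy model of one DIMM: a few banks, each holding a few 64-bit words."""
    slot: int
    banks: int = 4                # hypothetical bank count
    words_per_bank: int = 8       # hypothetical, tiny on purpose
    # (bank, word) cells that always read back 0 (assumed fault model)
    stuck_at_zero: set = field(default_factory=set)
    enabled: bool = True
    cells: dict = field(default_factory=dict)

    def write(self, bank, word, value):
        faulty = (bank, word) in self.stuck_at_zero
        self.cells[(bank, word)] = 0 if faulty else value

    def read(self, bank, word):
        return self.cells.get((bank, word), 0)

    @property
    def capacity_words(self):
        return self.banks * self.words_per_bank


# Alternating-bit patterns; an illustrative choice, not a prescribed test.
PATTERNS = (0xAAAAAAAAAAAAAAAA, 0x5555555555555555)


def failing_banks(dimm):
    """Write each pattern to every word, read it back, collect bad banks."""
    bad = set()
    for bank in range(dimm.banks):
        for word in range(dimm.words_per_bank):
            for pattern in PATTERNS:
                dimm.write(bank, word, pattern)
                if dimm.read(bank, word) != pattern:
                    bad.add(bank)
    return bad


def initialize(dimms):
    """Whole-DIMM policy: any failing bank disables the entire module."""
    report = {}
    for dimm in dimms:
        bad = failing_banks(dimm)
        report[dimm.slot] = bad
        if bad:
            dimm.enabled = False
    return report


def usable_words(dimms):
    return sum(d.capacity_words for d in dimms if d.enabled)


# Four modules; slot 1 has a single faulty cell in bank 2.
dimms = [Dimm(slot=0), Dimm(slot=1, stuck_at_zero={(2, 5)}),
         Dimm(slot=2), Dimm(slot=3)]
report = initialize(dimms)
total = sum(d.capacity_words for d in dimms)
print("failing banks per slot:", report)
print("usable words:", usable_words(dimms), "of", total)
```

In this toy configuration a single faulty cell in one bank of one module removes that module's full 32 words from service, a quarter of the modeled capacity, even though only the 8 words of the affected bank are suspect.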
It would be desirable to utilize programmable features of an MCH and/or an AMB to alleviate such drawbacks and to improve the dynamic handling of memory failures and the utilization of memories that incorporate industry-standard DIMMs. To date, the advantages of such programmable features of the MCH and/or the AMB have not been fully utilized.