As the value and use of information continue to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes, thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
One common type of information handling system is the network server. A typical server includes one or more processors that execute program instructions, one or more memory modules that store program instructions and data, and a chipset with a memory manager that controls how the processors interact with the memory modules. For example, the INTEL 460GX chipset groups four dual inline memory modules (DIMMs) into a row. Thus, if a user has installed sixteen DIMMs that each have a memory capacity of 1 gigabyte (GB), the chipset creates four rows of memory, and each row contains 4 GB. The storage in the memory rows that the memory manager makes available to the processors is known as the “physical address space.” Typically, the physical address space is described in a memory address map.
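The row arithmetic in the example above can be illustrated with a short sketch. It is not drawn from any chipset documentation: the names `Row` and `build_address_map` are hypothetical, and the contiguous, hole-free layout of rows in the physical address space is an assumption made only for illustration. Only the grouping of four DIMMs per row and the sixteen 1 GB DIMMs come from the text.

```python
from dataclasses import dataclass

GB = 1 << 30
DIMMS_PER_ROW = 4  # grouping described in the text for the example chipset


@dataclass
class Row:
    index: int  # row number
    base: int   # first physical address covered by the row (assumed contiguous)
    size: int   # row capacity in bytes


def build_address_map(dimm_sizes):
    """Group DIMMs into rows of four and lay the rows out back to back."""
    if len(dimm_sizes) % DIMMS_PER_ROW != 0:
        raise ValueError("DIMM count must be a multiple of four")
    rows = []
    base = 0
    for i in range(0, len(dimm_sizes), DIMMS_PER_ROW):
        size = sum(dimm_sizes[i:i + DIMMS_PER_ROW])
        rows.append(Row(index=i // DIMMS_PER_ROW, base=base, size=size))
        base += size
    return rows


# Sixteen 1 GB DIMMs, as in the example above.
rows = build_address_map([GB] * 16)
for row in rows:
    print(f"row {row.index}: base={row.base:#x} size={row.size // GB} GB")
```

Under these assumptions the sixteen DIMMs yield four rows of 4 GB each, and the list of rows plays the role of a simplified memory address map.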
A typical server also includes many different layers of software, including, at a very low level, a basic input and output system (BIOS). The BIOS generally handles such tasks as testing the hardware at startup and providing a software interface to certain hardware components during normal operations. At a slightly higher level is the operating system (OS), which provides basic services that allow high-level applications (e.g., web server applications, database engines, etc.) to utilize the hardware components.
In order to protect against memory failures, some chipsets provide what is known as a “spare row feature.” Such a system reserves one row of memory for use in case of a malfunction in one of the non-reserved (or “active”) rows. Specifically, the chipset reserves the spare row by not mapping the memory modules in the reserved row into the physical address space. Therefore, the reserved row is not seen by the operating system. When an error is detected, the BIOS causes the chipset to copy the contents of the failing row to the reserved row and then activates the reserved row by mapping it into the physical address space in place of the failing row.
This process, known as “swapping in the spare row,” happens very quickly, and it does not interrupt the operating system or cause the server to reboot. In fact, the spare row recovery procedure is basically invisible to the operating system. For instance, swapping in the spare row does not affect the memory addresses used by the operating system, and the operating system does not participate in the process of swapping in the spare row.
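The swap described above can be modeled with a small simulation. This is a sketch under stated assumptions, not the behavior of any particular chipset or BIOS: the class `SpareRowMemory` and its methods are hypothetical, the last row is arbitrarily chosen as the spare, and rows are assumed to be mapped contiguously. It shows only the two steps named in the text (copy the failing row, then remap) and that addresses seen by the operating system do not change.

```python
class SpareRowMemory:
    """Toy model of a memory manager with one reserved (spare) row."""

    def __init__(self, num_rows, row_size):
        self.row_size = row_size
        # Physical storage exists for every row, including the spare.
        self.storage = [bytearray(row_size) for _ in range(num_rows)]
        # Assumption: the last row is the reserved one.
        self.spare = num_rows - 1
        # Address-space slot -> physical row. The spare is left unmapped,
        # so it is not part of the physical address space.
        self.mapping = list(range(num_rows - 1))

    @property
    def address_space_size(self):
        return len(self.mapping) * self.row_size

    def _locate(self, addr):
        if not 0 <= addr < self.address_space_size:
            raise IndexError("address outside the physical address space")
        slot, offset = divmod(addr, self.row_size)
        return self.storage[self.mapping[slot]], offset

    def read(self, addr):
        row, offset = self._locate(addr)
        return row[offset]

    def write(self, addr, value):
        row, offset = self._locate(addr)
        row[offset] = value

    def swap_in_spare(self, failing_row):
        """Copy the failing row to the spare, then map the spare in its place."""
        if self.spare is None:
            raise RuntimeError("spare row already in use")
        slot = self.mapping.index(failing_row)
        self.storage[self.spare][:] = self.storage[failing_row]
        self.mapping[slot] = self.spare
        self.spare = None


mem = SpareRowMemory(num_rows=4, row_size=16)
mem.write(20, 0xAB)            # address 20 falls in row 1
size_before = mem.address_space_size
mem.swap_in_spare(failing_row=1)
assert mem.read(20) == 0xAB    # same address, same data after the swap
assert mem.address_space_size == size_before
```

In this model the caller of `read` and `write` stands in for the operating system: it uses the same addresses before and after the swap and takes no part in the copy or the remapping.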
The spare row feature thus provides a convenient way to recover from memory errors in network servers and other systems that require a high degree of dependability. However, as recognized by the present invention, the spare row feature also presents a number of disadvantages.