1. Field of the Invention
The present invention generally relates to computer systems, and more specifically to a method of upgrading or servicing computer components, particularly system memory (RAM), without powering down the computer system or otherwise interrupting service.
2. Description of Related Art
Modern computing systems are often constructed from a number of processing units and a main memory, connected by a generalized interconnect. The basic structure of a conventional multi-processor computer system 10 is shown in FIG. 1. Computer system 10 has several processing units 12a, 12b, and 12c which are connected to various peripheral, or input/output (I/O) devices 14 (such as a display monitor, keyboard, and permanent storage device), memory device 16 (random-access memory or RAM) that is used by the processing units to carry out program instructions, and firmware 18 (read only memory) whose primary purpose is to seek out and load an operating system from one of the peripherals (usually the permanent storage device) whenever the computer is first turned on.
Processing units 12a-12c communicate with the peripheral devices, memory and firmware by various means, including a bus 20. Computer system 10 may have many additional components which are not shown, such as serial and parallel ports for connection to, e.g., modems or printers. Those skilled in the art will further appreciate that there are other components that might be used in conjunction with those shown in the block diagram of FIG. 1; for example, a display adapter might be used to control a video-display monitor, a memory controller can be used to access memory 16, etc. The computer can also have more than three processing units. In a symmetric multi-processor (SMP) computer, all of the processing units 12a-12c are generally identical, that is, they all use a common set or subset of instructions and protocols to operate, and generally have the same architecture.
Conventional computer systems often allow the user to add various components after delivery from the factory. For peripheral devices, this can be accomplished using an "expansion" bus, such as the Industry Standard Architecture (ISA) bus or the Peripheral Component Interconnect (PCI) bus. Another component that is commonly added by the user is main memory (16). This memory is often made up of a plurality of memory modules that can be added or removed as desired. The memory modules usually have memory chips in dual in-line packages, mounted on a single circuit board, and so are referred to as dual in-line memory modules (DIMMs).
DIMMs can be added to upgrade a system's memory, or to replace older modules that have become defective. Each DIMM has an edge with a plurality of contacts or pins (e.g., 72 pins), adapted to mate with an edge connector (socket or slot) mounted on a memory card or on the primary circuit board ("motherboard") of the computer system. The slot connectors for the memory modules are often arranged in two or more banks on the memory card. DIMMs are available in different sizes, not only with respect to physical size, but also with respect to the amount of memory that they provide. For example, DIMMs used with personal computers (PCs) often come in sizes of 8 megabytes, 16 megabytes, 32 megabytes, 64 megabytes and 128 megabytes.
When a user desires to upgrade or service system memory, the computer must generally be powered down prior to addition or replacement of DIMMs. After the maintenance is performed, the computer is re-started, and the basic input-output system (BIOS) residing in the firmware tests the memory and makes it available to the operating system, which is thereafter loaded by the firmware.
Many computer systems (particularly large servers used in a client-server network) may have hundreds of users connected to them, and the down time required to perform a memory upgrade or service can be extremely expensive. Also, in systems which are used in mission-critical applications, it is highly desirable to be able to perform a memory upgrade or service operation without interrupting system operation.
One solution for uninterrupted performance during a memory upgrade or service is to provide two essentially identical memory cards, and mirror the memory on each card. This construction offers other advantages as well: data can be read from either card, so two memory reads can be processed simultaneously (memory writes, however, must be performed on both cards). When a memory service or upgrade is required, the subject memory card can be quiesced and powered off. This card can then be removed, upgraded and/or serviced. The upgraded (or replacement) memory card is then plugged back into the system, and can be brought back on-line after a routine copies the contents of the mirrored region from the resident memory card to the upgraded (or replacement) memory card, so that the memory cards are again mirrored.
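The mirrored arrangement described above can be illustrated with a minimal software sketch. It is offered only as a model of the general technique, not as an implementation from any particular system; the names MemoryCard, MirroredMemory, remove and replace are hypothetical, and the sketch assumes single-threaded access, so no writes arrive while the replacement card is being re-synchronized.

```python
class MemoryCard:
    """Hypothetical model of one memory card: a flat array of cells."""

    def __init__(self, size, online=True):
        self.cells = [0] * size
        self.online = online


class MirroredMemory:
    """Two cards holding identical contents (illustrative sketch only)."""

    def __init__(self, size):
        self.size = size
        self.cards = [MemoryCard(size), MemoryCard(size)]
        self._next = 0  # round-robin selector used to spread reads

    def _online(self):
        return [card for card in self.cards if card.online]

    def write(self, addr, value):
        # Every on-line card must be written so the mirror stays consistent.
        for card in self._online():
            card.cells[addr] = value

    def read(self, addr):
        # A read may be served by either on-line card.
        online = self._online()
        card = online[self._next % len(online)]
        self._next += 1
        return card.cells[addr]

    def remove(self, index):
        # Quiesce one card for service; the other keeps serving requests.
        if len(self._online()) < 2:
            raise RuntimeError("cannot remove the last on-line card")
        self.cards[index].online = False

    def replace(self, index):
        # Install a new card, copy the resident card's contents into it,
        # and only then bring it on-line so the cards are mirrored again.
        resident = self.cards[1 - index]
        new_card = MemoryCard(self.size, online=False)
        new_card.cells[:] = resident.cells
        new_card.online = True
        self.cards[index] = new_card
```

In this model, a write issued while one card is removed reaches only the resident card, and the copy performed in replace is what restores the mirror. The sketch also makes the cost visible: two cards of a given size provide only one card's worth of usable memory.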
The foregoing solution can turn out to be very expensive, especially if the memory cards are populated with, e.g., gigabytes of memory (which is becoming common for high-performance servers). From one perspective, this construction effectively reduces available memory by half. It would, therefore, be desirable to provide a method of upgrading or servicing computer memory that does not require powering down or interrupting the system, and that further does not require wasteful and expensive redundant memory modules.