In the latter half of the twentieth century, there began a phenomenon known as the information revolution. While the information revolution is a historical development broader in scope than any one event or machine, no single device has come to represent the information revolution more than the digital electronic computer. The development of computer systems has surely been a revolution. Each year, computer systems grow faster, store more data, and provide more applications to their users.
A modern computer system is an enormously complex machine, usually having many sub-parts or subsystems, each of which may be concurrently performing different functions in a cooperative, although partially autonomous, manner. Typically, the system comprises one or more central processing units (CPUs) which form the heart of the system, and which execute instructions contained in computer programs. Instructions and other data required by the programs executed by the CPUs are stored in memory, which often contains many heterogeneous components and is hierarchical in design, containing a base memory or main memory and various caches at one or more levels. At another level, data is also stored in mass storage devices such as rotating disk drives, tape drives, and the like, from which it may be retrieved and loaded into memory. The system also includes hardware necessary to communicate with the outside world, such as input/output controllers; I/O devices attached thereto such as keyboards, monitors, printers, and so forth; and external communication devices for communicating with other digital systems. Internal communications buses and interfaces, which may also comprise many components and be arranged in a hierarchical or other design, provide paths for communicating data among the various system components.
For many reasons, computer systems are usually physically constructed of multiple modular components, each having some pre-defined interface to one or more other components. From the standpoint of the system, a modular component may be viewed as a “black box” which conforms to the pre-defined interface. Any component, regardless of internal structure, which conforms to the same pre-defined interface can be substituted for an existing component in the design of the computer system. This approach enables considerable flexibility in the design and configuration of computer systems. It is possible to improve the design of a computer system by improving the internal design of a modular component (while conforming to the same interface), without affecting other components of the system.
The modularization of computer components may be viewed at any of several levels of generality. At a relatively low level, individual logic gates on a semiconductor chip may be viewed as modular components. Functional units within the chip, such as an adder or an instruction decoder within a processor, are a higher level of modular component. The entire chip itself is also a modular component at still a higher level. This chip is usually mounted on a printed circuit board with other chips, connectors, and discrete components, the printed circuit board assembly being another level of modularity. Multiple circuit boards or other components may be constructed as a functional unit at a still higher level of modularity.
At some level of modularity, the component may be designed to be physically replaceable with an equivalent component after manufacture of the computer system. That is, the component is coupled to other components in the system using electrical connectors, clips, threaded fasteners, and the like, which are designed for coupling and uncoupling after manufacture. Such physically replaceable components are referred to as “field replaceable units” (FRUs). A finished electronic circuit board assembly, complete with all components soldered in place, is often designed as such a FRU, while an integrated circuit chip typically is not.
The use of such field replaceable units not only facilitates the replacement of a defective component with a new component of identical type after system manufacture, but also facilitates the re-configuration or upgrade of an existing physical computer system by substituting a newer or enhanced version of a FRU for the existing unit. For example, a memory card (i.e., an electronic circuit card assembly containing multiple memory chips) may sometimes be replaced with another memory card conforming to the same interface, but having more memory, or faster memory, or some other enhancement. Additionally, a computer system may be constructed with unused couplings, which provide support for later upgrading the computer system by attaching additional FRUs to the unused couplings. The use of multiple types of FRUs attached to generic couplings enables a basic computer system design to be configured in any one of a very large number of configuration permutations.
Information concerning the physical configuration of a computer system generally needs to be maintained somewhere on the system in order to support any of various operating system and diagnostic functions. In early computer systems, such configuration information was often manually input by a system administrator. As systems have evolved, the task of collecting and maintaining configuration information has become increasingly automated in order to cope with growing system complexity, and the fact that many systems lack trained full-time administrators. To support configuration, diagnostics, and other automated functions, it is desirable to maintain self-identifying and diagnostic data for each FRU within the FRU itself. Accordingly, in some computer system designs, each FRU contains an on-board non-volatile data storage having certain vital component data. This vital component data may include, among other things, a device type, a serial number, a part number, a version number, functional parameters (such as an amount of memory available on a memory card), and so forth. Vital component data may be used by various system functions for purposes of verifying component compatibility, configuring low-level operating system functions, isolating system faults, and so forth. The on-board non-volatile data storage might also be used for storing data of lesser importance, such as a history of generated error codes.
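By way of illustration only, the kind of vital component data described above might be modeled as a simple record. The following is a minimal sketch: the class name `VitalComponentData`, its field names, the helper `is_compatible`, and the sample values are all hypothetical and are not drawn from any particular system design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VitalComponentData:
    """Hypothetical self-identifying record held in a FRU's non-volatile storage."""
    device_type: str
    serial_number: str
    part_number: str
    version: int
    # Functional parameters, e.g. the amount of memory on a memory card.
    functional_params: dict = field(default_factory=dict)


def is_compatible(vcd, supported_types, min_version):
    """Assumed compatibility rule: a known device type at or above a minimum version."""
    return vcd.device_type in supported_types and vcd.version >= min_version


# Hypothetical memory card FRU.
card = VitalComponentData(
    device_type="memory_card",
    serial_number="SN-0001",
    part_number="PN-1234",
    version=3,
    functional_params={"memory_mb": 4096},
)
print(is_compatible(card, {"memory_card", "io_adapter"}, min_version=2))
```

A system function could read such a record from each FRU at start-up and use it to verify compatibility or to configure low-level functions, as described above.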
Vital component data is, as its name implies, essential to the system: the system requires the data for proper operation, and any corruption of the data (even so much as a single bit) could have serious operational consequences. In most conventional memory devices, random bit errors are extremely rare. However, the fact that a FRU may be subject to extreme conditions of shipment, storage and handling increases the probability of a random bit error. Since any error in the vital data is unacceptable, vital data is often protected using some form of data redundancy, or error correcting code (ECC).
An ECC is a set of bits of pre-defined length and format which enables the detection and correction of certain errors in a string of data bits of pre-defined length. By breaking data up into segments of pre-defined length and adding corresponding segments of ECC of known size and format, it is possible to protect any arbitrary data string from a single bit error, and in some cases to correct even multi-bit errors.
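As a concrete illustration of single-bit correction, the following minimal sketch uses the well-known Hamming(7,4) code, in which each 4-bit data segment is accompanied by 3 check bits. This code is chosen here only because it is small enough to follow by hand; it is not asserted to be the code used in any particular FRU, and the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming(7,4) codeword.

    Codeword positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit; return (data_bits, error_position).

    error_position is 0 if no error was detected, otherwise the 1-based
    position of the bit that was corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # equals the position of a single-bit error
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


data = [1, 0, 1, 1]
stored = hamming74_encode(data)

corrupted = list(stored)
corrupted[4] ^= 1  # simulate a random single-bit error at position 5

recovered, position = hamming74_decode(corrupted)
print(recovered == data, position)
```

Note that the decoder must know exactly which positions hold data bits and which hold check bits, which is why the data and ECC segments need a known, pre-defined layout.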
One of the requirements of ECC is that the data and ECC segments have a known, pre-defined format, since one must be able to distinguish the data bits from the bits of ECC. This works well where records being protected have a known, regular, fixed size. Unfortunately, vital component data tends to be heterogeneous in nature. Each record in the vital component data may be of a different size. Furthermore, different types of FRUs may employ different vital component parameters. Newer versions of the same type of FRU may contain additional vital component data records not contained in the back-level versions, or may contain data records of different size than those of the back-level versions.
It is possible to store vital component data in fixed-size blocks of data and ECC, but this approach generally either wastes storage space (because of padding or unnecessary protection of less vital data) or reduces flexibility to re-define the records of the vital component data. It is desirable to store certain data, such as vital component data, in a protected format which is more flexible, space efficient or otherwise advantageous compared with conventional techniques.
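The space cost of fixed-size blocks can be seen with a small calculation. The sketch below assumes, purely for illustration, that each record begins on a block boundary and is padded out to a whole number of blocks; the record sizes, the block size, and the function name `fixed_block_cost` are hypothetical.

```python
def fixed_block_cost(record_sizes, block_data_bytes, block_ecc_bytes):
    """Return (total_bytes_stored, padding_bytes) for block-aligned records.

    Assumes each record occupies a whole number of fixed-size blocks, each
    block holding block_data_bytes of data plus block_ecc_bytes of ECC.
    """
    total = 0
    padding = 0
    for size in record_sizes:
        blocks = (size + block_data_bytes - 1) // block_data_bytes  # ceiling division
        total += blocks * (block_data_bytes + block_ecc_bytes)
        padding += blocks * block_data_bytes - size
    return total, padding


# Hypothetical heterogeneous records of 4, 12, 7, 30 and 2 bytes,
# stored in blocks of 16 data bytes plus 2 ECC bytes.
sizes = [4, 12, 7, 30, 2]
total, padding = fixed_block_cost(sizes, block_data_bytes=16, block_ecc_bytes=2)
print(sum(sizes), total, padding)  # 55 bytes of data occupy 108 bytes, 41 of them padding
```

Under these assumptions, 55 bytes of record data consume 108 bytes of storage, of which 41 bytes are padding. Choosing a smaller block reduces the padding but increases the proportion of storage devoted to ECC.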