Distributed systems may comprise a plurality of modules, at least some of which have associated processor nodes interconnected in a network. The processor nodes typically comprise a processing unit for operating the associated module and a processor interface for providing communication of the processor node in the network. The processing unit executes code, such as computer readable program code, which may be stored in memory, such as a nonvolatile memory, in order to operate the associated module. The modules and associated processors may be termed embedded systems.
An example of a distributed system comprises an automated data storage library which stores removable data storage media in storage shelves, and has at least one data storage drive to read and/or write data on the removable data storage media. An accessor robot transports the removable data storage media, which may be in the form of cartridges, between the data storage drives and the storage shelves. An operator panel allows an operator to communicate with the library, and also senses other interaction with the library, such as the opening of a door and the insertion of cartridges into, or removal of cartridges from, the library. Also, a controller controls host interaction with the library, which may include interaction between the host and the data storage drives.
In the example of an IBM 3584 UltraScalable Tape Library, two processor nodes are provided for the accessor robot modules: an accessor controller controls basic accessor functions including cartridge handling by a gripper, accessor work queueing, reading cartridge labels, etc., and an XY controller controls the X and Y motion of the accessor robot. An operator panel controller processor node controls basic operator panel module functions including display output, keyboard input, I/O station sensors and locks, etc. A medium changer controller processor node controls controller module functions including host interaction, host communications, drive communication, “Ethernet” communications, power management, etc. The processor nodes are interconnected by a network, such as a CAN (Controller Area Network), which comprises a multi-drop network. Other accessor robot modules and operator station modules may be added, each with its associated processor nodes.
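The arrangement of processor nodes described above may be illustrated by the following minimal sketch. It is a model for explanation only, not the library's firmware: the class names ProcessorNode and MultiDropNetwork, the node identifiers, and the broadcast method are hypothetical, and the sketch assumes only that every node attached to a multi-drop network can see a message placed on it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ProcessorNode:
    """Hypothetical model of one processor node and the module it operates."""
    name: str
    module: str
    functions: Tuple[str, ...]


@dataclass
class MultiDropNetwork:
    """Hypothetical model of a multi-drop network: all nodes share one bus."""
    nodes: Dict[str, ProcessorNode] = field(default_factory=dict)

    def attach(self, node: ProcessorNode) -> None:
        # Adding a module only requires attaching its node to the shared bus.
        self.nodes[node.name] = node

    def broadcast(self, sender: str) -> List[str]:
        # On a multi-drop bus every attached node other than the sender
        # receives the message; return the names of those receivers.
        return [name for name in self.nodes if name != sender]


def build_library_network() -> MultiDropNetwork:
    """Attach the four processor nodes named in the description."""
    net = MultiDropNetwork()
    net.attach(ProcessorNode(
        "accessor_controller", "accessor robot",
        ("gripper cartridge handling", "work queueing", "label reading")))
    net.attach(ProcessorNode(
        "xy_controller", "accessor robot",
        ("X motion", "Y motion")))
    net.attach(ProcessorNode(
        "operator_panel_controller", "operator panel",
        ("display output", "keyboard input", "I/O station sensors and locks")))
    net.attach(ProcessorNode(
        "medium_changer_controller", "controller",
        ("host communications", "drive communication", "power management")))
    return net
```

In this sketch, adding a further accessor robot module or operator station module amounts to one more attach call, which reflects the statement that modules may be added, each with its associated processor nodes.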
Other examples of distributed systems comprise industrial control systems and automobile and aircraft multi-processor systems.
In the distributed system of coassigned U.S. patent application Ser. No. 09/755,832, filed Jan. 5, 2001, a complete code image is provided for each of the processor nodes which provides code that may be executed for operating any of the modules. In the distributed system of coassigned U.S. patent application Ser. No. 09/734,917, filed Dec. 13, 2000, a master code image is provided by a master source, which may have a nonvolatile store, and may be used to refresh volatile memory of any processor node that has been powered off.
An issue to be addressed is that of backup code, or code that may be employed by a processor node that needs to restore its code image. For example, the code image for one of the processor nodes may become compromised in some way during operation, the code image utilized by a processor node may be partially erased, the module may be replaced such that the processor node code image is incorrect, or a processor node may be unavailable, for example disconnected from the network, when one or more of the other processor nodes are updated. The processor node may then enter an error state, which may require operator intervention. A backup copy of the code must then be located and utilized to restore the functioning of the module of the erroneous processor node. The operator may select a complete code image, comprising the code for all of the processor nodes, from another processor node, or may select a master code image from a master nonvolatile store, but must first be assured that the code image is correct and can serve as a system backup. Impediments to utilizing a complete code image duplicated at each processor node, or at a master source, are the requirement for nonvolatile memory for the full amount of code, and the need to update the complete or master code image even when only the code for one processor node module is actually updated. In the event that there are different levels of complete code at different processor nodes, a downlevel complete code image at one processor node may not be correct, or may not be serviceable as a potential backup for another processor node.
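The difficulty of assuring that a candidate code image is correct and not downlevel may be illustrated by the following minimal sketch. It is one possible check under stated assumptions, not a method taken from the referenced applications: the CodeImage record, the integer code level, the SHA-256 digest used as the correctness test, and the function names are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CodeImage:
    """Hypothetical record of a code image copy held somewhere in the system."""
    holder: str    # processor node or master store holding this copy
    level: int     # assumed code level; a higher number is a newer level
    payload: bytes
    digest: str    # expected SHA-256 hex digest of the payload (assumption)


def make_image(holder: str, level: int, payload: bytes) -> CodeImage:
    """Build an image whose stored digest matches its payload."""
    return CodeImage(holder, level, payload, hashlib.sha256(payload).hexdigest())


def is_intact(image: CodeImage) -> bool:
    # A compromised or partially erased image no longer matches its digest.
    return hashlib.sha256(image.payload).hexdigest() == image.digest


def select_backup(candidates: List[CodeImage],
                  required_level: int) -> Optional[CodeImage]:
    """Return the newest intact image at or above the required level, if any."""
    usable = [c for c in candidates
              if is_intact(c) and c.level >= required_level]
    if not usable:
        return None  # no serviceable backup; operator intervention is needed
    return max(usable, key=lambda c: c.level)
```

As a usage example, given candidates make_image("node_a", 3, b"old"), a corrupted copy CodeImage("node_b", 5, b"corrupt", "00"), and make_image("node_c", 5, b"new"), a call to select_backup with a required level of 5 returns the copy held by node_c, since node_a is downlevel and node_b fails the digest check. If every copy is downlevel or corrupted, the function returns None, which corresponds to the case in which no processor node holds a serviceable backup.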