Data structures in computer systems often use a tree organization so that a search path to specific information can be determined. A file contains whatever information the user places in it (e.g., an executable program) and may be stored anywhere in the memory of a data processing system. Directories provide the mapping between a file's name or other indicator and the file itself. A tree-organized search structure generally has a "ROOT" from which all information can be located by tracing a path through a chain of branches until the desired information is reached. The ROOT indicates the span of information handled by each subservient branch at the next level, so that when a specific search is specified, a search through the ROOT indicates which next-level branch holds further information concerning the search.
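The routing role of the ROOT can be illustrated with a minimal sketch. It assumes, purely for illustration, that the ROOT stores the largest key handled by each next-level branch; the names ROOT_SPANS, BRANCHES, and route are hypothetical and do not come from any particular system.

```python
from bisect import bisect_left

# Hypothetical ROOT: entry i is the largest key handled by branch i.
ROOT_SPANS = ["F", "M", "S", "Z"]
BRANCHES = ["branch_0", "branch_1", "branch_2", "branch_3"]

def route(key):
    """Return the next-level branch whose span covers `key`."""
    # bisect_left finds the first span whose upper bound is >= key.
    i = bisect_left(ROOT_SPANS, key)
    if i == len(ROOT_SPANS):
        raise KeyError(key)  # key lies beyond every span in the ROOT
    return BRANCHES[i]
```

In a deeper tree, the same step would be repeated at each level until the desired information is reached.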
Such a tree-structured information array is often used by a key-index data arrangement, which is an array of paired entries. The first portion of each entry is a key value and the second is an index value (such as a record number) associated with that key value. A key-index file is accessed by giving the key, and the system returns the key-index pair. This enables files to be accessed by a key name rather than by a record number or byte number.
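A key-index access of this kind might be sketched as follows. The sample entries and the names KEY_INDEX and lookup are assumptions made for illustration; the array is assumed to be kept sorted by key so that a binary search can be used.

```python
from bisect import bisect_left

# Hypothetical key-index file: (key, record number) pairs sorted by key.
KEY_INDEX = [("adams", 17), ("baker", 3), ("clark", 42)]

def lookup(key):
    """Return the (key, index) pair for `key`, or None if it is absent."""
    keys = [k for k, _ in KEY_INDEX]
    i = bisect_left(keys, key)
    if i < len(KEY_INDEX) and KEY_INDEX[i][0] == key:
        return KEY_INDEX[i]
    return None
```

The caller supplies only the key name; the record number is obtained from the returned pair.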
For tree-structured information arrays in uni-processor computing systems, structures known as balanced trees (i.e., B-trees) have been utilized. B-trees distribute search work evenly throughout the tree and prevent some portions of the tree from being overloaded while other portions are sparsely populated. A description of B-trees can be found in "The Art of Computer Programming", Volume 3/Sorting and Searching, D. E. Knuth, Addison-Wesley Company, 1973, pp. 473-479.
Tree-structured information arrays in parallel computing systems present problems significantly different from those present in uni-processor systems. Such parallel computing systems may include a large number of "nodes" that are interconnected by a high speed communication network. Each node generally comprises a processor and a memory, operates independently, and interacts with other nodes via message traffic and transfers of blocks of data over the communication network.
In parallel computing systems, file data structures may be distributed throughout a plurality of nodes rather than being centrally located in a single node. In addition to enabling more efficient diffusion of work among the nodes, such distributed data structures must provide for system recovery in the event of malfunction of one or more nodes. Tree-structured information arrays are one example of files that may be used in such systems and may be distributed among the nodes. When B-tree structures are employed, however, problems may occur if a node storing a portion of the tree, or the entire system, fails while nodes of the tree structure are being rebalanced. In such a case, the system must know (1) whether or not a rebalancing between different nodes was in process; (2) which nodes were involved in the rebalancing effort; and (3) what state the rebalancing effort had reached when the failure occurred. Knowing these facts, the system, after the failed node has been recovered or the entire system restarted, must reconstruct itself so as to continue the rebalancing effort without losing data in the process.
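The three facts listed above suggest the kind of record a system would need to keep. The following is a minimal sketch only, assuming a hypothetical record written to stable storage before each step of a rebalance; the phases, the names Phase, RebalanceRecord, and recovery_action, and the recovery decisions are illustrative assumptions, not the method of any cited system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Phase(Enum):
    STARTED = 1    # rebalance announced, no entries moved yet
    COPIED = 2     # entries copied to the receiving node
    COMPLETED = 3  # sending node has deleted its copies

@dataclass
class RebalanceRecord:
    sender: int    # node giving up entries
    receiver: int  # node taking entries
    phase: Phase   # last phase known to have finished

def recovery_action(record: Optional[RebalanceRecord]) -> str:
    """Decide what to do after a restart from the surviving record."""
    # (1) Was a rebalance in process at all?
    if record is None or record.phase is Phase.COMPLETED:
        return "nothing to do"
    # (2) Which nodes were involved, and (3) how far had it progressed?
    if record.phase is Phase.STARTED:
        return f"restart copy from node {record.sender} to node {record.receiver}"
    return f"delete moved entries on node {record.sender}"
```

Because the record names both nodes and the last finished phase, the restarted system can resume the rebalance from a known point rather than guessing which node holds the authoritative copy of the entries.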
Parallel computing systems employing tree-oriented data structures may be found in U.S. Pat. Nos. 4,860,201; 4,766,534; 4,412,285; 4,445,171; and 4,825,354. Those prior art patents are concerned mainly with operational message transfers between the nodes of a parallel data processing system rather than with recovery of data in the event of a failure.
Accordingly, it is an object of this invention to provide a method for data recovery in a parallel computing network.
It is another object of this invention to provide a method for data recovery in a parallel computing network which employs a B-tree information array file structure.
It is a further object of this invention to provide an efficient method for rebalancing a B-tree file data structure across different nodes in a parallel data computing network.