Distributed computing networks are collections of autonomously functioning processing units arranged so that individual units may be assigned to perform tasks, such as programs, processes, routines, functions, or jobs, which are sub-parts of a larger process for which the network is employed. Distributed computing networks are useful for applications such as control of industrial processes, wherein many different operations are performed in a number of physical locations, yet many of the operations need to be coordinated with one another. A computer network for such an application would include a number of autonomous intelligent units, or nodes, each programmed to execute particular tasks and connected by a web of communication links so that the nodes may share information with one another.
A type of computer chip called a transputer has been developed for use in parallel processing applications. A transputer chip includes a processor with an internal time-sharing system capable of executing more than one task, on-board memory for storing programs and data, and four bi-directional data ports. Ordinarily, tasks in conventional transputer networks communicate with one another via synchronous data channels. Complex switching networks may be needed to provide an adequate number of data channels, and the complexity of the switching network increases as the number of nodes in the network increases. In order to facilitate the construction of switching networks for transputers, crossbar switching chips have been developed which can interconnect as many as thirty-two (32) bi-directional ports. Layers of router chips can be used to build larger switching networks of arbitrary size. Management of such large switching networks requires considerable software overhead, and tasks on each node must still contend for the use of a finite number of channels.
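As an illustration of what synchronous communication between tasks implies, the following is a minimal sketch of a rendezvous-style channel in which a sending task blocks until a receiving task has taken the value. It is not transputer code and does not model any particular hardware; the class name SyncChannel and its methods are hypothetical names chosen for this illustration, and Python threads stand in for tasks.

```python
import threading


class SyncChannel:
    """Hypothetical rendezvous channel: send() returns only after receive() has taken the value."""

    def __init__(self):
        self._send_lock = threading.Lock()          # one sender at a time
        self._item_ready = threading.Semaphore(0)   # signals that a value is waiting
        self._item_taken = threading.Semaphore(0)   # signals that the value was consumed
        self._item = None

    def send(self, value):
        with self._send_lock:
            self._item = value
            self._item_ready.release()   # wake a waiting receiver
            self._item_taken.acquire()   # block until the receiver has the value

    def receive(self):
        self._item_ready.acquire()       # block until a sender offers a value
        value = self._item
        self._item_taken.release()       # let the blocked sender continue
        return value


def producer(channel, values):
    # Each send blocks until the consumer has received the value.
    for v in values:
        channel.send(v)


channel = SyncChannel()
worker = threading.Thread(target=producer, args=(channel, [1, 2, 3]))
worker.start()
received = [channel.receive() for _ in range(3)]
worker.join()
print(received)
```

Because each sender is held until its value is consumed, every pair of communicating tasks ties up a channel for the duration of the exchange, which is one reason the number of available channels becomes a point of contention.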
Another problem with traditional distributed computer networks is that, once such a network has been designed and programmed, it becomes difficult to add new nodes or to reassign tasks among nodes, since any tasks that may be affected by, or may need to communicate with, tasks at the new location must be reprogrammed to account for the new network topology. Additionally, the switching network must be reconfigured to accommodate the new nodes.
As network size increases, the average time between node failures anywhere in the network decreases. Hence, for networks employing a fixed routing topology, the maximum network size is determined by individual node reliability and the minimum acceptable time between failures. To improve the reliability of distributed computing networks, it would be desirable to provide a network with the ability to dynamically alter its information routing topology so that the functions assigned to failed nodes could be reassigned to other nodes without interrupting network operation. It would further be desirable to provide a network in which the nodes would be capable of monitoring the operation of their neighboring nodes and of rebooting their neighbors upon detecting a failure.
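The relationship between network size and time between failures described above can be made concrete with a rough calculation. The sketch below assumes, as a simplification not stated in the text, that nodes fail independently and at a constant rate, so that a network of N nodes whose individual mean time between failures is M experiences a node failure on average every M/N. The function names and the numerical figures are hypothetical and serve only as an example.

```python
import math


def network_mtbf(node_mtbf_hours: float, node_count: int) -> float:
    """Mean time between node failures across the whole network.

    Assumes independent nodes with a constant failure rate (an assumption).
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_mtbf_hours / node_count


def max_network_size(node_mtbf_hours: float, min_acceptable_mtbf_hours: float) -> int:
    """Largest node count that keeps the network-wide MTBF at or above the minimum."""
    if min_acceptable_mtbf_hours <= 0:
        raise ValueError("min_acceptable_mtbf_hours must be positive")
    return math.floor(node_mtbf_hours / min_acceptable_mtbf_hours)


# Hypothetical figures: each node fails on average once every 50,000 hours.
print(network_mtbf(50_000, 100))      # 500.0 hours between failures with 100 nodes
print(max_network_size(50_000, 200))  # 250 nodes if 200 hours is the acceptable minimum
```

Under these assumptions, a fixed-topology network cannot grow beyond the computed size without either more reliable nodes or a tolerance for more frequent failures, which is the limitation that dynamic rerouting is intended to remove.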
The problems associated with conventional transputer networks, such as the need for synchronous communication between tasks, the need for router chips to effect switching, and difficulties often encountered in network reconfiguration, have limited the development of flexible, adaptable, modular, intelligent networks.