In a distributed system, during an initialization phase a node may not know what other nodes exist and are in operation. Nodes may find each other by exchanging discovery messages and eventually reach a state in which all the nodes that are “up” know about each other. On a system with a large number of nodes, however, this message exchange may introduce significant message traffic into the network, which may result in a slow discovery process. While the problem is relatively easily solved when there are only a few nodes, when there are a large number of nodes (say 1,000, solely for the sake of example), the lack of information about the status of the other nodes in the distributed network leads to the exchange of numerous messages, which can stall message traffic within the system. Moreover, the problem grows exponentially with the number, N, of nodes. The presently proposed method of solving this problem focuses on reducing the amount of message traffic and also on providing a more orderly bring-up (discovery) process.
Others have approached similar problems but none include the elements of the present invention. For example, U.S. Pat. No. 6,026,073 discusses node ranking but only for the purpose of determining a “restoration route.” There is no mention of the use of ranking to control message frequency.
Published U.S. Patent Application 2003/0005086 A1 also appears to employ the notion of node ranking, but it has no method for dealing with contention and does not provide multiple “supervisors.” Its use of ranking is for entirely different purposes.
U.S. Pat. No. 6,606,362 also mentions the concept of ranking but only assigns ranks to signal sources so as to allow the signal recipient to select from among several sources.
Published U.S. Patent Application 2004/0146064 A1 uses random delays in the response to reduce the amount of message contention. This is contrary to the teaching of the use of ranking.
Published U.S. Patent Application 2004/0205148 A1 does not employ the concept of ranking and is not directed to the problem of bringing up nodes to form a cluster. Rather, it assumes the pre-existence of a cluster and is directed to the problems associated with node failure.
Published U.S. Patent Application 2005/0054346 A1 is essentially unrelated; its only connection is the concept of prioritization, in that messages are prioritized according to type-of-service and routes are prioritized according to quality-of-service.
Published U.S. Patent Application 2005/0071473 A1 appears to describe a method for selecting a limited number of standby subnet messages based on a priority value and possibly a global identifier as a tie-breaker.
U.S. Pat. No. 6,941,350 appears to describe a method for selecting a master network manager during initialization; however, it does not employ node ranking to control message congestion but rather assigns priorities to subnets.
U.S. Pat. No. 6,941,350, entitled “Method and Apparatus for Reliably Choosing a Master Network Manager During Initialization of a Network Computing System,” Frazier, et al., issued Sep. 6, 2005, which is hereby incorporated herein by reference in its entirety, appears to describe a scheme in which nodes are ranked by priority and exchange messages to determine whether a given node should be the network manager. There is no mention of altering message frequency according to any priority or ranking.