1. Field of the Invention
The present invention relates to a system for representing the functions of a node stopped due to a failure, etc. in a network where a plurality of nodes are connected, and more particularly to a system for automatically representing the functions of a node whose stopping is detected by network monitoring or a node stopped by instructions.
2. Description of the Related Art
With the rapid spread of computers, many processes are now executed by computer. Computerization not only allows large volumes of data to be processed efficiently, but also makes it possible to transmit large amounts of information quickly to places all over the world through a network. As a result, computers connected to the Internet, a wide area network (WAN), a local area network (LAN), etc. have become indispensable in many activities, including business activities.
As the importance of a network and of the computer nodes connected to it increases, there is strong demand for a system that detects failures early in the nodes providing a variety of functions on the network and promptly takes appropriate measures.
In a conventional monitor apparatus, the nodes to be monitored are fixed. The monitor apparatus transmits a specific signal to one or more nodes to be monitored at predetermined intervals and, by confirming that the nodes respond to the signal and return an answer signal to the apparatus, verifies that the nodes are operating normally. The signal transmitted from the monitor apparatus is generally called a heart beat signal or a health signal, and such node verification is usually called a health check or an alive check. The health check or alive check is performed not only by the simple method above but also at a variety of levels, such as the application level, as occasion demands. It can also be verified whether or not a specific function provided by such a node is operating normally.
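The health check described above can be illustrated with a minimal sketch. It assumes a hypothetical `probe` callable that sends one heart beat signal to a node and reports whether an answer signal came back. The names `health_check` and `fake_probe` and the node names are illustrative only and do not appear in the conventional apparatus. A real monitor would repeat the round at predetermined intervals; the sketch performs a single round.

```python
from typing import Callable, Dict, Iterable


def health_check(nodes: Iterable[str], probe: Callable[[str], bool]) -> Dict[str, bool]:
    """Send one heart beat to each node and record whether it answered."""
    status: Dict[str, bool] = {}
    for node in nodes:
        try:
            status[node] = bool(probe(node))
        except OSError:
            # A transport error is treated the same as a missing answer signal.
            status[node] = False
    return status


def fake_probe(node: str) -> bool:
    """Hypothetical stand-in for a network probe: node-b never answers."""
    return node != "node-b"


result = health_check(["node-a", "node-b", "node-c"], fake_probe)
```

Passing the probe in as a parameter reflects the remark that the check may be performed at several levels: the same loop can be given a probe that tests a specific function at the application level.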
The monitor apparatus is often installed as an apparatus (node) dedicated to monitoring. When a failure occurs in that node itself, it ceases to function as a monitor apparatus and, in practice, cannot report its own stopping to another node, etc.
In a computer system with improved failure-proof properties, a node to be monitored is often provided in advance with a standby (back-up) node, and a dual system generally adopts a hardware dualizing method in which, when a failure occurs in the node to be monitored, operation is switched over completely to the standby node. Because most standby nodes, by the nature of a standby unit, have no function other than monitoring the running node while it operates normally, the resources and capability of the standby nodes are not sufficiently utilized.
Conventionally, setting the control procedures of nodes and the definitions between nodes is also so complicated that controlling the status of a plurality of nodes on a network and operating those nodes in a unified manner requires an expensive dedicated system.
Furthermore, when a node functions as a resource server for providing a certain resource, for example, to many and unspecified users through a network, and the contents of the resource are updated, the services provided by the node have to be temporarily stopped.
As described earlier, in a conventional monitor apparatus a monitoring node itself cannot be monitored, and when a failure occurs in the node, this fact cannot be reported to another node.
Furthermore, in order to dualize a node to be monitored it is necessary to provide a dedicated standby (back-up) node to the node, and the resource and capability of the standby node cannot be efficiently utilized while the running node is normally operating.
Setting the control procedures of nodes and the definitions between nodes is also so complicated that controlling the status of a plurality of nodes on a network and operating those nodes in a unified manner requires an expensive dedicated system.
Furthermore, when the resource contents of a node functioning as a resource server are updated, the services provided by the node have to be temporarily stopped.
It is an object of the present invention to provide a system in which, out of a plurality of nodes on a network, one may be set as a master node and one or more as slave nodes, and in which, in response to the detection of a failure in a node, a scheduled event, or any of a variety of other events such as a resource duplication request, another node can provide the functions of the node stopped by the event in place of that node, and the nodes that were monitored by the stopped node can be monitored by another node.
It is another object of the present invention to provide an inexpensive system for controlling the status of a plurality of nodes, and operating the nodes, in a unified manner using a control node.
In a first aspect, the present invention presupposes a node representation system in which each node can represent functions provided by another node in a system where a plurality of nodes are connected through a network.
The first aspect of the present invention comprises an activation control unit for obtaining information on monitoring when each node is activated, and a monitoring/representing unit for monitoring the operation of a first other node based on information obtained by the activation control unit, monitoring a second other node monitoring the first other node, and, when a failure is detected in the operation of the first other node, representing both the monitoring of a second other node monitored by the first other node and the functions provided by the first other node. The activation control unit and the monitoring/representing unit of each node perform control in such a way that a first node monitors the operation of a second node, the second node monitors the operation of the first node or of a third node, and such monitoring relations are established in order thereafter, so that one closed-loop logical monitoring network is constructed.
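The closed-loop logical monitoring network can be pictured with a short sketch. It assumes that the monitoring information obtained at activation amounts to an ordered list of node names, and that each node monitors the next name in that list, with the last node monitoring the first. The function name `build_monitoring_ring` and the node names are hypothetical and are used only for illustration.

```python
from typing import Dict, List


def build_monitoring_ring(nodes: List[str]) -> Dict[str, str]:
    """Map each node to the node it monitors so that the relations form one closed loop."""
    if len(nodes) < 2:
        raise ValueError("a monitoring ring needs at least two nodes")
    if len(set(nodes)) != len(nodes):
        raise ValueError("node names must be unique")
    # Node i monitors node i + 1; the last node wraps around to the first.
    return {node: nodes[(i + 1) % len(nodes)] for i, node in enumerate(nodes)}


ring = build_monitoring_ring(["n1", "n2", "n3"])
pair = build_monitoring_ring(["n1", "n2"])
```

With three nodes, the second node monitors a third node; with only two nodes, the second node monitors the first, which corresponds to the two alternatives named in the text.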
By adopting such a configuration, nodes can be monitored effectively by having the nodes monitor one another. Since, even if a failure occurs in a certain node, the functions of that node are dynamically represented by the node monitoring it, the system resources can be utilized efficiently and the security of the whole system can be improved as well.
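The dynamic representation on failure can be sketched as follows, under stated assumptions: the monitoring relations are held as a dictionary mapping each node to the node it monitors, and the functions of each node are held as a set of names. The function `take_over`, the node names, and the function names "web", "mail" and "dns" are hypothetical. The sketch only updates bookkeeping; how a function is actually taken over is outside its scope.

```python
from typing import Dict, Set


def take_over(monitors: Dict[str, str], functions: Dict[str, Set[str]], failed: str) -> str:
    """Let the node monitoring `failed` represent it; return the name of that node."""
    if failed not in monitors:
        raise KeyError(f"unknown node: {failed}")
    # The node whose monitoring target is the failed node detects the failure.
    watcher = next(n for n, target in monitors.items() if target == failed)
    # The watcher inherits the monitoring target of the failed node.
    inherited_target = monitors.pop(failed)
    if inherited_target == watcher:
        # Only two nodes were left, so the survivor has no node to monitor.
        monitors.pop(watcher)
    else:
        monitors[watcher] = inherited_target
    # The watcher also represents the functions the failed node provided.
    functions[watcher] |= functions.pop(failed)
    return watcher


monitors = {"n1": "n2", "n2": "n3", "n3": "n1"}
functions = {"n1": {"web"}, "n2": {"mail"}, "n3": {"dns"}}
watcher = take_over(monitors, functions, "n2")
```

After the call, the remaining nodes still form a closed loop, so no dedicated standby node is needed to keep every running node monitored.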
In a second aspect, the present invention presupposes a node representation system in which each node can represent functions provided by another node in a system where a plurality of nodes are connected through a network.
In the second aspect, out of a plurality of nodes on a network, one may be set as a master node and one or more as slave nodes.
Each of the master node and the slave nodes comprises an activation control unit for obtaining information on monitoring, and a first monitoring/representing unit for monitoring the operation of a first node based on information obtained by the activation control unit and, when a failure is detected in the operation of the first node, representing both the monitoring of a second node monitored by the first node and the functions provided by the first node. The activation control unit and the monitoring/representing unit of each node (both the master node and the slave nodes) perform control in such a way that a first node monitors the operation of a second node, the second node monitors the operation of the first node or of a third node, and such monitoring relations are established in order thereafter, so that one closed-loop logical monitoring network is constructed.
Furthermore, the master node and the slave nodes each comprise a resource duplication process unit for duplicating a resource from the master node to the slave nodes and enabling the monitoring/representing unit of the master node to represent and perform the functions provided by the slave nodes during the resource duplication.
By adopting such a configuration, even when the resource of a certain node is updated, it is unnecessary to stop the services of the node, and the services provided by the node can continue to be provided to a user without interruption even while the resource of the node is being updated.
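The resource duplication of the second aspect can be sketched as below. The sketch assumes that the slave nodes are updated one at a time and that the master node represents only the slave node currently being updated; the text does not state the order, so this is an illustrative choice. The class `Node`, the function `duplicate_resource`, the `log` list and the sample resource are all hypothetical.

```python
from typing import Dict, List


class Node:
    """Hypothetical node holding a resource and the names of the nodes it represents."""

    def __init__(self, name: str, resource: Dict[str, str]) -> None:
        self.name = name
        self.resource = resource
        self.representing: List[str] = []


def duplicate_resource(master: Node, slaves: List[Node], log: List[str]) -> None:
    """Copy the master's resource to each slave while the master serves in its place."""
    for slave in slaves:
        # The master starts representing the slave's functions.
        master.representing.append(slave.name)
        log.append(f"{master.name} represents {slave.name}")
        # The slave's resource is replaced with a copy of the master's resource.
        slave.resource = dict(master.resource)
        # The slave resumes its own functions with the updated resource.
        master.representing.remove(slave.name)
        log.append(f"{slave.name} resumes")


master = Node("master", {"index.html": "v2"})
s1 = Node("s1", {"index.html": "v1"})
s2 = Node("s2", {"index.html": "v1"})
log: List[str] = []
duplicate_resource(master, [s1, s2], log)
```

At every point in the loop, each slave's functions are provided either by the slave itself or by the master, which is the property that lets the services continue during the update.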
In a third aspect, the present invention presupposes a node monitor system in which each node can monitor the failures of another node in a system where a plurality of nodes are connected through a network.
In the third aspect, each of the nodes comprises an activation control unit for obtaining information on monitoring when the node is activated, and a monitor unit for monitoring the operation of a first other node based on information obtained by the activation control unit and, when a failure is detected in the operation of the first other node, representing the monitoring of a second other node monitored by the first other node.
The activation control unit and the monitor unit of each node perform control in such a way that a first node monitors the operation of a second node, the second node monitors the operation of the first node or of a third node, and such monitoring relations are established in order thereafter, so that one closed-loop logical monitoring network is constructed.
By adopting such a configuration, the operating status of all the nodes operated in a system can be monitored and a highly reliable monitor system can be constructed.