Generally, in a cluster system comprising a plurality of nodes, a shared service may be provided among the plurality of nodes. As used herein, the term “node” refers to any computing device, including, but not limited to, a server, a personal computer (PC), a laptop computer, a tablet computer, a personal digital assistant (PDA), a mobile phone, a smart phone, and the like.
Resources used by such a shared service are usually distributed across a plurality of nodes in the cluster system, and use of the service and the corresponding resources is usually exclusive to an individual node. For example, at any one time, only one node, or one process in one node, may initiate the service and use the corresponding resources. As used herein, the term “resource” refers to any resource occupied after the service is initiated, including, but not limited to, a computing resource, such as a central processing unit (CPU) and the like; a storage resource, such as memory, disk, and the like; an input/output (I/O) resource, such as available capabilities of a graphics processing unit (GPU) and the like; and a network resource, such as network bandwidth and the like.
In current cluster systems, a conventional approach to controlling the sharing of the resource used by the shared service among a plurality of nodes is to use a dedicated control node. Every node in the cluster has to interact with the control node if it intends to use the service and the corresponding resource.
For example, when a node in a cluster, such as node A, intends to initiate a shared service, it may first enquire of the control node whether another node is currently using the service. If no other node is using the service, node A may initiate the service. While node A is using the service, if another node, such as node B, intends to use the service, node B also issues an enquiry to the control node. Because node A is using the service, the control node notifies node B that the service is being used, such that node B may not initiate the service. Other nodes in the cluster may initiate the service only after the service ends at node A. When the service ends, node A has to notify the control node that the service has ended and that the corresponding resource has been released. The control node may then notify the other nodes in the cluster that the service has ended at node A.
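The exchange between node A, node B, and the control node described above may be sketched as follows. This is a minimal, in-process illustration only and is not taken from any particular cluster system: the class name `ControlNode`, its `enquire` and `release` methods, and the `notifications` list are all hypothetical, and real nodes would exchange these messages over a network rather than through method calls.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ControlNode:
    """Hypothetical dedicated control node tracking which node holds the service."""
    members: set[str] = field(default_factory=set)   # nodes that have contacted us
    holder: Optional[str] = None                     # node using the service, if any
    notifications: list[tuple[str, str]] = field(default_factory=list)

    def enquire(self, node_id: str) -> bool:
        """Return True, and record the holder, if no other node is using the service."""
        self.members.add(node_id)
        if self.holder is None:
            self.holder = node_id
            return True
        return self.holder == node_id

    def release(self, node_id: str) -> None:
        """Called by the holder when the service ends; the other nodes are notified."""
        if self.holder != node_id:
            raise ValueError(f"{node_id} does not hold the service")
        self.holder = None
        for other in sorted(self.members - {node_id}):
            self.notifications.append((other, f"service ended at {node_id}"))


control = ControlNode()
a_granted = control.enquire("A")   # no holder yet, so node A may initiate the service
b_granted = control.enquire("B")   # node A holds the service, so node B is refused
control.release("A")               # node A reports that the service has ended
b_retry = control.enquire("B")     # node B may now initiate the service
```

In this sketch every decision passes through the single `ControlNode` object, which mirrors the requirement that each node interact with the control node before using the service.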
Such centralized control of resource sharing among a plurality of nodes in a cluster requires frequent communication between the nodes and the control node, which may incur considerable message overhead. Furthermore, such an approach is prone to a single point of failure. That is, if the control node fails, the individual nodes in the cluster may be unable to use the service.
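Both drawbacks may be illustrated with a small simulation. The sketch below is an assumption-laden model rather than a measurement of any real system: it counts one message for each enquiry, reply, and end-of-service notice, plus one notification per other node in the cluster, and it simulates a failure with a hypothetical `alive` flag. The names `CountingControlNode` and `ControlNodeDown` are illustrative only.

```python
class ControlNodeDown(Exception):
    """Raised when the hypothetical control node cannot be reached."""


class CountingControlNode:
    """Control node that counts messages under the assumed cost model."""

    def __init__(self, cluster_size: int) -> None:
        self.cluster_size = cluster_size
        self.alive = True       # set to False to simulate a control node failure
        self.holder = None      # node currently using the service, if any
        self.messages = 0       # total messages exchanged so far

    def enquire(self, node_id: str) -> bool:
        if not self.alive:
            raise ControlNodeDown("control node unreachable")
        self.messages += 2      # one enquiry plus one reply
        if self.holder is None:
            self.holder = node_id
            return True
        return False

    def release(self, node_id: str) -> None:
        if not self.alive:
            raise ControlNodeDown("control node unreachable")
        if self.holder != node_id:
            raise ValueError(f"{node_id} does not hold the service")
        self.holder = None
        self.messages += 1                      # end-of-service notice from the holder
        self.messages += self.cluster_size - 1  # notifications to the other nodes


control = CountingControlNode(cluster_size=4)
control.enquire("A")            # granted
control.enquire("B")            # refused while node A holds the service
control.release("A")            # node A ends the service
messages_for_one_use = control.messages

control.alive = False           # simulate failure of the control node
try:
    control.enquire("C")
    blocked = False
except ControlNodeDown:
    blocked = True              # no node can obtain the service any more
```

Under these assumptions, a single use of the service in a four-node cluster with one refused enquiry already costs eight messages, and once the control node is marked as failed no node can obtain the service.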