In a distributed lock service system, a coarse-grained mutual exclusion mechanism ensures that only one client terminal can hold a lock at any given time. The lock implementation relies on a lease-based session maintenance mechanism, which ensures that, when a session times out, the client terminal detects the timeout before the server terminal does. Generally, after detecting a timeout, the client terminal notifies the application layer that the distributed lock has been lost, and the server terminal releases the lock after detecting the timeout, so that other client terminals may contend for it.
The foregoing session maintenance relies mainly on heartbeats between the client terminal and the server terminal. When a heartbeat protocol is designed, the primary objective is to ensure that the session timeout of the client terminal occurs before that of the server terminal in situations where quick, automatic recovery is not possible, e.g., network isolation or server shutdown. In this way, the correctness of the lock service is ensured. A secondary objective is that, in situations where the system can recover automatically and quickly, such as network jitter or failover, the client terminal reports lock loss events to the application program as rarely as possible, which preserves the stability of the system.
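The correctness objective stated above can be expressed as a simple invariant: there must be no moment at which the client terminal still believes it holds the lock while the server terminal has already released it. The following is a minimal sketch of that check, not an implementation from any particular lock service. The function names `lock_views` and `is_unsafe` and the use of a single shared timeline are illustrative assumptions.

```python
def lock_views(now, client_deadline, server_deadline):
    """Return (client_holds, server_holds) at time `now`.

    Both deadlines are lease expiry times on one shared, illustrative
    timeline (an assumption; real clocks on the two terminals differ).
    """
    client_holds = now < client_deadline
    server_holds = now < server_deadline
    return client_holds, server_holds


def is_unsafe(now, client_deadline, server_deadline):
    """True if the client still thinks it owns a lock the server has released."""
    client_holds, server_holds = lock_views(now, client_deadline, server_deadline)
    return client_holds and not server_holds


# Client lease expires first: no instant is unsafe.
safe_case = any(is_unsafe(t, client_deadline=8, server_deadline=12) for t in range(21))

# Server lease expires first: the window between the deadlines is unsafe.
unsafe_case = any(is_unsafe(t, client_deadline=12, server_deadline=8) for t in range(21))
```

Under these assumptions, keeping the client deadline at or before the server deadline is sufficient to rule out the unsafe state.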
In existing technologies, a reliability coordination system in a distributed system (such as ZooKeeper) adopts a heartbeat design in which sending and receiving are independent of each other. After a session between a server terminal (i.e., Server or server device) and a client terminal (i.e., Client or client device) is established, the client terminal sends heartbeat requests to the server terminal at a fixed sending interval (⅓ of the session lease period by default). The sending of a heartbeat request is driven only by this time interval and does not depend on whether a response to the previous heartbeat has arrived. After receiving a heartbeat request, and as long as the current session has not expired, the server terminal extends the lease of the current session to a future moment one full session lease period away and immediately returns a heartbeat request response. Each time the client terminal receives a heartbeat request response from the server terminal, the lease of the client terminal is extended to a future moment ⅔ of the session lease period away. If the lease of the client terminal expires, the ZooKeeper client terminal (Client) directly sends an event, referred to as a session event, to the application layer to inform the application program that the session has expired. Under this heartbeat protocol, if a temporary network isolation occurs after the client terminal (Client) last successfully received a heartbeat request response, the client terminal (Client) has a buffer time of ⅔ of the session lease period in which to complete a retry of the heartbeat request.
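The lease arithmetic described above can be sketched as follows. This is a simplified model built only from the fractions stated in the text (sending interval of ⅓, server extension of one full lease period, client extension of ⅔); it is not ZooKeeper code. The function names, the symmetric `one_way_delay` parameter, and the shared timeline are assumptions made for illustration.

```python
from fractions import Fraction

CLIENT_FRACTION = Fraction(2, 3)    # client lease extension, per the text
INTERVAL_FRACTION = Fraction(1, 3)  # default heartbeat sending interval


def send_times(lease, count):
    """Times at which the client sends heartbeats, driven only by the interval."""
    interval = INTERVAL_FRACTION * Fraction(lease)
    return [i * interval for i in range(count)]


def heartbeat_round(send_time, lease, one_way_delay=Fraction(0)):
    """Return (client_deadline, server_deadline) after one successful heartbeat.

    Assumes a symmetric one-way network delay (hypothetical parameter).
    """
    lease = Fraction(lease)
    request_arrival = send_time + one_way_delay
    server_deadline = request_arrival + lease             # extended by one lease period
    response_arrival = request_arrival + one_way_delay
    client_deadline = response_arrival + CLIENT_FRACTION * lease  # extended by 2/3
    return client_deadline, server_deadline


client_deadline, server_deadline = heartbeat_round(Fraction(0), 30, one_way_delay=Fraction(1))
margin = server_deadline - client_deadline  # lease/3 minus the one-way delay
```

In this model the server deadline trails the client deadline by one third of the lease period minus the one-way delay, so the client terminal times out first as long as that delay stays below one third of the lease period.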
However, the ZooKeeper client terminal (Client) simply repeats heartbeat requests at a fixed sending interval. Because the client terminal (Client) lacks reasonable retry logic for coping with unexpected communication exceptions, its sending of heartbeat requests imposes considerable load and pressure on network nodes and the server terminal (Server). In terms of the number of retry heartbeat requests sent, at most two retries can be initiated within the buffer time of ⅔ of the session lease period, so the retry logic of the client terminal (Client) within the buffer time is overly simple. As a result, the application program may lose the lock because of a temporary network exception, which increases the sensitivity of the client terminal (Client) to network failures.
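The bound of at most two retries follows from dividing the buffer time by the sending interval. The sketch below reproduces that arithmetic under the assumption that the last successful response arrived at the moment of the last send, so retries fall at multiples of the sending interval afterwards. The function name `retry_offsets` is hypothetical, and the count is an upper bound because the second retry coincides with the lease expiry itself.

```python
from fractions import Fraction

INTERVAL_FRACTION = Fraction(1, 3)  # sending interval as a fraction of the lease
BUFFER_FRACTION = Fraction(2, 3)    # client-side buffer as a fraction of the lease


def retry_offsets(lease):
    """Offsets (after the last successful response) of fixed-interval retries
    that fall within the client's buffer time, inclusive of the expiry instant."""
    lease = Fraction(lease)
    interval = INTERVAL_FRACTION * lease
    buffer_time = BUFFER_FRACTION * lease
    max_retries = buffer_time // interval  # floor division on Fractions yields an int
    return [k * interval for k in range(1, max_retries + 1)]


offsets = retry_offsets(30)  # retries at 10 and 20 for a lease period of 30
```

Because both quantities scale with the lease period, the bound is the same for any lease length under these assumptions.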
In summary, in existing technologies, using the heartbeat protocol of a reliability coordination system in a distributed system (such as ZooKeeper) to maintain a session between a server terminal and a client terminal results in unreasonable retry logic for sending heartbeat requests, which imposes considerable load and pressure on network nodes and the server terminal (Server). Moreover, the retry logic is overly simple; as a result, an application program may lose a lock because of a temporary network exception, which increases the sensitivity of the client terminal (Client) to network failures.