In environments that rely on highly-available database access, a single database instance is often accessed by multiple servers over some form of network, and any one or more of the servers establishes network connections with the database instance. Each such connection is then used to communicate data and control information between the database instance and the servers. For example, one server might provide services for a payroll application while another provides services for an accounts payable application, with both servers accessing the same database instance. In such a case, some form of access control (e.g., locks, semaphores) is implemented so as to avoid access collisions (e.g., two servers unknowingly writing to the same database table row at the same time). During ongoing operation, it is possible that one or more of the servers crashes, or that certain functional components hosted on one or more of the servers (e.g., their communication connections) experience a crash or other failure.
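As an illustration only, the access control described above can be sketched as a row-level lock table kept on behalf of the database instance. The class name `RowLockTable`, its methods, and the server and table names are hypothetical and are not taken from any particular database product. The sketch shows how a second server is refused a lock on a row that a first server already holds, and how locks remain recorded against a server that has since crashed.

```python
class RowLockTable:
    """Hypothetical in-memory lock table; one owner per (table, row) key."""

    def __init__(self):
        # Maps (table, row_id) -> name of the server holding the lock.
        self._owners = {}

    def acquire(self, server, table, row_id):
        """Grant the lock if it is free or already held by `server`."""
        holder = self._owners.setdefault((table, row_id), server)
        return holder == server

    def release(self, server, table, row_id):
        """Release the lock only if `server` is the current holder."""
        key = (table, row_id)
        if self._owners.get(key) == server:
            del self._owners[key]

    def held_by(self, server):
        """List the locks still recorded against `server`."""
        return sorted(k for k, s in self._owners.items() if s == server)


locks = RowLockTable()

# The payroll server locks row 17; the accounts payable server is refused,
# which avoids two servers writing the same row at the same time.
granted_payroll = locks.acquire("payroll_server", "employees", 17)
granted_ap = locks.acquire("accounts_payable_server", "employees", 17)

# If the payroll server now crashes without releasing, its lock is still
# recorded and row 17 stays blocked until the lock is released or reclaimed.
orphaned = locks.held_by("payroll_server")
```

In this simplified model nothing reclaims the orphaned entry, which is the kind of state that has to be handled after a server or connection failure.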
In legacy implementations of high-availability database systems, only certain failures are deemed recoverable. For example, some legacy systems attempt a (partial) recovery when a connection is lost by merely relying on a client to establish a new connection to replace the failed connection. Such legacy techniques are deficient in at least the regard that, in modern database systems, more than one connection (each of which has particular characteristics) might be in use at any moment, and legacy techniques do not have the capabilities to manage multiple connections. Further, legacy techniques are deficient in at least the regard that a failure might come in the form of a failed server (e.g., together with any or all services of the failed server, including use of any number of connections), and the legacy implementations have no way of recovering when a connection fails because its server has failed.
Worse, in modern high-availability database systems, the existence and configuration of the aforementioned form (or forms) of access control (e.g., locks, semaphores) might be complex (e.g., at a fine-grained level of access), and reclaiming extensive state might need to occur quickly and with a high degree of fidelity. Legacy techniques are deficient in this regard. Still worse, it can sometimes occur that a plurality of servers (and constituent connections) suffer concurrent or nearly concurrent failures (e.g., in a rack/blade situation), and legacy techniques either do not address this situation at all or are inadequate to recover quickly and with a high degree of fidelity with respect to the state of the system as a whole just prior to the failure or failures. Moreover, none of the aforementioned technologies has the capabilities to perform the herein-disclosed techniques for retaining and reclaiming resource locks and client states after one or more server failures. Therefore, there is a need for an improved approach.