1. Field of the Invention
The present invention relates, in general, to a shared-nothing spatial database cluster system and, more particularly, to a recovery method using extendible hashing-based cluster logs in a shared-nothing spatial database cluster, which eliminates the duplication of cluster logs required for cluster recovery in a shared-nothing database cluster, so that recovery time is decreased, thus allowing the shared-nothing spatial database cluster system to continuously provide stable service.
2. Description of the Related Art
A database cluster is a database in which nodes, each independently capable of providing services, are connected to each other through a high speed network and act as a single system. The database cluster provides a division policy, under which a piece of data is divided into smaller pieces that are managed by different nodes, thus improving simultaneous throughput for update operations. Further, the database cluster provides a replication policy, under which duplicates of the respective data are kept in other nodes, thus providing availability so that service continues even if a failure occurs in one node. Further, the database cluster provides idle nodes, so that, if the number of users rapidly increases and the load increases, the idle nodes are used for online extension, thus providing the extensibility needed to accommodate the additional users.
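The division and replication policies described above can be illustrated with a minimal sketch. The hash rule, the group layout and the names `assign_partition` and `place_record` are assumptions made for illustration only; the text does not specify how data is divided among nodes.

```python
from typing import List
from zlib import crc32


def assign_partition(key: str, num_groups: int) -> int:
    """Division policy (assumed rule): hash the record key to pick one group."""
    return crc32(key.encode("utf-8")) % num_groups


def place_record(key: str, groups: List[List[str]]) -> List[str]:
    """Replication policy: every node of the chosen group keeps a duplicate."""
    return groups[assign_partition(key, len(groups))]


# Hypothetical cluster: two groups of two nodes each.
groups = [["node-a1", "node-a2"], ["node-b1", "node-b2"]]
holders = place_record("parcel-17", groups)
```

Here `holders` lists the nodes that store the record. If one of them fails, the other node of the same group can continue to serve it.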
Database clusters are classified into a shared memory scheme, a shared disk scheme and a shared-nothing scheme, which are shown in FIGS. 1a to 1c.
The shared memory scheme of FIG. 1a denotes a structure in which all nodes have independent processes, perform operations, and can directly access global shared memory and disks. This shared memory scheme is disadvantageous in that the network load increases excessively in order to access the shared memory, and in that, since all processes use the shared memory, contention for access to shared resources increases. Therefore, each node must independently set the size of its cache memory to the maximum.
The shared disk scheme of FIG. 1b denotes a structure in which all nodes include respective processors and memory devices and directly access shared disks to process transactions. This scheme is disadvantageous in that, since all nodes share the disks, locking frequently occurs on the desired resources, and update operations must be performed equally on all disks. Therefore, as the number of disks increases, the load of update operations increases.
The shared-nothing scheme of FIG. 1c denotes a structure in which respective nodes are implemented as independent systems and separately include their own memory devices and disks. This scheme is advantageous in that, since the dependence of each node on resources is minimized and each node is not influenced by other nodes at the time of processing transactions, extension is easily performed and parallelism for complicated query processing is excellent. Therefore, it is preferable that the database cluster use the shared-nothing scheme that can be easily extended and has excellent parallelism.
In the shared-nothing database cluster, a recovery technique is considered to be very important for high availability. To make recovery effective, attempts have been made to reduce the transmission load incurred when cluster logs, which are maintained for consistency between nodes, are transmitted to a recovery node, and to reduce the recovery time of the recovery node.
Generally, the recovery of the shared-nothing database cluster includes a node recovery procedure of recovering an individual node and a cluster recovery procedure of recovering cluster configuration.
Node recovery is a procedure that maintains the consistency of the data belonging to a node up to the time at which a failure occurs in the node. Cluster recovery is a procedure that, when a failure occurs in the node, maintains the consistency of the data from the time at which node recovery terminates to the time at which the data participate in the configuration of the cluster.
If a failure occurs in a node, node recovery is performed first to maintain the consistency of the node itself. Thereafter, the recovery of cluster configuration is performed, so that the consistency of operations processed after the failure occurred is maintained. Once the recovery of cluster configuration is completed, the failed node resumes normal service with respect to all operations.
Typical database cluster recovery techniques include the recovery technique of ClustRa, the recovery technique of Replication Server, the recovery technique of Group Membership Services (GMS)/Cluster, etc.
FIG. 2 illustrates the system configuration of ClustRa. ClustRa is a main memory-based database cluster, which provides a service of configuring a cluster using non-spatial data. ClustRa has a structure in which nodes independently capable of processing queries are connected to each other through a high speed network, and a master node and a backup node form a single group and maintain the same data duplicate.
ClustRa divides a single piece of data into small pieces of data using a division policy applied between groups, and respective groups independently maintain the small pieces of data, thus increasing simultaneous throughput. Further, ClustRa maintains the same data duplicate in respective groups using a replication policy applied to groups, so that a group having the duplicate can continuously provide service when a failure occurs in another node. However, if a single duplicate exists and a failure occurs in two groups in the worst case, service cannot be provided. Therefore, the rapid recovery of the failed node heavily influences the continuous provision of service.
If a failure occurs in a node, ClustRa performs a recovery procedure using an internal log required to recover the node itself and distribution logs required to recover cluster configuration. The distribution logs are generated to propagate duplicates in a typical query process and must be stored in a stable storage device. The synchronization of distribution logs is controlled in the duplicates by means of the sequence of logs.
A ClustRa node periodically transmits a message “I am alive” to another node in the same group to detect a stoppage, and waits for a response. If a response is not returned in a certain period of time, it is determined that a failure has occurred in the other node. After the failed node completes recovery of itself using an internal log, the node performs cluster recovery by sequentially receiving distribution logs. However, the recovery technique of ClustRa has the following problem. That is, since node-based distribution logs are maintained in a single queue, the maintenance load for distribution logs is increased, and since the distribution logs are sequentially transmitted to a recovery node, recovery time is increased.
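The timeout-based failure detection described above can be sketched as follows. This is an illustrative sketch only, not ClustRa code: the class name `HeartbeatMonitor`, the timeout value and the use of explicit timestamps are all assumptions.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Tracks the last time each peer answered an "I am alive" message."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout                  # assumed value, in seconds
        self.last_seen: Dict[str, float] = {}   # node name -> last response time

    def record_response(self, node: str, now: float) -> None:
        """Call whenever a response arrives from the given node."""
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> List[str]:
        """Nodes whose last response is older than the timeout are deemed failed."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.record_response("backup", now=0.0)
monitor.record_response("master", now=0.0)
monitor.record_response("master", now=4.0)   # only the master keeps responding
suspected = monitor.failed_nodes(now=5.0)    # the backup has been silent for 5 s
```

Passing the current time as an argument keeps the sketch deterministic. A real node would read a clock and send the messages over the network.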
Next, Replication Server is a system in which nodes independently capable of processing queries are bundled and constructed as a single server, and which provides only a data replication policy without a data division policy. This Replication Server is constructed using two or more nodes to provide a complete replication technique, thus improving simultaneous throughput. Further, even if two or more nodes are stopped, continuous service can be provided as long as a single node remains available. The system construction of the Replication Server is shown in FIG. 3.
If an arbitrary node is stopped in Replication Server, service is continuously provided by the other nodes. When the stopped node recovers, its transaction system is recovered first, and then the cluster configuration is recovered by a replication management system.
At this time, a recovery node sends other nodes a message, indicating that the node has recovered, together with a last log number of the replication management system processed by the recovery node. The nodes, having received the message, select a single node to help the recovery node configure the replication of the recovery node. The selected recovery management node sequentially transmits logs starting from a log subsequent to the log number received from the recovery node, among the logs included in the replication management system belonging to the recovery management node, to all nodes through a group transmission system. The recovery node receives the logs to perform recovery, and can normally process queries after all recovery procedures have been completed.
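The log catch-up step described above can be sketched as follows. The log entry layout `(log_number, key, value)` and the function names are hypothetical and chosen only to illustrate the idea; they are not the Replication Server's actual interfaces.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[int, str, int]  # (log number, record key, new value), assumed layout


def logs_to_send(helper_log: List[LogEntry], last_log_number: int) -> List[LogEntry]:
    """Recovery management node: select logs after the number it was sent."""
    return [entry for entry in helper_log if entry[0] > last_log_number]


def recover(state: Dict[str, int], last_log_number: int,
            helper_log: List[LogEntry]) -> int:
    """Recovery node: apply the received logs in order, return the new log number."""
    for number, key, value in logs_to_send(helper_log, last_log_number):
        state[key] = value
        last_log_number = number
    return last_log_number


# The recovery node had processed logs 1 and 2 before it stopped.
helper_log = [(1, "a", 10), (2, "b", 20), (3, "a", 11), (4, "c", 30)]
state = {"a": 10, "b": 20}
new_last = recover(state, 2, helper_log)
```

Because the logs are applied strictly in sequence, the recovery node can resume normal query processing only after the final entry has been applied.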
However, the Replication Server has the following problems. First, logs are kept for all tables of each node, backup tables as well as master tables, thus increasing log maintenance cost. Second, since logs are maintained on a database basis, normal service can be provided only after all databases have recovered, thus increasing recovery time.
Meanwhile, GMS/Cluster is a system which has nodes independently capable of processing queries in a shared-nothing structure, and in which 2 to 4 nodes are bundled into a group. The GMS/Cluster uses a complete replication technique allowing all nodes in a group to maintain the same data, so that simultaneous throughput for a search operation is increased. Further, the GMS/Cluster provides availability to continuously provide service even if a failure occurs in one node. The GMS/Cluster provides division policy between groups, thus increasing simultaneous throughput for an update operation and efficiently managing large capacity data. An idle node is a node that does not process queries, and is used for online extension.
However, if a failure occurs in one node, the overall load of processing queries increases. Therefore, rapid recovery is important in order to provide stable service.
FIG. 4 is a system configuration view of the GMS/Cluster. The GMS/Cluster system is implemented so that nodes are connected to each other through a high speed network and immediately sense a failure when the failure occurs in one node. If a failure occurs in a node, the GMS/Cluster system performs a recovery procedure using a local log required to recover that node and cluster logs required to recover cluster configuration. The local log is the same as a conventional single-database log and must exist in all nodes. The cluster logs are recorded independently, on a table basis, in a master table. When the failed node completes recovery of itself, the node requests cluster logs from the other nodes in the group and performs a recovery procedure on the basis of the cluster logs.
However, the GMS/Cluster system has the following problems. First, if a plurality of operations occurs with respect to a single record, a plurality of pieces of update information for that record is maintained in the cluster logs, so that the size of the cluster logs increases and transmission cost increases. Second, since a recovery node repeatedly performs operations on a single record several times, recovery time increases.
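The cost described above can be made concrete with a small sketch. The log format `(operation, record id, value)` and the function `replay` are assumptions made for illustration; they do not reproduce the GMS/Cluster log layout.

```python
from collections import Counter
from typing import Dict, List, Tuple

LogEntry = Tuple[str, str, str]  # (operation, record id, value), assumed layout

# Three operations touch rec-1, so three log entries are kept for it.
cluster_log: List[LogEntry] = [
    ("insert", "rec-1", "v1"),
    ("update", "rec-1", "v2"),
    ("update", "rec-1", "v3"),
    ("insert", "rec-2", "w1"),
]


def replay(log: List[LogEntry]) -> Tuple[Dict[str, str], int]:
    """Apply every log entry in order and count the operations performed."""
    state: Dict[str, str] = {}
    applied = 0
    for _op, record_id, value in log:
        state[record_id] = value
        applied += 1
    return state, applied


state, applied = replay(cluster_log)
entries_per_record = Counter(record_id for _op, record_id, _value in cluster_log)
```

In this example the recovery node receives and applies four log entries, although only two records exist after replay and only the last value of each record survives.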