The present invention relates to a concurrency control method for database systems, particularly for high-dimensional index structures which use a reinsert operation when an insert operation causes a node overflow. More particularly, the present invention relates to a technique by which the entity, deleted in a node of an index tree for reinsert operation, can be searched.
Last few years, a similarity search based upon a multidimensional data has been an important field of the database systems. The application areas of the similarity search are widely spread from GIS(Geographical Information System) to medical database systems and multimedia database systems.
One of the fundamental problems is how to efficiently find out objects similar to a given query from a number of multidimensional data sets. The researches about multidimensional index trees have been performed very frequently. And various index structures such as Grid File, Multilevel Grid File, R-Tree, R*-Tree, TV-Tree, X-Tree, SS-Tree, SR-Tree, CIR-Tree and Hybrid-Tree have been proposed. In the multi-user environment of real application, an appropriate concurrency control algorithm for index structures is necessarily needed. Most of index structures are expressed by a tree shape (hereinafter refers to index tree). Each element in an index tree is called as a node, and the node makes up the index tree by a hierarchical structure. The index tree is made by inserting/deleting entries into/from it.
In addition, each node of an index tree is able to have only limited number of entries. It is called an overflow when a node has no space to insert a new entry. When an overflow occurs in a node, reinserting some of the entries one by one into the index tree is able to enhance the performance.
The research of a concurrency control method for existing index tree is primarily about B-Tree for one-dimensional data. However, the research of a concurrency control method for R-Tree, index tree for multidimensional data, has begun recently. The concurrency control algorithms for existing multidimensional index tree can be divided into two classes. One is based upon Lock-Coupling Technique and the other is based upon Link Technique. A Concurrency Control algorithm based upon Lock-Coupling acquires lock in next-visiting nodes before unlocking the current node""s lock when traversing the index tree. When the node split is performed or MBR change is reflected, all nodes which relates to these operation should be locked. This method decreases the concurrency performance because one transaction should maintain the locks of many nodes simultaneously on search or insert operation.
A concurrency control algorithm based upon Link Technique is proposed to solve this problem. The previous research proves that the concurrency control method based upon Link Technique has a better performance than that based upon Lock-Coupling Technique. The concurrency control algorithm based upon Link Technique need not perform Lock-Coupling. Namely, the Link Technique has only to acquire lock in one node during the traversing of the tree. However, performing MBR change or split operation still need to acquire locks in many nodes simultaneously.
The basic concept of Link Technique is to connect all nodes on each level via rightward-pointing links and to check and compensate the split of a lower node, which is not reflected upon the upper node, without performing lock-coupling on search operation.
RLink-Tree modifies and compensates the Link Technique of the BLink-Tree because of the structural difference between R-Tree and B-Tree. The modification introduces NSN(Node Sequence Number) and allocates it to each node. And the modification changes the entry structure from  less than MBR, child-page greater than  to  less than MBR, child-page, NSN greater than . The NSN is increased when a new node is made, and each new node is given the NSN of the old node, and the old node is given an increased NSN. And then, the new node is linked to the rightward of the old node. Therefore, the search operation, when traversing the tree, can determine whether the child node is split or not by the value of NSN. The search process, which memorizes the NSN recorded in the entry  less than MBR, child-page, NSN greater than  of the parent node, moves rightward when the value of NSN of child node is larger than the memorized NSN. When the search process meets the node having the same NSN to the memorized NSN, it stops moving sideward and proceeds to perform a search of lower nodes.
This method has some drawbacks. First, when some transactions perform the operation of changing the tree structure (i.e. node split and MBR change), the nodes of two or more levels should be held locks at the same time, and consequently the search process can be delayed during I/O operations. Second, since the data structure of entry changes into  less than MBR, child-page, NSN greater than , the additional information, NSN lowers the efficiency of storage space and consequently the fan out of a non-leaf node is decreased.
The improved method solves the problem which wastes the storage space owing to the added NSN to the data structure of the entry. The improved method also allocates NSN to each node like RLink-Tree, checks the split through the rightward-pointing link, and determines whether to go on making tours through the rightward-pointing link. However, unlike RLink-Tree, the improved method provides the means to determine whether the lower node splits without recording the NSN of the lower node in the entry of the non-leaf node. Allocating NSN by using a global counter enables this.
When a node splits, the new node is given the NSN of the old node, and the old node is given the new NSN. And then, the operation of traversing the tree memorizes the value of the global counter of visited node. The operation of traversing the tree compares the NSN of the current node and the memorized value of the global counter when visiting the lower node and concludes that the split is not reflected on the upper node if the NSN of the current node is larger. The search operation stops moving to neighbor nodes when it meets the node having smaller NSN than the memorized NSN of the global counter and continues making tours to lower nodes. However, this method has a problem. The split of node using a global counter should acquire lock in the upper node before the split of the current node and proceed to split. Besides, it should not release the locks of the nodes which relate to the split operation until the split operation is terminated for the purpose of recovery.
Therefore, an object of the present invention is to provide the method for preventing the problem of lowering the efficiency of the storage space due to the NSN included in the entry of the leaf node by using an LSN as a global counter, and to provide the method for the search operation to refer to the specified node(the reinsertion node) which stores the deleted entries from the tree for the purpose of the reinsert operation.
Another object of the present invention is to provide the method for the lock in the index node not to affect the search operation at all by mixing latch and lock.
A further another object of the present invention is to provide a high searching performance because the present invention acquires only shared-mode latch (hereinafter refers to shared latch) in the visiting node during the search operation and a delay due to the lock of the insert or delete operation doesn""t occur.
In accordance with an aspect of the present invention, there is provided a method for searching a high-dimensional index tree of a database, wherein the concurrency of the database can be controlled, comprising the steps of: a) adding a root node to the queue and acquiring the shared lock for reinsertion node; b) determining whether the queue is empty or not, fetching a node from the queue and assigning the fetched node as a current node if queue is not empty, releasing the shared lock and terminating the search process if queue is empty; c) acquiring the shared latch in the current node, selecting the lower nodes which are within the query range and adding the selected nodes to the queue if current node is not leaf or to the result set if current node is leaf; d) returning to the step b).
In accordance with an aspect of the present invention, there is provided a method for inserting an entry to the high-dimensional index tree of a database, wherein the concurrency of the database can be controlled, comprising the steps of: a) searching a leaf node to which a new entry is inserted; b) determining whether an insert for the new entry to the searched leaf node causes any overflow; c) when the result of the step b) is positive, acquiring a tree lock, performing the process for the overflow of the leaf node and then releasing the tree lock; and d) when the result of the step b) is negative, inserting the new entry to the leaf node, acquiring a tree lock changing a MBR and releasing the tree lock.
In accordance with an aspect of the present invention, there is provided a computer readable medium having computer-executable instructions stored thereon, for performing a method for searching the high-dimensional index tree of a database, wherein the concurrency of the database can be controlled, the method comprising the steps of: a) adding a root node to the queue and acquiring the shared lock for reinsertion node; b) determining whether the queue is empty or not, fetching a node from the queue and assigning the fetched node as a current node if queue is not empty, releasing the shared lock and terminating the search process if queue is empty; c) acquiring the shared latch in the current node, selecting the lower nodes which are within the query range and adding the selected nodes to the queue if current node is not leaf or to the result set if current node is leaf; d) returning to the step b).
In accordance with an aspect of the present invention, there is provided a computer readable medium having computer-executable instructions stored thereon, for inserting an entry to the high-dimensional index tree of a database, wherein the concurrency of the database can be controlled, the method comprising the steps of: a) searching a leaf node to which a new entry is inserted; b) determining whether an insert for the new entry to the searched leaf node causes any overflow; c) if the result of the step b) is positive, acquiring a tree lock, performing a process for the overflow of the leaf node and then releasing the tree lock; and d) if the result of the step b) is negative, inserting the new entry to the leaf node, acquiring a tree lock, changing a MBR and releasing the tree lock.