1. Field of the Invention
The present invention relates to a system and method of organizing nodes within a tree structure, and in particular to a system and method of organizing within a tree structure a plurality of nodes representing physical entities.
2. Description of the Prior Art
It is known to organize and manage a group of physical entities by representing those physical entities as a plurality of nodes within a tree structure, e.g. a binary tree structure. By creating such a tree structure to represent the physical entities, it is then possible to perform searches within the tree structure to find a particular physical entity, or a physical entity that meets some predetermined criteria. For example, the physical entities represented by the nodes of the tree structure may be blocks of memory and a search may be performed to find a particular block of memory, or a block of memory of a size greater than or equal to some predetermined threshold.
Each node within the tree structure will have a number of fields associated therewith. Typically, one field will be identified as a "key" and this key may be used to organize the nodes within the tree structure, so that the exact location of a particular node is dependent on that key. For example, considering the example where the physical entities may be blocks of memory, the key may be chosen to be the start address of each memory block, and the nodes within the tree may then be organized based on that address key, so that the nodes are sorted on increasing address.
It will be appreciated that by organizing the tree in such a way, it is then easy to perform searches based on the chosen key. However, it is often the case that a search needs to be based on more than a single parameter. Since trees such as binary trees can generally only be sorted on a single key, the value of any other parameter required for searching will typically have to be provided within an auxiliary field associated with each node. The auxiliary field associated with a particular node may specify the value of a parameter which is not specific to that node alone, but also takes into account the value of that parameter as associated with all of its child nodes. For example, returning again to the example where the physical entities are blocks of memory, it may be desirable to perform a "first-fit" search, which aims to find the block of memory with the smallest address that has a size larger than or equal to a specified size. In such cases, each node would typically have an auxiliary field containing the maximum block size of itself and all of its children, and the nodes would be sorted within the tree by address key.
However, whilst this approach enables such searching to be performed, there is a significant amount of overhead in maintaining the auxiliary fields associated with each node. For example, to ensure predictable searching times, it is desirable for the trees to be balanced, i.e. for the tree to have a fixed maximum depth, and this requires that whenever a node is inserted or deleted, a rebalancing process is performed. Rebalancing is in itself a complicated procedure; for example even the relatively relaxed Red-Black trees require elaborate balancing steps (see "An Introduction to Algorithms" by Thomas Cormen, Charles Leiserson and Ronald Rivest, MIT Press, 1990, for a description of Red-Black trees). However, this process is further complicated when auxiliary fields are associated with each node, because in such cases the auxiliary fields will need to be recalculated for every node affected by the insertion or deletion. Further, it should be noted that since rebalancing progresses from the leaves towards the root, the tree must either be doubly linked, or a separate list with the path taken from the root must be made during the downward traversal.
It is an object of the present invention to provide an improved system and method for organizing a plurality of nodes within a tree structure.
Viewed from a first aspect, the present invention provides a method of organizing within a tree structure a plurality of nodes representing physical entities, the tree structure defining a number of node locations, each node location being reached via a predetermined path from a root node of the tree structure, the method comprising the steps of: (i) associating first and second keys with each node to be included in the tree structure, the value of at least the first key being unique for each node; (ii) arranging the nodes within the tree structure by sorting the nodes with respect to both the first key and the second key, the sorting with respect to the first key being such that each node may be positioned within the tree structure at any node location along the path from the root node to the node location specified by the first key; whereby a search can be performed for a node within the tree structure based on specified criteria for both the first and second keys.
In accordance with the present invention, each node has both first and second keys associated therewith, the value of at least the first key being unique for each node. Then, the nodes within the tree structure are sorted with respect to both the first key and the second key. In a typical prior art tree structure, such as a binary tree structure, this would not be possible, as the sorting with respect to the first key would preclude a further sorting with respect to any defined second key. However, in accordance with the present invention, the tree structure is defined such that the sorting with respect to the first key is such that each node may be positioned within the tree structure at any node location along the path from the root node to the node location specified by the first key. Hence, the exact placing of nodes based on the first key is less restricted than in the known prior art search trees, and this provides the flexibility to further sort the tree with respect to the second key. As a result, it is possible for a search to be performed for a node within the tree structure based on specified criteria for both the first and second keys.
It will be appreciated by those skilled in the art that the above defined tree structure may be used to represent a number of different types of physical entities. However, in preferred embodiments, the physical entities are free blocks of memory within a memory region, and the first key for each node is an address key identifying an address associated with the block of memory represented by that node.
Preferably, the address key identifies a start address for the block of memory.
In preferred embodiments, each node location has an address range associated with it such that a node positioned at that node location must represent a block of memory whose start address is within that address range, the root node location having the entire address range of the memory region associated with it. It will be appreciated that this approach enables any of the free blocks of memory to be allocated as the root node, since all of the free blocks of memory will fall within the address range associated with the root node location.
In preferred embodiments, the tree structure is a binary tree structure, and a number of the nodes are parent nodes, each parent node having at most two child nodes associated therewith, a first child node being positioned at a node location whose address range covers a first half of the parent node location's address range, and a second child node being positioned at a node location whose address range covers a second half of the parent node location's address range. Hence, considering the root node, which can represent any of the free blocks of memory, the only requirement for the two child nodes is that the first child node is in the bottom half of the address range of the memory region, whilst the second child node is in the upper half of the address range of the memory region. It should be noted that the choice of the two child nodes is hence independent of the actual block of memory represented by the root node, and in particular it is hence possible for both child nodes to have an address lower than the address of the root node, or for both child nodes to have an address higher than the address of the root node. This requirement is not specific to the root node and its children, but rather applies to the relationship between any parent node and its two child nodes. Because of this flexibility, it is then possible to provide further sorting based on a second key.
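By way of illustration only, the halving of a node location's address range may be sketched as follows. The function names `split_range` and `fits_location`, the half-open range convention and the 1024-byte region are assumptions made for this example and do not appear in the description above.

```python
def split_range(lo, hi):
    """Split the half-open address range [lo, hi) of a node location into
    the ranges of its first and second child node locations."""
    mid = lo + (hi - lo) // 2
    return (lo, mid), (mid, hi)


def fits_location(addr, lo, hi):
    """A block may occupy a node location only if its start address is in [lo, hi)."""
    return lo <= addr < hi


# Hypothetical 1024-byte region: the root node location covers [0, 1024).
first, second = split_range(0, 1024)

# Any free block may sit at the root; here it starts at address 900.
root_addr = 900
# Both children have lower addresses than the root, which is permitted,
# because only the halves of the parent location's range constrain them.
first_child_addr, second_child_addr = 100, 600
```

In this sketch the child at address 100 falls in the first half and the child at address 600 in the second half, although both addresses are lower than that of the root.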
In preferred embodiments, the second key for each node is a size key identifying the size of the block of memory represented by that node, the nodes being sorted with respect to the second key at said step (ii) in order to give the tree structure a heap property, with the root node being the node representing the largest free block of memory.
The heap property is exhibited by the tree structure, since the root node contains the largest free block of memory, and the size of any parent is greater than the size of either of its children. Further, since the nodes are also ordered on address, as the tree is traversed in a first direction the nodes represent blocks with smaller addresses, whilst if the tree is traversed in the opposite direction, the nodes represent blocks with larger addresses.
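The two orderings may be checked together, for example as in the following minimal sketch. The `Node` class, the function `is_ordered` and the sample block values are hypothetical; the sketch also assumes that a child may be equal in size to its parent, which the text above does not state.

```python
class Node:
    """Hypothetical node: start address, block size and two child links."""
    def __init__(self, addr, size, left=None, right=None):
        self.addr, self.size = addr, size
        self.left, self.right = left, right


def is_ordered(node, lo, hi, max_size=None):
    """True if every node lies in its location's range [lo, hi) and no
    child represents a larger block than its parent (heap property)."""
    if node is None:
        return True
    if not (lo <= node.addr < hi):
        return False
    if max_size is not None and node.size > max_size:
        return False
    mid = lo + (hi - lo) // 2
    return (is_ordered(node.left, lo, mid, node.size)
            and is_ordered(node.right, mid, hi, node.size))


# A valid tree over the assumed region [0, 1024).
good = Node(600, 300,
            Node(100, 50, Node(40, 20), Node(300, 30)),
            Node(900, 100))
# Child larger than its parent: violates the heap property.
bad_heap = Node(600, 30, Node(100, 50))
# First child outside the lower half: violates the address ordering.
bad_addr = Node(600, 300, Node(700, 50))
```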
Such an arrangement enables real-time performance of queries such as first-fit queries, whilst avoiding the drawbacks of having to provide auxiliary fields identifying block size information. In this context, "real-time" means that allocation of a memory block and freeing of a memory block each take time of the order of log(N) (often stated as O(log(N)) time), where N is the number of free blocks of memory.
In preferred embodiments, a search can be performed within the binary tree structure to find the free block of memory having the smallest address whilst also having a size equal to or exceeding a specified size, the search comprising performing steps equivalent to executing the steps of: (a) initializing a best first-fit variable; (b) setting a current node to be the root node; (c) if the current node represents a block of memory smaller than the specified size, or if the current node is empty, outputting the best first-fit variable as the search result and terminating the process; (d) if the current node represents a block of memory equal to or larger than the specified size, and having an address lower than the node specified by the best first-fit variable, updating the best first-fit variable to identify the current node; (e) if a first child node is non-empty and represents a block of memory equal to or larger than the specified size, then setting the current node to be the first child node, otherwise setting the current node to be a second child node; (f) repeating steps (c) to (e) until the best first-fit variable is output.
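Steps (a) to (f) above may be sketched as follows. This is an illustrative sketch only: the `Node` class, the helper `sample_tree` and the block values are assumptions, and `None` is used here both for an empty node and for an uninitialized best first-fit variable.

```python
class Node:
    """Hypothetical node: start address, block size and two child links."""
    def __init__(self, addr, size, left=None, right=None):
        self.addr, self.size = addr, size
        self.left, self.right = left, right


def first_fit(root, wanted):
    best = None                                   # step (a)
    cur = root                                    # step (b)
    while cur is not None and cur.size >= wanted:  # step (c)
        if best is None or cur.addr < best.addr:  # step (d)
            best = cur
        # Step (e): prefer the first child if it is large enough,
        # otherwise continue with the second child.
        if cur.left is not None and cur.left.size >= wanted:
            cur = cur.left
        else:
            cur = cur.right
    return best                                   # output of step (c)


def sample_tree():
    """Assumed free list over the region [0, 1024), already ordered."""
    return Node(600, 300,
                Node(100, 50, Node(40, 20), Node(300, 30)),
                Node(900, 100))
```

With the assumed tree, a request for 25 bytes yields the block at address 100, a request for 80 bytes yields the block at address 600, and a request for 400 bytes yields no block.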
Once the desired block of memory has been found, it will typically be allocated for the storage of data, and hence will need to be removed from the tree structure, since the tree only represents free blocks of memory that are available for allocation. Further, when the block of memory is no longer required for the storage of data, it will be freed, and will hence need to be inserted back into the tree structure.
In preferred embodiments, a new node is inserted in the binary tree structure by performing steps equivalent to executing the steps of: (a) setting a current node to be the root node; (b) if the current node is empty, inserting the new node and terminating the process; (c) if the new node has a size larger than the current node, swapping the new node with the current node, such that the new node to be inserted is the smaller node; (d) if the address of the new node is in a first half of an address range associated with the node location of the current node, setting a first child node of the current node to be the current node, or if the address of the new node is in a second half of an address range associated with the node location of the current node, setting a second child node of the current node to be the current node; (e) repeating steps (b) to (d) until the new node has been inserted.
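The insertion steps above may be sketched recursively as follows. The `Node` class, the function `insert` and its `lo`/`hi` parameters (the half-open address range of the current node location) are assumptions for this example; the swap of step (c) is realized here by letting the new node take over the location and child links of the current node.

```python
class Node:
    """Hypothetical node: start address, block size and two child links."""
    def __init__(self, addr, size):
        self.addr, self.size = addr, size
        self.left = self.right = None


def insert(cur, new, lo, hi):
    """Insert `new` below the location holding `cur`, whose address range
    is [lo, hi); return the node that now occupies that location."""
    if cur is None:                      # step (b): empty location found
        return new
    if new.size > cur.size:              # step (c): keep the larger block here
        new.left, new.right = cur.left, cur.right
        cur.left = cur.right = None
        cur, new = new, cur              # the smaller node is still to be inserted
    mid = lo + (hi - lo) // 2
    if new.addr < mid:                   # step (d): first half of the range
        cur.left = insert(cur.left, new, lo, mid)
    else:                                # step (d): second half of the range
        cur.right = insert(cur.right, new, mid, hi)
    return cur


def build(blocks, lo, hi):
    """Insert (addr, size) pairs one by one, starting from an empty tree."""
    root = None
    for addr, size in blocks:
        root = insert(root, Node(addr, size), lo, hi)
    return root
```

For example, inserting the assumed blocks (100, 50), (600, 300), (900, 100), (40, 20) and (300, 30) into the region [0, 1024) leaves the largest block, at address 600, at the root.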
Further, in preferred embodiments, a selected node is removed from the binary tree structure by performing steps equivalent to executing the steps of: (a) if the selected node has no valid child nodes associated with it, removing the reference to the selected node from its parent node, and terminating the process; (b) exchanging the node location of the selected node with the node location of the one of its child nodes that represents the larger block of memory; (c) repeating steps (a) and (b) until the selected node has been propagated to a node location where it has no valid child nodes, and accordingly is effectively removed at said step (a).
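One way of performing steps equivalent to the removal steps above is sketched below. Rather than literally exchanging node locations, the sketch promotes the larger child into the vacated location at each level, which has the same effect. The names `Node`, `detach`, `remove` and `sample_tree` are hypothetical, and the rule for choosing between equal-sized children is an assumption.

```python
class Node:
    """Hypothetical node: start address, block size and two child links."""
    def __init__(self, addr, size, left=None, right=None):
        self.addr, self.size = addr, size
        self.left, self.right = left, right


def detach(node):
    """Return the subtree that occupies `node`'s location once it is removed."""
    if node.left is None and node.right is None:
        return None                            # step (a): no valid children
    take_left = node.right is None or (
        node.left is not None and node.left.size >= node.right.size)
    if take_left:                              # step (b): larger child moves up
        up = node.left
        up.left, up.right = detach(up), node.right
    else:
        up = node.right
        up.left, up.right = node.left, detach(up)
    return up


def remove(cur, addr, lo, hi):
    """Remove the node with start address `addr` below location [lo, hi)."""
    if cur is None:
        return None
    if cur.addr == addr:
        return detach(cur)
    mid = lo + (hi - lo) // 2
    if addr < mid:
        cur.left = remove(cur.left, addr, lo, mid)
    else:
        cur.right = remove(cur.right, addr, mid, hi)
    return cur


def sample_tree():
    """Assumed free list over the region [0, 1024), already ordered."""
    return Node(600, 300,
                Node(100, 50, Node(40, 20), Node(300, 30)),
                Node(900, 100))
```

Removing the root of the assumed tree promotes the block at address 900, the larger of its two children, to the root node location.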
It will be appreciated that the above techniques allow insertion and deletion of nodes whilst preserving the ordering with respect to both the first and second key. Hence, the binary search tree will continue to have nodes sorted on address, and with sizes heap-ordered (and hence the tree can be considered to be horizontally sorted on address and vertically sorted on size, assuming an orientation where the root node is at the top, and the leaf nodes are at the bottom of the tree).
Although, as described above, the binary tree structure of preferred embodiments can be used to perform first-fit queries, it is also possible to perform other queries within the binary tree structure. For example, in preferred embodiments, a search for a particular node within the binary tree structure having a specified address key can be made by performing steps equivalent to executing the steps of: (a) setting a current node to be the root node; (b) if the current node is empty, indicating that the particular node has not been found, and terminating the process; (c) if the current node has an address key equal to the specified address key, returning the current node as the search result and terminating the process; (d) if the specified address key specifies an address in a first half of an address range associated with the node location of the current node, setting a first child node of the current node to be the current node, or if the specified address key specifies an address in a second half of the address range associated with the node location of the current node, setting a second child node of the current node to be the current node; (e) repeating steps (b) to (d) until the process is terminated.
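The address search of steps (a) to (e) may be sketched as follows, again with a hypothetical `Node` class, `find` function and sample tree, and with `lo` and `hi` denoting the assumed half-open address range of the memory region.

```python
class Node:
    """Hypothetical node: start address, block size and two child links."""
    def __init__(self, addr, size, left=None, right=None):
        self.addr, self.size = addr, size
        self.left, self.right = left, right


def find(root, addr, lo, hi):
    cur = root                               # step (a)
    while cur is not None:                   # step (b): stop on an empty node
        if cur.addr == addr:                 # step (c): node found
            return cur
        mid = lo + (hi - lo) // 2
        if addr < mid:                       # step (d): first half of the range
            cur, hi = cur.left, mid
        else:                                # step (d): second half of the range
            cur, lo = cur.right, mid
    return None                              # not found


def sample_tree():
    """Assumed free list over the region [0, 1024), already ordered."""
    return Node(600, 300,
                Node(100, 50, Node(40, 20), Node(300, 30)),
                Node(900, 100))
```

Note that the node being sought may be met anywhere along the path towards the node location specified by its address key, so the comparison of step (c) is made at every level.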
In preferred embodiments, the tree structure is based on a radix-2 tree, a basic radix-2 tree being described in "The Art of Computer Programming; Sorting and Searching" by Knuth, published by Addison-Wesley, (c) 1973, where such a tree is referred to as a digital search tree and described in §6.3 Digital Searching. A radix-2 tree is a binary tree where the left/right decision in tree level k is taken depending on bit k of the search key. Typically, radix-2 trees have been used in the prior art to sort nodes based on keys which do not have a finite size. For example, a radix-2 tree might be used to sort character strings. However, in accordance with the present invention, it was realized that if a radix-2 tree is used with a finite key, then the tree will automatically be balanced. This is because a radix-2 tree using D-bit keys will have a maximum depth of D and 2^D nodes, making radix-2 trees balanced by definition. Further, in accordance with the present invention, it has been found that radix-2 trees are able to be sorted with respect to a first key such that each node may be positioned within the radix-2 tree at any node location along the path from the root node to the node location specified by the first key, and that given this flexibility it is also then possible to order the nodes with respect to a second key. This allows radix-2 trees with fewer than 2^D actual nodes. The actual number of nodes N in the tree must satisfy N = 2^(L×D) with the load factor L in (0, 1] for logarithmic performance (e.g. L = 0.5 means the number of elements in the tree is 2^(D/2), and the maximum depth D of the tree is twice the average depth L×D).
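The bit-wise left/right decision and the load factor may be illustrated with the following sketch. The function names `path_bits` and `load_factor` are hypothetical, bit 0 is assumed to be the most significant bit of a D-bit key, and the values D = 10 and N = 32 are chosen only for the example.

```python
import math


def path_bits(key, depth):
    """Left/right decisions (0 = first child, 1 = second child) for a
    `depth`-bit key, taking bit k at tree level k, most significant first."""
    return [(key >> (depth - 1 - k)) & 1 for k in range(depth)]


def load_factor(n_nodes, depth):
    """Load factor L such that n_nodes = 2 ** (L * depth)."""
    return math.log2(n_nodes) / depth


# With 10-bit keys the tree can never be deeper than 10 levels.
decisions = path_bits(300, 10)

# 32 nodes in a tree of maximum depth 10 give L = 0.5, i.e. an average
# depth of L * D = 5, half of the maximum depth.
L = load_factor(32, 10)
```

Under these assumptions, testing successive key bits from the most significant end selects the same child as halving the address range of a region whose size is a power of two.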
Viewed from a second aspect, the present invention provides a system for managing a tree structure having a plurality of nodes representing physical entities, the tree structure defining a number of node locations, each node location being reached via a predetermined path from a root node of the tree structure, the system comprising: (i) means for associating first and second keys with each node to be included in the tree structure, the value of at least the first key being unique for each node; (ii) a sorter for arranging the nodes within the tree structure by sorting the nodes with respect to both the first key and the second key, the sorting with respect to the first key being such that each node may be positioned within the tree structure at any node location along the path from the root node to the node location specified by the first key; whereby a search can be performed for a node within the tree structure based on specified criteria for both the first and second keys.
Viewed from a third aspect, the present invention provides a computer program product on a computer readable medium for creating and managing with a data processing system a tree structure having a plurality of nodes representing physical entities, the tree structure defining a number of node locations, each node location being reached via a predetermined path from a root node of the tree structure, the computer program product comprising: a key associater for associating first and second keys with each node to be included in the tree structure, the value of at least the first key being unique for each node; a sorter for arranging the nodes within the tree structure by sorting the nodes with respect to both the first key and the second key, the sorting with respect to the first key being such that each node may be positioned within the tree structure at any node location along the path from the root node to the node location specified by the first key; whereby a search can be performed for a node within the tree structure based on specified criteria for both the first and second keys.
Viewed from a fourth aspect, the present invention provides a method of providing a balanced binary tree structure having a plurality of nodes representing physical entities, the binary tree structure defining a number of node locations, each node location being reached via a predetermined path from a root node of the binary tree structure, the method comprising the steps of: (i) using a radix-2 tree for the binary tree structure; (ii) associating a first key with each node to be included in the binary tree structure, the value of the first key being unique for each node and being of a finite size; (iii) arranging the nodes within the binary tree structure by sorting the nodes with respect to the first key, whereby the radix-2 tree is automatically balanced.
As mentioned earlier, in accordance with the present invention, it has been realized that if the first key is chosen such that it is unique and of a finite size, then if the nodes are arranged within a radix-2 tree based on that key, the radix-2 tree will be automatically balanced, hence avoiding the requirement for complex rebalancing techniques to be applied each time a node is inserted or deleted.
In preferred embodiments, the sorting with respect to the first key at said step (iii) is such that each node may be positioned within the binary tree structure at any node location along the path from the root node to the node location specified by the first key, thereby facilitating the further sorting of the binary tree structure with respect to a second key.