The invention relates to a method for determining the storage address of a predetermined data value in a memory device, wherein said data values are stored at predetermined storage addresses according to a binary tree data structure of nodes, branches, subtrees and leaves. In particular, the memory device may be addressed with an address consisting of two parts, i.e. a column address part and a row address part, wherein the entries in the memory device are stored in a two dimensional matrix array.
There is often the need of performing a fast search for a particular data value in a memory which contains a great number of entries. For example, assume a situation in which a person""s personal identification number is stored in a memory device and has associated with it certain information about the person""s age, profession or car registration. This identification number may consist of many digits, i.e. in the memory there may be stored a large number of values. A different example is a CD-ROM where telephone numbers are stored and each telephone number has associated with it specific information indicating the home address belonging to that telephone number.
In order to avoid that a second memory device must be used, in which the specific locations of the individual entries are stored, i.e. in which the storage addresses are stored, the memory device containing the plurality of data values is searched for a particular data value and when the data value has been found, the associated information is read out. Of course, the data values (the personal identification number or the telephone number) may be stored randomly in the memory device and in this case the complete memory will have to be searched column- and row-wise to locate the data value, since no information is available whatsoever as to where the data value of interest could possibly be located.
To avoid a searching through the memory device from entry to entry which can take an almost indefinite amount of time regarding the size of the memory device, the data values (not the information associated therewith) is stored in the memory device according to predetermined rules which can be used later to re-locate the data value during the search procedure.
FIG. 5 shows an example of a two-dimensional memory where the data entries are addressed by an address having a row address part A and a column address part B, i.e. the address S= less than A(X), B(X) greater than . X may be regarded as the node in a binary tree where the specific row address and column address intersect. As is seen in FIG. 5, the individual data values D may be stored randomly in the memory device and the usual procedure is to use a binary tree data structure for representing the sorted data values.
As is shown in FIG. 6, in the binary tree data structure each node (the intersection point X where a data value can be entered and read from) is connected to two other nodes. The search through the binary tree (or the memory device having entries mapped thereto) is as follows. A data value D1 is read out from a node X (located at row address A(X) and column address B(X)) and the data value D1 is compared with the data value D whose storage address is to be determined. If D less than D1, then the left branch L of the tree is taken and if D1 less than D the right branch R is taken. The data structure (or the data values in the memory device) isxe2x80x94logicallyxe2x80x94organized in a way, such that the node X2 will have a data value D2 less than D1 and that the data value D3 greater than D1. However, since the individual data values D1, D2, D3 are intrinsically stored randomly in the memory device, it is necessary for the binary tree search thatxe2x80x94after having read out D1 at A(X1),B(X1)xe2x80x94information is provided as to the location or memory address of D2 or D3.
Therefore, conventionally one solution is that each data value D1, D2, D3 has associated (stored) two pointers, i.e. two further memory positions including addresses indicating the location of the data value which is greater (R) or smaller (L) than the read out data value. Whilst logically there is used the binary tree structure, wherein each further sub-level the stored data value will be respectively greater and smaller than in the previous sub-level (subtree) in the memory device itself as shown in FIG. 5, there is only a logical mapping using the address pointers whilst in principle the data values are stored randomly in the memory device.
Using the data entries with different address pointers requires the pre-definition of the route node RN which is the first node from which a data value should be read out and compared with the data value to be searched for. Such a search algorithm is described in xe2x80x9cData Structures and Algorithmxe2x80x9d, Aho Hopcroft, Ullmann; ISDN0-201-00023-7, pages 155 ff. Obviously, there is a disadvantage that the memory space that is needed for storing a great variety of data values requires that with each and every data value two further pointer entries are stored indicating the branches L, R.
Whilst in FIG. 5 there is only an implicit representation of the logical binary tree in the memory device, another solution is that a mapping of the binary tree data structure to fixed addresses in the memory device, i.e. to the elements of the matrix array, is used. In this case the branching addresses are known beforehand through the mapping of the binary tree nodes to pre-defined locations in the memory device and therefore pointers are not necessary here and occupy no memory space. Since the pointers do not have to be evaluated, the search time may be higher, as is disclosed in xe2x80x9cData Structures and Algorithmsxe2x80x9d; Aho, Hoppcraft, Ullmann; ISDN-0-201-00023-7; pp. 271 ff. Using such an explicit mapping of the individual nodes X1, X2, X3 to specific locations in the memory device, the address of the respective xe2x80x9cchildxe2x80x9d nodes X2, X3 within one subtree (!) can be calculated. In fact, representing the left and right branch by the column address A, the left branch is calculated as
A(L(X))=2A(X)+0
whilst the right branch R is calculated as
A(R(X))=2A(X)+1.
However, using such a mapping means that after a short initial traverse of the tree the most significant partition of the address changes in every further step of the traverse. Therefore, the search time may still be considerable for large trees that have to be stored in dynamic random access memory means due to their sizes. For real applications faster methods are required that in particular avoid the changes of the most significant partition of the address as much as possible.
As was explained above, the conventional methods either use the storage intensive pointer scheme or use a calculation of the column address A for deciding on the address of the next node connected to a left branch or right branch in a subtree. This is time-consuming and requires that the most significant part of the address (the row address B) changes in every further step of the traversals to the binary tree).
Therefore, the object of the present invention is to provide a method and an access means for determining the storage address of a pre-determined data value in a memory device, which do not require the usage of pointers for determining storage locations during the binary tree search and which can determine the storage address faster than the above described method where one part of the storage address is calculated instead of using a pointer.
This object is solved by a method according to claim 1 or 25. Furthermore, this object is solved by an access means according to claim 13 or 28.
Essentially, according to the invention it was realized that there is no necessity to perform the frequent changes in the most significant partition of the address but that an algorithm can be provided where not only the column part of the address but also the row part of the address are determined on the basis of the comparison result and the current address. In the present invention, the column address part of the mapping corresponds to the mapping described in the prior art while the row part of the mapping and the important combination of the address from the two parts allows the decrease of the search time until a data value is located in the memory.
According to the invention, when a leaf in a subtree is reached during the search, a new root node of another subtree is determined. If in this subtree the data value is still not found, the search procedure goes to the next leaf tree in the previous subtree and determines a new root node of another subtree. The leaves of the previous subtree are thus sequentially used in order to determine a new subtree. Since a search always means a traverse of the tree from the top to the bottom the search will mainly result in traverses into subtrees. Just a minimum number of steps will result in a change of subtrees. As a result of the mapping, only the column address changes during a traverse inside a subtree. Hence, the number of steps that result in a change of row address is minimized, i.e. the changes in the row address are kept to a minimum. Therefore, the time for a search will be reduced considerably.
Further advantageous embodiments and improvements of the invention may be taken from the dependent claims. Hereinafter, the invention will be described with reference to its embodiments and with reference to the attached drawings.