The present invention relates to retrieving desired information from a data structure that stores the information. More specifically, the invention relates to searching for best matches in tables of information utilizing at least one tree.
Computer systems are great tools for storing vast amounts of data. One way that the data is commonly stored is to utilize tables where each row represents a related subset of data. Oftentimes, the tables form the basic data structure of a relational database management system (RDBMS).
Some types of searches for data within a table are fairly straightforward. For example, if a table stores the names and addresses of customers and one wanted to determine the address of a customer named xe2x80x9cJohn Smith,xe2x80x9d the user could specify that he or she is interested in the address fields of the row that included a name field of xe2x80x9cJohn Smith.xe2x80x9d The computer system could search the rows in the table until the row with the desired name field was found and return the desired address fields. There are many known techniques that allow these simple searches to be performed very accurately and efficiently.
It should be obvious that not all user queries are as simple as described above. For example, a user could be searching for a row that is the best match for a first name field of xe2x80x9cJohnxe2x80x9d and a last name field xe2x80x9cSmith.xe2x80x9d Furthermore, one of the fields may take precedence over the other field (or fields) in the case of a tie. Thus, if there is no row that has a first name field of xe2x80x9cJohnxe2x80x9d and last name field of xe2x80x9cSmith,xe2x80x9d the computer system could return a best match of a customer named xe2x80x9cJack Smith.xe2x80x9d
One traditional approach to implementing a search as described above has been to utilize the construction of perfect hash functions. Although perfect hash functions can be fairly efficient, they are computationally expensive to compute. The expense of perfect hash functions is further increased when the system allows dynamic modification of the data because the dynamic modification of the data generally necessitates recomputation of the perfect hash functions.
What is needed, therefore, are innovative techniques for searching for best matches in tables of information. Additionally, it would be beneficial to provide the searching performance for perfect hash functions, while still allowing the dynamic modification of the data.
The present invention provides techniques for searching for best matches in tables of information. More specifically, an embodiment of the invention stores information from the rows of the table in a tree according to values in a field of the table. When a search is performed, the tree is traversed and information that is encountered is utilized to determine the best match in the table. With the invention, the searching performance can be highly efficient and yet allow for the dynamic modification of the data. Some specific embodiments of the invention are described below.
In one embodiment, the invention provides a computer implemented method of searching for best matches in a table. The table stores information in multiple rows where each row includes multiple fields. In performing a search, a first tree that stores information from the rows of the table according to a first value for a first field is traversed. Information that is encountered when traversing the first tree is saved in a second tree according to data in a second field. The second tree is then traversed according to a second value for the second field to determine the best match for the first and second values. In other embodiments, the search for best matches in the table can be based upon more than two values utilizing more than two trees.
In another embodiment, the invention provides a computer implemented method of searching for best matches in a table. First and second values are received where the first value indicates a desired value for a first field in the table and the second value indicates a desired value for a second field in the table. A first tree is traversed by the first value, wherein the first tree stores at each node information from the rows in the table according to data in the first field. As each node is traversed, the information that is encountered is saved to a second tree according to data in the second field. The second tree is then traversed according to the second value to determine the best match for the first and second values.
Other features and advantages of the invention will become readily apparent upon review of the following description in association with the accompanying drawings.