The present invention relates generally to systems and methods for predicting the success of queries in information systems comprised of objects and relations between the objects. More particularly, the present invention relates to systems and methods for predicting whether an instance relates to an object without performing an actual query.
As information systems, and especially database systems, grow larger and larger (e.g., into the terabyte range), so does the cost of querying the databases efficiently. It is not uncommon for a user to hit a database with a complex query only to get "no matching records found" after long minutes of waiting. These empty queries consume valuable server resources without producing any useful results.
As the web gains in popularity, the number of users that are allowed to concurrently access or hit such information systems has increased dramatically. Some web sites receive millions of hits per day. It has become increasingly important to be able to detect empty queries and filter them out before they consume valuable resources.
Predicting whether a query will produce no records raises several problems. One is how to know in advance whether an instance is related to an object (in other words, whether any instances of that object relate to the original instance), and how to do so without accessing the information system or database at run-time. Another is how to list all the objects to which a particular instance relates.
Some databases currently known in the art support some kind of query cost analysis and prediction. Based on table, index and join-index sizes, the database is able to estimate the time needed to run the query. A smart client will abort queries that will take too long. Such analysis gives the user the choice to abort a query based on its cost, whereas the present invention enables the user to abort a query based on its predicted result.
Many databases also keep instance-to-instance index tables. If two tables are related through a foreign-key/primary-key relation, the database typically will keep a B-tree index whose key is the foreign-key and whose leaves contain a number pointing to the index file of the primary-key. This permits the database to quickly find all the primary-keys to which specific foreign-keys relate. However, a problem with these B-tree indexes is that they are designed to answer a query, not to predict the query result before the query is run. In addition, these tables typically are kept for immediately neighboring objects (i.e., where a direct relation exists).
U.S. Pat. No. 5,848,424 issued on Dec. 8, 1998 to Scheinkman et al., which is incorporated by reference herein for all purposes, discloses a data navigation interface with navigation as a function of draggable elements and drop targets. The interface is based on a drag-and-drop paradigm, whereby the user may drag a draggable element and drop it over a drop target element to create a query. The system makes it possible for the user to generate easily arbitrary ad-hoc queries that are not necessarily foreseen at the time the database is created. It is based on a repository or matrix where object-to-object relations are stored; each entry in the matrix is representative of a type of relation between two classes of objects, one class corresponding to the column of the entry, while the other class corresponds to the line of the entry. The presence of an entry in the matrix, that is the presence of a bit at the crossing of a line and a column of the matrix, is representative of a relation from an object to another object. Even if an object-to-object relation exists, it does not, however, guarantee that an instance of the first object exists that relates to the second object, let alone determine whether specific instances exist. In fact, both objects may be without instances at all, yet the repository will show a relation between them.
Such systems are embodied in the Hyper-Relational Server owned and invented by TopTier Software of San Jose, Calif. With a TopTier Hyper-Relational Server, contrary to systems based on the web hypertext metaphor, a user can generate arbitrary, ad-hoc queries. This system provides a solution to the need for enabling a user to easily generate arbitrary queries; it does not, however, provide a solution to the problems listed above, notably the problem of predicting the results of a query.
Therefore, what is needed is a system and method for predicting whether a query of an information system will result in an empty set, without having to actually run the query.
The present invention relates to methods and apparatus for generating an instance-to-object bitmap and using the instance-to-object bitmap to predict whether a query will produce a result. More particularly, in an information system comprising a database having objects and relations, the present invention provides a method for predicting whether a query will produce a result. The method comprises providing an instance-to-object bitmap which indicates whether instances of objects are related to other objects in a database, and accessing the bitmap to determine if the query will produce a result.
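By way of illustration only, the lookup described above can be sketched as follows. The class name, the method names, and the encoding of an instance as an (object name, identifier) pair are assumptions made for this sketch and are not taken from the specification. Each instance owns one row of bits, one bit per object in the system, and a set bit means that at least one instance of that object relates to the instance.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical encoding: an instance is (object name, instance identifier).
Instance = Tuple[str, int]


class InstanceToObjectBitmap:
    """Illustrative bitmap: one row of bits per instance, one bit per object."""

    def __init__(self, objects: Iterable[str]) -> None:
        # Assign each object a fixed bit position (column).
        self._column = {name: i for i, name in enumerate(objects)}
        # Each row is stored as a Python int used as a bit field.
        self._rows: Dict[Instance, int] = {}

    def set_related(self, instance: Instance, target_object: str) -> None:
        """Record that `instance` relates to some instance of `target_object`."""
        bit = 1 << self._column[target_object]
        self._rows[instance] = self._rows.get(instance, 0) | bit

    def predicts_result(self, instance: Instance, target_object: str) -> bool:
        """Return True if a query from `instance` to `target_object` is
        predicted to be non-empty. No database access is performed."""
        row = self._rows.get(instance, 0)
        return bool((row >> self._column[target_object]) & 1)


bitmap = InstanceToObjectBitmap(["Customer", "Order", "Invoice"])
bitmap.set_related(("Customer", 1), "Order")

# Customer 1 has at least one order, so the query may be run.
print(bitmap.predicts_result(("Customer", 1), "Order"))
# Customer 1 has no invoices, so the query can be filtered out up front.
print(bitmap.predicts_result(("Customer", 1), "Invoice"))
```

In this sketch an instance that has no row at all is treated as related to nothing, so a query originating from it is predicted to be empty.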
The instance-to-object bitmap may be generated off-line by computing paths from instances to neighboring objects by determining a path from an instance in an object to an instance in a neighboring object. Then, paths from instances to non-neighboring objects may be generated by merging a path from an instance in a first object to an instance in a second object with a computed path from said instance in said second object to said non-neighboring object. This can be repeated until paths from instances to remote objects are determined. In accordance with one embodiment of the invention, the lengths of the paths from instances to remote objects may be limited to a predetermined length. For example, a maximum path length of 5 may be used.
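The off-line generation described above might be sketched as follows, under stated assumptions. The function name, the representation of direct relations as (source instance, destination instance) pairs, and the treatment of relations as directed are all hypothetical choices made for illustration; a bidirectional relation would simply be supplied in both directions. The result records, for each instance, the set of objects it can reach, which corresponds to the set bits of that instance's row.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical encoding: an instance is (object name, instance identifier).
Instance = Tuple[str, int]


def build_reachability(
    links: Iterable[Tuple[Instance, Instance]], max_len: int = 5
) -> Dict[Instance, Set[str]]:
    """Map each instance to the objects reachable within `max_len` hops."""
    neighbors: Dict[Instance, Set[Instance]] = defaultdict(set)
    for src, dst in links:
        neighbors[src].add(dst)

    # Step 1: paths of length 1, from each instance to neighboring objects.
    reach: Dict[Instance, Set[str]] = {
        src: {dst[0] for dst in dsts} for src, dsts in neighbors.items()
    }

    # Step 2: merge a direct path (src -> dst) with the paths already
    # computed from dst, extending the maximum path length by one per pass.
    for _ in range(max_len - 1):
        updated = {inst: set(objs) for inst, objs in reach.items()}
        for src, dsts in neighbors.items():
            for dst in dsts:
                updated[src] |= reach.get(dst, set())
        if updated == reach:
            break  # nothing new was found, so longer paths add nothing
        reach = updated
    return reach


links = [
    (("Customer", 1), ("Order", 10)),
    (("Order", 10), ("Invoice", 100)),
    (("Invoice", 100), ("Payment", 7)),
]
print(sorted(build_reachability(links, max_len=2)[("Customer", 1)]))
```

Each pass reads only the results of the previous pass, so after k passes every recorded path has at most k + 1 hops, which is how the predetermined maximum length is enforced in this sketch.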
In accordance with one embodiment of the present invention, the instance-to-object bitmap may be used with TopTier's Hyper-Relational Server to determine whether dragging a draggable element onto a drop target will produce a query result.
In accordance with another embodiment of the present invention, the instance-to-object bitmap can easily be used to create an object-to-object probability matrix that can be used to determine the likelihood of an arbitrary instance relating to another object. Thus, instead of using the usually larger bitmap to unequivocally predict whether a query will produce a result, the probability matrix can be used to estimate the chance that such a query will produce a result.
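One plausible way to derive such a probability matrix is sketched below. The specification does not state the formula at this point, so the approach shown is an assumption: each entry is the fraction of instances of a source object that relate to a target object. The function name and input shapes are likewise hypothetical. The per-object instance totals are passed in separately so that instances relating to nothing still count in the denominator.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Hypothetical encoding: an instance is (object name, instance identifier).
Instance = Tuple[str, int]


def probability_matrix(
    reach: Dict[Instance, Set[str]], instance_counts: Dict[str, int]
) -> Dict[Tuple[str, str], float]:
    """Estimate, per (source object, target object) pair, the fraction of
    source instances that relate to the target object."""
    hits: Dict[Tuple[str, str], int] = defaultdict(int)
    for (source_object, _instance_id), targets in reach.items():
        for target_object in targets:
            hits[(source_object, target_object)] += 1
    return {
        (src, dst): count / instance_counts[src]
        for (src, dst), count in hits.items()
    }


reach = {
    ("Customer", 1): {"Order"},
    ("Customer", 2): {"Order", "Invoice"},
    ("Customer", 3): set(),
}
# Four customers exist in total; the fourth relates to nothing.
matrix = probability_matrix(reach, {"Customer": 4})
print(matrix.get(("Customer", "Order"), 0.0))
print(matrix.get(("Customer", "Invoice"), 0.0))
```

A pair absent from the returned mapping corresponds to an estimated probability of zero. The matrix holds one entry per pair of objects rather than one row per instance, which is why it is typically much smaller than the bitmap, at the cost of yielding an estimate rather than a definite answer.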
A more complete understanding of the present invention may be derived by referring to the detailed description of preferred embodiments and claims when considered in connection with the figures, wherein like reference numbers refer to similar items throughout the figures.