The present invention relates generally to systems and methods for predicting the success of queries in information systems comprised of objects and relations between the objects. More particularly, the present invention relates to systems and methods for predicting whether an instance relates to an object without performing an actual query.
As information systems, and especially database systems, grow larger and larger (e.g., into the terabyte range), so does the cost of querying the databases efficiently. It is not uncommon for a user to hit a database with a complex query only to get “no matching records found” after long minutes of waiting. These empty queries consume valuable server resources without producing any useful results.
As the web gains in popularity, the number of users that are allowed to concurrently access or hit such information systems has increased dramatically. Some web sites receive millions of hits per day. It has become increasingly important to be able to detect empty queries and filter them out before they consume valuable resources.
Some of the problems associated with predicting whether a query will produce no records include determining how to know in advance whether an instance is related to an object (in other words, whether there are any instances of that object that relate to the original instance), and how to do this without accessing the information system or database at run-time. Another problem is how to list all the objects to which a particular instance relates.
Some databases currently known in the art support some kind of query cost analysis and prediction. Based on table, index and join-index sizes, the database is able to estimate the time needed to run the query. A smart client will abort queries that will take too long. Such analysis gives the user the choice to abort a query based on its cost, whereas the present invention enables the user to abort a query based on its predicted result.
Many databases also keep instance-to-instance index tables. If two tables are related through a foreign-key/primary-key relation, the database typically will keep a B-tree index, which has a key that is the foreign-key, and which includes leaves that contain a number pointing to the index file of the primary-key. This permits the database to quickly find all the primary-keys to which specific foreign-keys relate. However, a problem with these B-tree indexes is that they are designed to answer a query, not to predict the query result before the query is run. In addition, these tables typically are kept only for immediately neighboring objects (i.e., objects between which a direct relation exists).
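By way of illustration only, the following minimal Python sketch shows the kind of foreign-key index described above. A sorted list searched by bisection stands in for the B-tree, and the class name `ForeignKeyIndex`, the method `lookup`, and the sample key values are hypothetical names chosen for this illustration; they are not taken from any particular database product.

```python
import bisect


class ForeignKeyIndex:
    """Hypothetical stand-in for a B-tree index keyed on a foreign key.

    Each entry pairs a foreign-key value with a number pointing to a
    position in the primary-key index of the related table.
    """

    def __init__(self, pairs):
        # Keep entries sorted by foreign key, as a B-tree would.
        self._entries = sorted(pairs)
        self._keys = [foreign_key for foreign_key, _ in self._entries]

    def lookup(self, foreign_key):
        """Return every primary-key position related to foreign_key."""
        lo = bisect.bisect_left(self._keys, foreign_key)
        hi = bisect.bisect_right(self._keys, foreign_key)
        return [position for _, position in self._entries[lo:hi]]


# Assumed sample data: (foreign-key value, primary-key index position).
orders_by_customer = ForeignKeyIndex([(101, 0), (101, 3), (205, 1)])

matches = orders_by_customer.lookup(101)     # related rows exist
no_matches = orders_by_customer.lookup(999)  # empty result
```

In this sketch, the only way to learn that key 999 has no related rows is to perform the lookup itself, which corresponds to accessing the index at run-time. The sketch also covers a single direct relation only; relating instances across several intermediate tables would require chaining one such lookup per relation.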
U.S. Pat. No. 5,848,424 issued on Dec. 8, 1998 to Scheinkman et al., which is incorporated by reference herein for all purposes, discloses a data navigation interface with navigation as a function of draggable elements and drop targets. The interface is based on a drag-and-drop paradigm, whereby the user may drag a draggable element and drop it over a drop target element to create a query. The system makes it possible for the user to easily generate arbitrary ad-hoc queries that are not necessarily foreseen at the time the database is created. It is based on a repository or matrix where object-to-object relations are stored; each entry in the matrix is representative of a type of relation between two classes of objects, one class corresponding to the column of the entry, while the other class corresponds to the row of the entry. The presence of an entry in the matrix, that is, the presence of a bit at the crossing of a row and a column of the matrix, is representative of a relation from one object to another object. The existence of an object-to-object relation does not, however, guarantee that any instance of the first object relates to the second object, let alone determine whether specific instances are related. In fact, both objects may have no instances at all, yet the repository will still show a relation between them.
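The gap between an object-to-object relation and an instance-level relation may be illustrated with the following minimal Python sketch. It is not the implementation disclosed in the referenced patent; the class `RelationRepository`, its methods, and the object names `Customer` and `Order` are assumptions introduced solely for this example.

```python
class RelationRepository:
    """Hypothetical matrix of object-to-object (class-level) relations."""

    def __init__(self):
        # A set of (row_class, column_class) pairs plays the role of
        # the bits set at row/column crossings of the matrix.
        self._bits = set()

    def add_relation(self, row_class, column_class):
        self._bits.add((row_class, column_class))

    def has_relation(self, row_class, column_class):
        return (row_class, column_class) in self._bits


repository = RelationRepository()
repository.add_relation("Customer", "Order")

# Assumed instance store: neither object has any instances yet.
instances = {"Customer": [], "Order": []}

# The repository reports a relation between the two objects ...
class_level = repository.has_relation("Customer", "Order")

# ... although no instance exists on either side of that relation.
instance_level = bool(instances["Customer"]) and bool(instances["Order"])
```

Here `class_level` is true while `instance_level` is false, so a query built from the class-level relation alone would still return an empty set.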
Such systems are embodied in the Hyper-Relational Server owned and invented by TopTier Software of San Jose, Calif. With a TopTier Hyper-Relational Server, contrary to systems based on the web hypertext metaphor, a user can generate arbitrary, ad-hoc queries. Although this system addresses the need for enabling a user to easily generate arbitrary queries, it does not provide a solution to the problems listed above, notably the problem of predicting the results of a query.
Therefore, what is needed is a system and method for predicting whether a query of an information system will result in an empty set, without having to actually run the query.