Databases are used to store information for innumerable applications, including various commercial, industrial, technical, scientific and educational applications. As the reliance on information increases, both the volume of information stored in most databases and the number of users wishing to access that information likewise increase. As the volume of information in a database and the number of users wishing to access the database increase, the amount of computing resources required to manage such a database increases as well.
Database management systems (DBMSs), which are the computer programs used to access the information stored in databases, therefore often require tremendous resources to handle the heavy workloads placed on such systems. As such, significant resources have been devoted to increasing the performance of database management systems with respect to processing searches, or queries, to databases.
Improvements to both computer hardware and software have increased the capacities of conventional database management systems. For example, in the hardware realm, increases in microprocessor performance, coupled with improved memory management systems, have increased the number of queries that a particular microprocessor can perform in a given unit of time. Furthermore, the use of multiple microprocessors and/or multiple networked computers has further increased the capacities of many database management systems.
From a software standpoint, the use of relational databases, which organize information into formally-defined tables, and which are typically accessed using a standardized language such as Structured Query Language (SQL), has substantially improved processing efficiency, as well as substantially simplified the creation, organization, and extension of information within a database. Furthermore, significant development efforts have been directed toward query “optimization”, whereby the execution of particular searches, or queries, is optimized in an automated manner to minimize the amount of resources required to execute each query. In addition, a reduced reliance on runtime interpretation of queries in favor of increased usage of directly-executable program code has improved query engine performance.
Through the incorporation of various hardware and software improvements, many high performance database management systems are able to handle hundreds or even thousands of queries each second, even on databases containing millions or billions of records. However, further increases in information volume and workload are inevitable, so continued advancements in database management systems are still required.
To assist in reducing the number of times a database is accessed, it is occasionally beneficial to perform further queries on the results of a previous query to refine those results. Querying of query results is the ability to re-query an existing record set. For example, an application may have an interface that shows all the users in a particular system. A query similar to “SELECT * FROM users ORDER BY lastName” may be used to generate a set of results, e.g., a set of all users ordered by last name, which may then be output for use on a web page. The web page may list the alphabet across the top of this output so that a user could click on the letter “J”, for instance, to see everyone whose last name starts with “J”. The database would then need to be requeried with a query similar to “SELECT * FROM users WHERE lastName LIKE 'j%' ORDER BY lastName”.
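The two-query pattern described above can be sketched as follows. This is a minimal illustration only, assuming an in-memory SQLite database accessed through Python's standard sqlite3 module; the table layout and the sample rows are hypothetical and are not taken from any particular system.

```python
import sqlite3

# Hypothetical in-memory database standing in for the application's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (firstName TEXT, lastName TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("Ann", "Jones"), ("Bob", "Smith"), ("Cy", "Jackson"), ("Di", "Adams")],
)

# First query: every user, ordered by last name (the initial page).
all_users = conn.execute(
    "SELECT * FROM users ORDER BY lastName"
).fetchall()

# Second query: sent to the database again when the user clicks "J".
# SQLite's LIKE is case-insensitive for ASCII by default, so 'j%' matches "Jones".
j_users = conn.execute(
    "SELECT * FROM users WHERE lastName LIKE 'j%' ORDER BY lastName"
).fetchall()

# Every row of the second result was already present in the first result.
second_is_subset = set(j_users) <= set(all_users)
```

In this sketch the second query is a second full round trip to the database, even though each row it returns was already delivered by the first query.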
A downside to using a method such as this is that two separate queries must be sent to the database, generating two complete result sets and consuming the full database resources required to execute two queries, even though the second query returns a subset of the result set that was already returned by the first query. The resources needed for a single request from a single user may not be a problem, but issuing these multiple queries on a site that receives thousands or millions of hits a day can be problematic, putting an undue burden on the system that must handle the extra queries and load.
One method of reducing the burden would be to query the results of the initial query rather than performing the second query on the database itself. Performing queries on query results has many benefits. If there is a need to access the same tables multiple times, the access time for modestly sized tables may be greatly reduced because the data may already be in memory, or cached and easily retrieved. Joins and unions may be performed on results from different data sources. For example, a union may be performed on queries from different databases to eliminate duplicates for a mailing list. Cached query results may be efficiently manipulated in different ways. A database may be queried once and the results of the query then used to generate several different summary tables. For example, if there is a need to summarize the total salary by department, by skill, and by job, one query of the database may be run and the query results used in three separate queries to generate the summaries. Alternatively, the results may be used for drill-down, master-detail-like functionality whereby, because the results are already available, it is not necessary to go to the database for the details. For example, information about departments and employees may be selected in a query and the results cached. The employee names may then be displayed in an application. When a user selects an employee, the application displays the employee details by selecting information from the cached query results without accessing the database.
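The salary-summary and drill-down examples above can be sketched as follows. This is an illustrative sketch under stated assumptions, not a description of any particular product: the employees table, its columns, the sample rows, and the helper names total_salary_by and details_for are all hypothetical, and the cache is simply a list of dictionaries held in memory.

```python
import sqlite3
from collections import defaultdict

# Hypothetical employees table in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be read by column name
conn.execute(
    "CREATE TABLE employees "
    "(name TEXT, department TEXT, skill TEXT, job TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    [
        ("Jones", "Sales", "Java", "Engineer", 100),
        ("Smith", "Sales", "SQL", "Analyst", 80),
        ("Jackson", "Support", "SQL", "Engineer", 90),
    ],
)

# One database round trip; the rows are copied into an in-memory cache.
cached = [dict(row) for row in conn.execute("SELECT * FROM employees")]
conn.close()  # everything below runs against the cache only


def total_salary_by(rows, column):
    """Sum the salary field of the cached rows, grouped by the given column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[column]] += row["salary"]
    return dict(totals)


def details_for(rows, name):
    """Drill-down: return the cached row for one employee, or None."""
    return next((row for row in rows if row["name"] == name), None)


# Three summaries produced from a single cached result set.
by_department = total_salary_by(cached, "department")
by_skill = total_salary_by(cached, "skill")
by_job = total_salary_by(cached, "job")

# Master-detail lookup served from the cache, not the database.
jackson = details_for(cached, "Jackson")
```

Closing the connection before the summaries are computed makes the point of the sketch explicit: the three summaries and the detail lookup are all answered from the cached rows.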
In recent years, the object-oriented paradigm has been applied to database technology, using a programming model known as object databases. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead of converting information between its representation in the database (for example, as rows in tables) and its representation in the application program (typically as objects). Object Query Language (OQL) is a query language, modelled after SQL, that is generally regarded as a standard for object-oriented databases. This language allows queries to be performed on objects, generating similar sets of query results, which can then be used in applications as described above. However, while relational databases have included the capability to push query results into temporary tables, object systems such as Container Managed Persistence (CMP) and the Java Persistence API (JPA), which are part of the Enterprise JavaBeans® Specification, and non-object systems such as JDBC allow data to be queried but treat the results of the query as second-class. In other words, the results are not queryable in the same way as the original data. Unlike temporary tables created from SQL-type queries, results objects generated from querying the object database do not contain a full description of the data. The results objects lack the descriptors and key values (metadata) of the original data that are necessary for performing a query. In order to perform a query on the results objects, additional APIs or additional configuration must be included, generally through additional programming.
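The role that descriptors play in making a result set queryable can be illustrated with a small sketch. It uses Python's standard sqlite3 module purely as a stand-in, not CMP, JPA, or JDBC, and the helper where_equals is hypothetical. The fetched rows are bare tuples; filtering them by field name only becomes possible once the column names are carried alongside them, which here requires extra code written by the programmer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (firstName TEXT, lastName TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("Ann", "Jones"), ("Bob", "Smith")],
)

cursor = conn.execute(
    "SELECT firstName, lastName FROM users ORDER BY lastName"
)

# The rows themselves are plain tuples with no field names attached.
bare_rows = cursor.fetchall()

# The descriptors live separately, on the cursor; each entry's first
# element is the column name.
column_names = [entry[0] for entry in cursor.description]


def where_equals(rows, names, field, value):
    """Hypothetical helper: filter bare rows by a named field.

    The field name can only be resolved because the column names
    (the metadata) are passed in alongside the rows.
    """
    index = names.index(field)
    return [row for row in rows if row[index] == value]


smiths = where_equals(bare_rows, column_names, "lastName", "Smith")
```

Without column_names, the tuples in bare_rows give no way to resolve "lastName" to a position, which is a small-scale analogue of the additional programming the paragraph above refers to.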
Therefore, there is a need in the art to be able to access and query object query results in a first-class manner, using the same mechanisms used for the original query.