A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever. The following notice shall apply to this document: Copyright (copyright)1998, Microsoft Corporation.
The present invention pertains generally to database technology, and more particularly to method and apparatus for facilitating database queries.
Database Management Systems (DBMS) such as Microsoft""s SQL Server provide for storing and managing information. The majority of applications that need to access data stored in such a system go through a data access layer such as the Microsoft Open Database Connectivity (ODBC) layer or the Microsoft OLE database (OLE DB). These layers provide many services that make writing applications that need access to information stored in databases much easier. These services include running database queries and fetching result sets without having to know the underlying protocols needed to communicate with the DBMS.
Most data access layers provide a way to pass database commands directly to the underlying DBMS. These commands are captured in a database language that is understood by the DBMS or can be readily translated using a DBMS driver. For example, ODBC uses SQL (Structured Query Language) as the language for issuing commands to the DBMS. Database query languages are comprehensive and cover operations from fetching simple data sets to backing/restoring of databases. Most database applications, however, use only a small subset of the database query language, specifically commands that fetch and update the data sets.
Referring to FIG. 1, there is shown a simplified diagram illustrating an example prior art configuration of a consumer application 10 which queries an underlying provider 12 such as a SQL database. Consumer application 10 may, for example, be a web interface allowing visitors to access data in provider 12. In the most common prior art configurations, consumer application 10 may include physical queries 16, such as coded data manipulation language (DML) queries, for example in SQL, that retrieve data from provider 12 in response to requests 18 for data entered by a user at the front end of the consumer application 10. In this manner, the user may be presented with a simplified front end interface with a set of standard data request options which allow access to data stored in provider 12 that may be of considerable complexity. Each data request option is typically implemented by a corresponding coded DML query. The user thus need not learn or understand how to create or write a DML request, as this is handled by the consumer application 10. This configuration works well when either or both the consumer application 10""s data request options and the structure or schema of data stored in provider 12 remain relatively static. When either or both of these elements are routinely modified or changed, it can be burdensome, or at least time consuming, to rewrite the coded DML queries to accommodate such changes. Furthermore, new versions of the DML interpreters (e.g. MS SQL Server) often have different performance characteristics than previous versions, requiring additional re-writing of DML to achieve or maintain good performance.
Writing applications that rely heavily on reading and writing information to a database back-end is not trivial. In such applications, consideration must be given to efficiently retrieving data from the DBMS, or at least efficiently transferring requests and data between the front-end consumer and the DBMS. For example, in a web site application it is desirable, for reasons of efficiency and speed, to keep to a minimum the number of round trips between the web server and a server maintaining the DBMS. Another challenge in implementing a DBMS accessed through a front-end consumer is maintenance of the front-end code. Such code must accommodate changes in the underlying structure or schema of the DBMS tables, columns and constraints, and changes to the data that is sought by the front-end consumer for display or delivery to a user. It is undesirable to rewrite front-end code each time such changes are needed.
An example of a DBMS accessed through a front-end consumer is a web site that provides customers with an interface to browse and buy books online, wherein information about the books is stored in a SQL server database. When a user asks to display information about a book the web server retrieves the information from the database, renders it using HTML, and sends it back to the user. This effectively means that just about every user request will result in a database query. Also, changing and improving such a web site usually means that the queries going to the SQL server have to change, resulting in code maintenance. For example, if a display of a book""s ISBN number is added to the web site, the query to the database must be changed to get the ISBN number in addition to the other information that had previously been retrieved. This may also mean that the underlying schema of the database might change. For example, if the ISBN number was not previously stored in the database it would have to be added, requiring a change to the underlying schema of the database. This change in the schema may require that any existing code for executing requests be updated or changed. Obviously, it is desirable if rewriting of the code required to query a DBMS can be kept to a minimum.
Therefore, there is a need for a system which can automatically generate queries based on a high level specification of the data required by the front-end consumer and from a high level description of the schema of the back end DBMS. One such system is described in the above referenced U.S. patent application Ser. No. 09/018,287, assigned to the same assignee as the present application, entitled xe2x80x9cAutomatic Generation of Database Queriesxe2x80x9d, and filed Feb. 4, 1998, (U.S. ""287) the entire disclosure of which is hereby incorporated herein. Generally, the system of U.S. ""287 allows a consumer application to submit xe2x80x9chigh-levelxe2x80x9d database queries to a xe2x80x9cschematizerxe2x80x9d program which in turn formulates xe2x80x9clower-levelxe2x80x9d queries for the DBMS, for example in the form of the SQL query language. This operation thus allows the schematizer to make changes in the underlying database transparent to the application, and simplifies the process of rewriting the application""s queries to the schematizer. U.S. ""287 is not admitted to be prior art to the present application.
According to one aspect of the invention there is provided method, apparatus and software for querying data stored on a computer readable medium, which includes creating a dictionary of conceptual information and physical information about the data, and storing the data on a computer readable data storage medium. Requests to access the data are in a form related to a conceptual organization of the data and are not specific to a physical organization of the data. Requests specific to the physical organization of the data are created to operate on the data, wherein these requests are formed by referencing the dictionary of conceptual and physical information about the data.
According to other aspects, the conceptual information is obtained from an object-relational-model of the data, and the physical information indicates how the data is organized on the data storage medium. Requests are written in a conceptual query language (CQL) which substantially uses terms belonging to or derived from a natural language.