1. Field of the Invention
This invention relates to a database management system for processing SQL queries for relational databases, and more specifically to compiling the queries and processing the queries at run-time where the queries reference a table entity that is unknown at compile time.
2. Description of the Related Art
Databases are computerized information storage and retrieval systems. A relational database management system (RDBMS) is a database management system (DBMS) which uses relational techniques for storing and retrieving data. Relational databases are organized into tables which consist of rows and columns of data. The rows are formally called tuples. A database will typically have many tables and each table will typically have multiple columns. The tables are typically stored on random access storage devices (DASD) such as magnetic or optical disk drives for semi-permanent storage.
A DBMS is structured to accept commands to store, retrieve, and delete data. One widely used and well known set of commands is called the Structured Query Language (SQL). The current SQL standard is known informally as SQL92. A new proposed standard is referred to as SQL3.
An SQL query can declaratively specify the contents of a view. For relational databases, a view is essentially a virtual table having virtual rows and virtual columns of data. Although views are not directly mapped to real data in storage, views can be used for retrieval as if the data they represent is actually stored. A view can be used to present to a user a single logical view of information that is actually spread across multiple tables.
Tables are at the center of a relational database engine; and a major goal of the underlying query compiler is to provide a suite of mechanisms so that user data can be easily stored and efficiently manipulated. Traditionally, two types of tables, base table and derived table, are supported in an SQL-based database server. A base table can be either user-defined for storing user data or system-defined for storing information about objects created in user databases. User-defined base tables are created explicitly by means of the create table statement, while system-defined base tables, usually referred to as catalogs, are implicitly created at the time when a user database is created. In contrast to base tables, derived tables are defined in terms of existing base tables and other derived tables. They can be defined implicitly by the system or explicitly by users. Implicitly defined derived tables are created during the execution of table operations to store intermediate results, and explicitly defined derived tables are specified by users through the use of the create view statement.
Regardless of the type and creator, tables are manipulated uniformly by means of the Data Manipulation Language (DML), such as those defined in the SQL standards. (See, (ISO-ANSI Working Draft) Database Language (SQL2); Jim Melton, editor; International Organization for Standardization and American National Standards Institute, 1992. See also, ISO/IEC JTC1/SC21 N10489, Committee Draft (CD) Database Language SQL--Part 2: SQL/Foundation; Jim Melton, editor, July 1996.)
With the rapid growth of database applications, the concept of tables has been generalized in database languages. In the upcoming SQL3 standard (see, ISO/IEC JTC1/SC21 N10489, Committee Draft (CD) Database Language SQL--Part 2: SQL/Foundation; Jim Melton, editor, July 1996), tables can be defined by means of a set-valued function written in SQL/PSM or in a host language such as C, C++, or Visual Basic. This concept is referred to as a "table function". It not only provides a more general way to compose new tables from existing tables (than through the use of view definitions), but it also allows accesses to external data using the same query mechanisms. (However, it should be noted that the usage of external table functions is restricted so that data integrity can be guaranteed.)
As such, the SQL92 standard and the upcoming SQL3 standard have defined various kinds of new table concepts that include: table functions, user-defined temporary tables, and table parameters and table result of table functions. These new table concepts create a new scenario of table usage. The actual table entity referenced by a table reference can not be known at compile-time. Any such table that is unknown at compile time that is being referenced is referred to herein as a dynamic table. Such a new table concept engenders a new challenge to both the existing query compiler for the treatment of un-resolved table references and to the existing run-time environment for the resolution of such references.
Such a problem is similar to external name references that already exist today in most programming languages. For example, dynamic binding (linking) exists today in programming languages. For example, if a program is written in C or Fortran, a programmer can define an object in one compiled module and use that object in a different compiled module where the two modules are compiled separately. At run-time there will be a binding process to link the two together.
However, traditionally in the past in the SQL language, there was no external reference to other tables defined outside of the compilation unit. In application programs, such as when the C language is used as the host program, there are individual SQL statements guarded with key words (e.g. EXEC SQL). A preprocessor goes through the program and picks up each of these statements. For each SQL statement, the preprocessor sends it to the SQL compiler. Each statement is compiled separately and is self-contained. As a result, a plan is generated which is evaluated by the interpreter at run-time to get the right results. In the past, the way a SQL program has been used has been very restrictive because it has been processed one statement at a time.
Recently there have been discussions to add more items into the SQL language to make it more powerful. These include extending the SQL language with procedural statements (e.g., IF, THEN, ELSE, GO TO, a sequence of statements, procedure, function, etc.). The SQL standards committee is also introducing more constructs for defining and using tables. Consequently, one statement can define or create a table which is accessed by another statement that is separately compiled. In essence, the problem is that there can be two SQL statements that are separately compiled that have to communicate through tables. Any such implementation to resolve this problem is different from dynamically linking objects that are separately compiled in modules written in host languages because in the problem at hand the object is not a variable or a function, it is a table. Such a new way of table references can not be handled simply and efficiently by the present level of functionality in both the existing query compiler for dealing with table references and the run-time environment for resolving them.