1. Field of the Invention
The present invention relates generally to data processing environments and, more particularly, to a database system with methodology for parallel schedule generation in a query optimizer.
2. Description of the Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about the underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of the underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of database management systems is well known in the art. See e.g., Date, C., “An Introduction to Database Systems, Seventh Edition”, Part I (especially Chapters 1-4), Addison Wesley, 2000.
One purpose of a database system is to answer queries requesting information from the database. A query may be defined as a logical expression over the data and the data relationships set forth in the database, and results in the identification of a subset of the database. Consider, for instance, the execution of a request for information from a relational DBMS. In operation, this request is typically issued by a client system as one or more Structured Query Language or “SQL” queries for retrieving particular data (e.g., a list of all employees earning more than $25,000) from database tables on a server. In response to this request, the database system typically returns the names of those employees earning $25,000, where “employees” is a table defined to include information about employees of a particular organization. The syntax of SQL is well documented, see e.g., “Information Technology—Database languages—SQL”, published by the American National Standards Institute as American National Standard ANSI/ISO/IEC 9075: 1992, the disclosure of which is hereby incorporated by reference.
SQL queries express what results are requested but do not state how the results should be obtained. In other words, the query itself does not tell how the query should be evaluated by the DBMS. Rather, a component of the DBMS called the optimizer determines the “plan” or the best method of accessing the data to implement the SQL query. The query optimizer is responsible for transforming an SQL request into an access plan composed of specific implementations of the algebraic operator selection, projection, join, and so forth. The role of a query optimizer in a relational DBMS system is to find an adequate execution plan from a search space of many semantically equivalent alternatives.
Relational database queries are broadly classified into simple transactional queries found in online transaction processing (OLTP) environments, and complex queries found in operational decision support system (DSS) environments. Although existing database systems are in wide use in DSS applications and in OLTP applications, there is a growing user demand for supporting both types of queries in a single system. Users need a solution capable of handling complex queries and also having the ability to process large data sets for both local and distributed systems. They are looking for a robust database server system platform for running mixed workload applications that demand superior performance for queries from both OLTP and DSS domains, sometimes across distributed and heterogeneous database servers. This environment is referred to as an operational decision support system (operational DSS), since it allows running complex queries as well as performing regular OLTP processing.
What is needed is an improved query optimizer that helps meet the demands placed on the database system in this type of environment which involves running complex queries as well as regular OLTP processing. Since it is not always possible to predict the type of workload, it is important to support both OLTP and DSS queries in the same configuration of the data processing system to efficiently support workloads of all types. One of the issues to be addressed in this type of processing environment is the ability to predict resource usage and adjust the query plan accordingly. The present invention provides a solution for these and other needs.