1. Background and Relevant Art
Computer systems and related technology affect many aspects of society. Indeed, the computer system's ability to process information has transformed the way we live and work. Computer systems now commonly perform a host of tasks (e.g., word processing, scheduling, accounting, etc.) that prior to the advent of the computer system were performed manually. More recently, computer systems have been coupled to one another and to other electronic devices to form both wired and wireless computer networks over which the computer systems and other electronic devices can transfer electronic data. Accordingly, the performance of many computing tasks is distributed across a number of different computer systems and/or a number of different computing environments.
In some environments, queries are issued against a corpus of data to facilitate targeted information retrieval from the corpus of data. A user (or even a computer system) formulates a query using constructs from a query language. A query language typically includes a number of constructs that can be grouped into different combinations to express a logical intent for retrieving data. The query is issued to a data management system for processing. The data management system translates the query into a corresponding set of compatible physical operations (sometimes and hereinafter referred to as a “query plan”) for realizing the expressed logical intent. The query plan can then be executed to retrieve data from the corpus of data in accordance with the expressed logical intent. Retrieved data can be returned to the query issuer.
For example, SQL can be used to formulate a query for retrieving data from a relational database. The query is issued to a database management system that controls access to the relational database. The database management system translates the query into a query plan. The query plan is then executed to retrieve data from the relational database. The retrieved database data can be returned to the query issuer.
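The flow from query to query plan to retrieved data can be illustrated with a minimal sketch. It assumes SQLite, accessed through Python's standard `sqlite3` module, as a stand-in for the database management system. The `orders` table, its columns, and the helper `run_query_with_plan` are hypothetical names chosen only for illustration.

```python
import sqlite3


def run_query_with_plan(conn, sql, params=()):
    """Return (plan_steps, rows) for a SQL query (hypothetical helper)."""
    # Ask the database management system how it would execute the query.
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each plan row holds the human-readable step.
    plan_steps = [row[3] for row in plan]
    # Execute the query itself to retrieve the data.
    rows = conn.execute(sql, params).fetchall()
    return plan_steps, rows


# Illustrative in-memory relational database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 25.5), ("alice", 4.5)],
)

# The query expresses a logical intent: total spend for one customer.
plan_steps, rows = run_query_with_plan(
    conn,
    "SELECT customer, SUM(total) FROM orders WHERE customer = ? GROUP BY customer",
    ("alice",),
)

for step in plan_steps:
    print("plan step:", step)
print("rows:", rows)
```

The exact wording of the plan steps varies by SQLite version, so the sketch prints them rather than relying on a fixed form.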
Some database systems are standalone (or single node) database systems, in which all data and optimization data are physically stored at the same machine. Other database systems are parallel database systems. In a parallel database system, database storage is spread across a number of compute nodes. Each compute node stores one or more portions of a database locally. Other modules (e.g., at a control node) abstract the distributed nature of the database from users such that it appears as a single unified database. As such, in a parallel database system, data relevant to a query as well as data used for query plan optimization can be spread out across a number of different nodes.
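The arrangement of compute nodes behind a control node can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of any particular product: the class names `ComputeNode` and `ControlNode`, the modulo-based distribution on an integer `id` column, and the in-memory row lists are all hypothetical.

```python
class ComputeNode:
    """Holds one portion of the database locally (hypothetical)."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def scan(self, predicate):
        # Evaluate the predicate only against locally stored rows.
        return [row for row in self.rows if predicate(row)]


class ControlNode:
    """Presents the compute nodes as a single unified database (hypothetical)."""

    def __init__(self, node_count):
        self.nodes = [ComputeNode() for _ in range(node_count)]

    def insert(self, row):
        # Assumed distribution scheme: integer id modulo the node count.
        self.nodes[row["id"] % len(self.nodes)].insert(row)

    def query(self, predicate):
        # Fan the request out to every compute node, then merge the results.
        results = []
        for node in self.nodes:
            results.extend(node.scan(predicate))
        return sorted(results, key=lambda row: row["id"])


db = ControlNode(node_count=3)
for i in range(1, 7):
    db.insert({"id": i, "total": i * 10})

# The caller sees one database; the distribution across nodes is hidden.
matches = db.query(lambda row: row["total"] >= 40)
print([row["id"] for row in matches])
print([len(node.rows) for node in db.nodes])
```

The caller interacts only with the control node, which mirrors the abstraction described above: rows are stored on different nodes, but the query result is returned as if from a single database.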
Supporting the execution of batched and stored procedures against a parallel database presents a number of difficulties and/or inefficiencies. One difficulty is preserving equivalent single system behavior within a parallel database execution environment. The same behavior can be implemented at each compute node in a parallel database. However, implementing the same behavior at each compute node can result in duplicated effort and the performance of redundant operations.