Database management systems (DBMSS) are used to organize and manage large amounts of information. The data stored in databases is normally structured into records with predetermined fields. These fields identify the information in the records, and are normally organized into tables having rows and columns such that a query may be executed by a DBMS and the DBMS may generate a query response having query-satisfying information retrieved from the row(s) and column(s) associated with the tables.
A DBMS is an executable program stored on a data processing system. As is known to those skilled in the art, such a data processing system may include different hardware and software combinations. Users may access tabled information stored within a database which is operatively coupled to the DBMS by way of a user interface using, for example, a structured query language (SQL) or an XQuery and the like.
A given query may be parsed and compiled by a compiler contained within the DBMS, and as a result of compiling the given query the DBMS generates executable code which may be used for retrieving query-satisfying data which may satisfy the requirements of the given query. Typically, the DBMS may include a user command processor for processing user commands, such as queries, and executing such user commands against the database. Data processing system usable medium may contain executable code for directing the DBMS to perform algorithms related to operations of the DBMS. The data processing system usable medium may also store the database.
One way to manipulate and access a data collection stored within the database is to use a query, such as an SQL query. SQL queries may be of varying structure and complexity and may include many operators including operators that create, modify or delete data in the database.
With SQL queries, a UNION ALL operator specifies which data is to be retrieved from multiple independent sub-queries and presents a consistent set of columns and data-types to a parent operation of the SQL query. One usage of the UNION ALL operator is to combine together a large data set that has been divided into multiple smaller tables for a number of reasons including: limitations in the amount of data that can be stored in a single table; enhancing performance through reduced lock contention, disk performance, index utility and others; combining results from statistical calculations; and others. The UNION ALL operator may be used in the query to allow the parent operation to operate on a singular data object (i.e., a base table) even though data from multiple sub-queries and data objects may be retrieved. It is noted that the literature pertaining to UNION ALL operators and UNION operators identifies these operators using upper case, and it is understood that these operators may also be referred to in the lower case format.
Operators in a query language typically have one or more sources of input. When the UNION ALL operator is present in a query, it is typically one of the inputs to at least one valid operator in the query language. We refer to each of those operators which receive input from the UNION ALL operator as the parent operation. The specifics of the query language will determine what operators are valid as parent operators and what operation the parent operators will perform.
Data modifying operations, such as Update, Delete and Insert operations, require special processing when they are parent operations of a UNION ALL because the result of the data modification must be reflected in an actual data objects of the database. This requires the data modifying operator to have knowledge of the underlying structure of the UNION ALL operator and may also place restrictions on the UNION ALL operator and its sub-queries so that the data modifying operation can occur successfully. We refer to a UNION ALL operator that satisfies the requirements of an Update, Delete or Insert parent operator respectively as an Updatable, Deletable or Insertable UNION ALL operator.
A known method of directing the DBMS to process the UNION ALL operator contained in the SQL query is to process each of their sub-queries to produce their individual query results, combine those query results in a manner dictated by the UNION ALL operator, and finally flow the combined result to the parent operation. Data modifying operations must operate on data in actual data objects and determine for each tuple that is flowed from the UNION ALL operator which data object is affected. This solution is problematic because of the requirement for the data modifying operations to determine which data object it needs to operate on for every tuple.
Another known method of processing such operators is implemented outside of the DBMS by a controlling application in which the controlling application determines which data objects need to be operated thereon and instructing the DBMS on which data objects to access and/or modify and in which manner. This approach requires a great deal of complexity in the controlling application which makes the controlling application more difficult to develop and maintain. It also defeats the purpose of the UNION ALL operator, which hides the underlying structure from the controlling application, since the controlling application must now know the specification of the database design associated with the database.
There is a need for a data processing system, a data processing system implemented method and an article of manufacture for executing a query having a UNION ALL operator and data modifying operations.