Fundamentally, a database is a computerized record-keeping system in which large amounts of information may be stored in a structured manner for ease of subsequent retrieval and processing. Large databases are generally managed through data base management systems (DBMS's). A DBMS provides an operational environment through which a user may retrieve or update previously stored information. In one type of DBMS, referred to as a relational database system, information is stored in tables, with each table having one or more columns and one or more rows. Each column in a table is referred to as an attribute of the table. Each row in a table is referred to as a record. Thus, a table typically comprises a plurality of records (rows), each of which has a plurality of attributes (columns). By way of example, a business might maintain a database of employee information. In this example, each record may be associated with an employee, and the attributes of each record can identify information such as the employee's name, social security number, address, employee number, department, position, salary, hire date and any other information the business deems useful.
One well-known example of a relational database system is the DB2® database environment. (DB2® is a registered trademark of International Business Machines Corporation of Armonk, N.Y.). Within a DB2® environment, database tables are created within a tablespace. One type of tablespace within the DB2® environment is a segmented tablespace, which may contain more than one DB2 table. The segments are data structures or groups of pages that hold rows of a single table so that rows of different tables are not stored in the same segment. A page is a data structure or a unit of storage within the tablespace. The size of the pages is determined at tablespace creation and may be specified as 4K, 8K, 16K, or 32K, for example.
In a DB2® environment, there often is a need to load large amounts of data into DB2® tables. For example, data is loaded to initially populate a table, to periodically replace the data in a table, to periodically add data to a table, or to reload data into a table. This data can originate from a variety of sources, such as other DB2® tables (possibly from a different DB2® system) or other database management systems (DBMSs), including IMS or distributed-systems DBMS applications that generate sequential files. In many cases, users load millions of rows at a time. Because data in the tablespace is often unavailable to applications during the loading process, the loading process is preferably completed as quickly as possible.
LOAD utilities are known in the art for loading data into a tablespace. DB2® LOAD utilities include those by BMC Software, IBM, Computer Associates and CDB Software, Inc. LOAD utilities load data from a source (typically a sequential file) into one or more tables of a tablespace. If the tablespace already contains data, users can choose to add the new data to the existing data using a LOAD RESUME operation in which rows of new data are added at the end of the already existing data. Alternatively, the user can choose to replace the existing data of the tablespace using a LOAD REPLACE operation in which all the existing data for all tables in the tablespace are replaced by new rows.
Referring FIG. 1, a LOAD RESUME operation according to the prior art is schematically shown performed on a segmented tablespace 10. The RESUME option tells the LOAD utility to add new table data A′ to an existing tablespace 10, which may be empty or may already contain table data A. For example, a RESUME NO operation tells the LOAD utility that the tablespace is empty, whereas as RESUME YES operation indicates that the tablespace might or might not be empty. Using LOAD RESUME, the new table data A′ is written after the end 12 of the existing tablespace's data, as shown in the resultant tablespace 14.
Referring to FIG. 2, a LOAD REPLACE operation according to the prior art is schematically shown on a segmented tablespace 20. The REPLACE option tells the LOAD utility to delete the existing data before loading—writing over all tables in the tablespace starting at the beginning of the tablespace 20. Therefore, if an existing tablespace 20 has tables A and B and a LOAD REPLACE operation is invoked with new table data A′, the LOAD utility loads the new table data A′ starting at the beginning of the tablespace and all other data is deleted, as shown in the resultant tablespace 22.
As noted above, users of DB2® databases currently cannot use a LOAD utility to replace some data tables (e.g., a REPLACE operation) and augment other tables (e.g., a resume operation) in a single load job when those tables reside in a common multi-table segmented tablespace. If using a LOAD utility is desirable, the user must unload rows for the unchanging tables and replace the entire tablespace with rows for all desired tables. Alternatively, the user must somehow delete the rows of the target tables and then use a LOAD utility to add the new rows to the tablespace. These prior art techniques are described below.
Referring to FIG. 3A, one prior art technique for replacing a table on a segmented tablespace using a LOAD utility without destroying data of other tables is schematically illustrated. In this technique, the user stores different tables (table A and B) in separate segmented tablespaces 26 and 28. In this way, a LOAD REPLACE operation can be performed against table A in tablespace 26 without affecting table B in tablespace 28. While this approach permits a user to modify and/or replace table data without concern or danger of modifying data in other tables, it is not an efficient use of resources.
Referring to FIG. 3B, another prior art technique is schematically illustrated. In this technique, the user uses structured query language (SQL) commands in a first pass (Pass 1) of the tablespace 30 to delete all the rows of a targeted table (e.g., table A). Then, in a second pass (Pass 2) of the tablespace 30, a LOAD RESUME operation is used on the resultant tablespace 32 to load the new table data A′, which is appended to the end 34 of the segmented tablespace, as shown in the resultant tablespace 36. However, this prior art technique requires a number of separate operations and is not an efficient use of resources.
Referring to FIG. 3C, yet another prior art technique is schematically illustrated. In this technique, the user unloads all the data of the tables (e.g., tables A and B) from the tablespace 40 in a first pass (Pass 1) of the tablespace 40. The unloaded table being replaced (e.g., table A) is discarded. Then, in a second pass (Pass 2) of the tablespace 40, the user uses a LOAD REPLACE operation to load the new table data A′ and the unchanging table data B into the tablespace, as shown in the resultant tablespace 42. Again, this prior art technique requires a number of separate operations and is not an efficient use of resources.
Therefore, a need exists for a feature that will allow a table or multiple tables to be replaced in a multi-table segmented tablespace without modifying other tables residing in that tablespace. Furthermore, a need exists for a feature that will allow a user to use a LOAD utility to load a pre-existing tablespace replacing all rows in specified tables and adding rows to other tables and that will perform the load in a single pass of the tablespace. The subject matter of the present disclosure is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.