A database system consists of a database and a database management system (DBMS). A database is an organized collection of data. A special type of a database is a relational database. In a relational database, the collection of data is organized into tables that consist of rows and columns of data. A row of data in a table is also referred to as a data record. The table columns can have a fixed length or a varying length. If at least one of the columns of a table has a varying length or if the table is compressed, then the data records can also have a varying length. A relational database typically consists of many tables and each table generally contains multiple rows and columns. Tables are the logical objects that applications that access a database perceive.
A database management system is a computer program which is employed to manage and query a database. A relational database management system is a special kind of database management systems which is used to manage and administrate a relational database. It is a database management system that employs relational techniques for storing data in and retrieving data from a relational database.
Table spaces are the physical objects in which the data is stored. A table space contains one or more tables and is typically divided into a plurality of data pages and into several space map pages. Each data page belongs to a single table only. In a multi-table table space, each space map page may cover data pages that belong to different tables. However, for a single-table table space, each space map belongs to a single table only. In many cases the data pages of a table are co-located in so-called segments, which contain data pages of a single table only. A space map page covers data pages of one or more segments. All data pages within a single table space are characterized by the same page size. Data in form of data records is stored in one of the data pages, usually on a single data page. A data page can contain one or more data records. Each space map page of the table space relates to a fixed number of data pages. The fixed number of data pages to which a space map page relates to is also called a set of data pages. A space map page maintains free space information on the corresponding set of data pages. Database systems exploit the space map pages in order to identify data pages that provide enough storage space for additional data records to be inserted. In order to avoid excessive updates of the space map pages and in order to maximize the number of data pages covered by a single space map page, each space map page maintains only a small number of values that approximate the amount of available storage space of a data page. Thus the space information provided by the space map page is not completely accurate. In a table space, a space map page usually precedes the data pages that it covers. Instead of a single space map page, a set of data pages may also be covered by a space map that spans multiple pages, which conceptually does not make a difference.
When an application requests the database management system to insert a data record into a table of the database, one or more space map pages that cover the table are accessed for choosing a suitable data page for insertion of the data record. Some database management systems as for example the DB2 database management system from IBM Corporation use a so called clustering index in order to identify a data page into which a data record should be stored. The database management system attempts to insert the data record in the data page where adjacent key values already reside. Maintaining data records in the clustering key order enables more efficient data retrieval when the clustering index is used to retrieve a set of data records within a key range.
If there is no storage space available on the data page that is identified by use of the clustering index, it is preferable to insert the data record within close range of this data page. If the table is later accessed by use of a blocked read technique such as sequential prefetch, the inserted record would then still be read in the same input/output operation. If all data pages within close range cannot accommodate the new data record, allocated space is reused for insertion of the new data record if possible in order to avoid the growth of a table space. If the allocated space is optimally reused, insertion operations can often result in a scan of many if not all space map pages of the table space which is usually very time consuming.
The approach disclosed in the patent U.S. Pat. No. 6,668,263 basically describes a method which remembers the last set of space map pages that have been searched and limits the total amount of space searched when inserting a new record. In this way it is ensured that the performance of the database does not degrade too much. However, data pages with sufficient free space are found only if they are covered by the last set of space map pages that has been searched previously.
In the patent U.S. Pat. No. 6,334,134, a method for insertion of a data record in a parallel database system is disclosed. Distinct page ranges are assigned to the different instances of the database system. Each instance of the parallel database system inserts a new data record on those data pages only that are assigned to this instance and ignores the clustering index. It thereby avoids an overhead of the synchronization between the instances comprised in the parallel database system. However, the method does not provide an improved method for finding data pages that could accommodate new data records.
There is therefore a need for an improved method and data processing system for efficiently searching a data page that provides sufficient storage space for inserting a new data record.