1. Field of the Invention
The present invention relates generally to database management systems and, more particularly, to mechanisms within computer-based database management systems for reorganizing a DBMS database using peepholes while allowing concurrent data manipulation.
2. Description of Related Art
Databases are computerized information storage and retrieval systems. A Relational Database Management System (RDBMS) is a database management system (DBMS) which uses relational techniques for storing and retrieving data. RDBMS software using a Structured Query Language (SQL) interface is well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO).
A typical relational database management system includes both database files and index files. The database files store data in the rows and columns of tables stored on data pages while index keys, used for faster reference of the data, are stored on index pages. A page is a physical unit of transfer between main storage and secondary storage. In such a table, the rows may correspond to individual records while the columns of the table represent attributes of the records. For example, in a customer information table of a database management system, each row might represent a different customer data object while each column represents different attributes of the customers, such as the name of a particular customer, the amount owed by the customer and the cash receipts received from the customer. The actions of a transaction that cause changes to recoverable data objects are recorded in a log file or data set.
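The customer information table described above can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in RDBMS; the table name, column names, index name and sample values are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database used purely for illustration (an assumption, not
# the DBMS discussed in the text).
conn = sqlite3.connect(":memory:")

# Each row is one customer; each column is one attribute of the customer.
conn.execute(
    "CREATE TABLE customer ("
    " name TEXT PRIMARY KEY,"
    " amount_owed REAL,"
    " cash_receipts REAL)"
)

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        ("Acme", 250.0, 1000.0),
        ("Birch", 0.0, 400.0),
        ("Cedar", 120.5, 75.0),
    ],
)

# An index stores keys separately from the table data so that rows can
# be located without scanning every row.
conn.execute("CREATE INDEX idx_customer_owed ON customer (amount_owed)")

# Retrieve the customers who owe more than a given amount.
rows = conn.execute(
    "SELECT name FROM customer WHERE amount_owed > ? ORDER BY name",
    (100.0,),
).fetchall()
```

Here the rows correspond to individual customer records and the columns to their attributes, while the index plays the role of the index keys kept apart from the table data.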
The increasing popularity of electronic commerce has prompted many companies to turn to application servers to deploy and manage their applications effectively. Quite commonly, these application servers are configured to interface with a database management system (DBMS) for storage and retrieval of data. This often means that new applications must work with distributed data environments. As a result, application developers frequently find that they have little or no control over which DBMS product is to be used to support their applications or how the database is to be designed. In many cases, developers find out that data critical to their application is spread across multiple DBMSs developed by different software vendors.
Data in a database of a Database Management System (DBMS) can, over time, become unordered and make inefficient use of data storage space. This is rectified by a reorganization process where the data sequence order is restored and the data is distributed within the available data space based upon some predefined criteria.
One presently available reorganization method involves unloading the data, sorting it and reloading the sorted data into the DBMS database. If there are indexes to the data records, they are typically reorganized as part of this operation. This results in a perfectly organized database with perfect restoration of free space and empty pages, albeit for a short period of time. Moreover, the reorganization is highly disruptive because the data are usually unavailable to applications for updates during this process. Newer methods provide data availability by first reorganizing a copy of the data and then applying any updates made since the copy was taken, which requires significant temporary workspace.
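The unload, sort and reload sequence can be sketched as follows. This is a simplified illustration only: the database is modeled as a list of pages, each page being a list of (key, payload) records, and the page capacity, free-space setting and function names are hypothetical assumptions rather than part of any actual DBMS.

```python
# Hypothetical model: a "database" is a list of pages, and each page is
# a list of (key, payload) records. All names and sizes are illustrative.

PAGE_CAPACITY = 4   # assumed maximum number of records per page
FREE_SLOTS = 1      # assumed free slots to leave on each reloaded page


def unload(pages):
    """Copy every record out of the pages into one flat list."""
    return [record for page in pages for record in page]


def reload(records, capacity=PAGE_CAPACITY, free_slots=FREE_SLOTS):
    """Pack records into fresh pages, leaving free_slots empty per page."""
    per_page = capacity - free_slots
    return [records[i:i + per_page] for i in range(0, len(records), per_page)]


def reorganize(pages):
    """Offline reorganization: unload, sort by key, then reload."""
    records = unload(pages)
    records.sort(key=lambda record: record[0])
    return reload(records)


# Pages whose records are out of key order and unevenly filled.
fragmented = [
    [(7, "g"), (2, "b")],
    [(9, "i")],
    [(1, "a"), (5, "e"), (3, "c"), (8, "h")],
]
reorganized = reorganize(fragmented)
```

In this sketch the original pages cannot safely be updated while reorganize runs, since any change made after the unload step would be lost on reload, which mirrors the disruption described above.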
Other conventional techniques reorganize the data in place while keeping the data available. These online reorganization methods achieve reasonably good organization and restoration of free space and empty pages. They are less disruptive because the data are unavailable only during the switch between the shadow and the original database. However, these current techniques require significant temporary workspace, and the reorganization has to take place either on the same processing unit as the users' applications or on another processor that is tightly coupled to that processing unit.
Therefore, there is a need for a method and a system using a non-disruptive DBMS reorganization technique that allows concurrent data manipulation, is designed for a loosely coupled or auxiliary processor, and uses temporary workspace efficiently, thus optimizing data storage utilization and system efficiency.