In a typical database system, multiple users access or are "attached" to a database at a single time. To provide read-repeatability in the database environment, each user attached to the database is provided a view of the database which remains unaffected by transactions of other users which modify the contents or structure of the database during a period of attachment by the initial user.
Generally, to maintain this read-repeatability, the database server maintains multiple versions or generations of a database, each reflecting a view of the database held by one or more users. As an update is made to the database, a new generation of the database is provided and a user attaching to the database subsequent to the update but prior to the next update attaches to the new generation.
Each generation includes one or more pages (or records) which provide the various values of the database table columns. A page is a unit of allocation, typically 4 KB or 8 KB, by which the information within the database is made available to a user. As updates are made to the database, these pages are superseded by subsequent versions of the page which reflect the changes made by the most recent update. In some instances, the only difference between a new generation and its immediate predecessor may be a single change in one of these pages, with the remaining records which make up the two generations being identical.
A superseded page is maintained within the system until there are no users attached to a generation which references the superseded page. The superseded pages are erased when they are no longer referenced by attached users. One prior approach for identifying an instance during which no users are attached to a generation is to identify a time when no users are attached to the database. At this identified time when no users are attached, all superseded pages are erased and only a single, most recent generation of the database is retained. An extension to this approach is to erase these superseded pages upon first reference to a database.
In a typical database environment, instances during which no users are attached to the system occur infrequently. As a result, obsolete pages may seldom be erased, yielding uncontrolled growth of the database as remnants of prior, inactive generations are retained.
A system and method for controlling database growth in a read-repeatable environment includes, for a given active generation, an indication of the pages allocated in that active generation which are updated by a subsequent generation of the database. From that indication, obsolete pages can be determined and removed from the database, possibly being reclaimed for re-use by the database.
In particular, a system and method controls database growth in a read-repeatable environment having a database comprising a plurality of pages of memory. The database can concurrently support a plurality of active ordinal generations, each active generation represented by a respective plurality of allocated pages. The method can be embodied in a computer readable medium for distribution.
The database is managed via a hierarchy of control structures. A global control structure stores information relevant to a global view of the database. Under the global control structure can be a plurality of generation control structures, each storing information relevant to a respective ordinal generational view of the database. The hierarchy can be further extended to include structures for managing, for example, local and session level views of the database.
The global control structure includes a reference, such as a pointer, to the latest active (current) generation of the database. The global control structure also includes a global erase indicator which identifies those pages of the database which are obsolete and can be removed from the database. The global erase indicator can be a bitmap field.
The generation control structure includes a reference, such as a pointer, to the prior active (ancestor) generation of the database. The generation control structure also includes an allocation indicator and a generation erase indicator, which can both be bitmap fields. The allocation indicator identifies the pages which are allocated to the generation, while the generation erase indicator identifies which of the allocated pages have been updated by a subsequent generation of the database. The generation control structure can also include a user count, or other mechanism, to determine when the generation no longer has users accessing the data for the generation.
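By way of illustration only, the two control structures described above might be laid out as follows. This is a minimal sketch, assuming that each bitmap field is held in a Python integer in which bit p corresponds to page p. The class names, field names, and the `pages` helper are hypothetical and do not come from the description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GenerationControl:
    """Per-generation control structure (all names are illustrative)."""
    number: int                                      # ordinal generation number
    ancestor: Optional["GenerationControl"] = None   # prior active generation
    allocated: int = 0   # allocation indicator: bit p set => page p allocated
    erase: int = 0       # generation erase indicator: bit p set => page p
                         # was updated by a subsequent generation
    user_count: int = 0  # users currently attached to this generation


@dataclass
class GlobalControl:
    """Global control structure (all names are illustrative)."""
    current: Optional[GenerationControl] = None  # latest active generation
    global_erase: int = 0                        # bit p set => page p is obsolete


def pages(bitmap: int) -> List[int]:
    """Return the page numbers whose bits are set in a bitmap field."""
    return [p for p in range(bitmap.bit_length()) if (bitmap >> p) & 1]
```

For example, a generation holding pages 0, 1 and 2 would carry an allocation indicator of `0b0111`, and `pages(0b0111)` lists those page numbers.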
For each active generation of the database, an indication of each page allocated in the active generation is stored in the allocation indicator, and an indication of each allocated page of the active generation updated by a subsequent generation is stored in the erase indicator. These indicators can be bitmap fields in the control structure for each active generation.
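One way the two indicators could be maintained when an update creates a new generation is sketched below. The sketch assumes a copy-on-write scheme in which an updated page is written to a fresh page number, and assumes that the superseded page is recorded in the erase indicator of the immediately preceding generation. The `Generation` class, the `new_generation` function, and the `replaced` mapping are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Generation:
    allocated: int = 0  # allocation indicator (bit p set => page p allocated)
    erase: int = 0      # generation erase indicator


def new_generation(current: Generation, replaced: Dict[int, int]) -> Generation:
    """Create the successor of `current`.

    `replaced` maps each superseded page number to the page number that
    holds its new version (an assumed copy-on-write convention).
    """
    # The new generation starts with the same pages as its predecessor.
    child = Generation(allocated=current.allocated)
    for old_page, new_page in replaced.items():
        if not (current.allocated >> old_page) & 1:
            raise ValueError(f"page {old_page} is not allocated in the predecessor")
        child.allocated &= ~(1 << old_page)  # successor no longer uses the old page
        child.allocated |= 1 << new_page     # successor uses the new version
        current.erase |= 1 << old_page       # predecessor records the supersession
    return child
```

In this sketch the predecessor keeps its own allocation indicator unchanged, so users attached to it continue to see the superseded page.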
From a generation erase indicator, an obsolete page can be determined. That process can be triggered by the process of detaching a final accessor (user) from the active generation. In particular, the indication of an updated page from the erase indicator of an active generation can be merged with the erase indicator of an active ancestor generation. Furthermore, the indication of an updated page from the erase indicator of an active generation can be merged with the global erase indicator. The active generation can then be de-activated.
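The detach processing described above can be sketched as follows. The sketch rests on several assumptions that the description does not state: the merge targets the nearest ancestor that is still active, the global erase indicator is used only when no such ancestor exists, and the current generation is never de-activated because new users attach to it. All class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Generation:
    number: int
    ancestor: Optional["Generation"] = None
    erase: int = 0        # generation erase indicator (bitmap)
    user_count: int = 0
    active: bool = True


@dataclass
class GlobalControl:
    current: Optional[Generation] = None
    global_erase: int = 0  # global erase indicator (bitmap)


def nearest_active_ancestor(gen: Generation) -> Optional[Generation]:
    """Walk the ancestor chain, skipping generations already de-activated."""
    a = gen.ancestor
    while a is not None and not a.active:
        a = a.ancestor
    return a


def detach(glob: GlobalControl, gen: Generation) -> None:
    """Detach one user; merge erase bits when the final user leaves."""
    if gen.user_count <= 0:
        raise ValueError("no users are attached to this generation")
    gen.user_count -= 1
    # Assumption: the current generation stays active even with no users.
    if gen.user_count > 0 or gen is glob.current:
        return
    target = nearest_active_ancestor(gen)
    if target is not None:
        target.erase |= gen.erase       # an older view is still active
    else:
        glob.global_erase |= gen.erase  # no older view remains: pages obsolete
    gen.erase = 0
    gen.active = False
```

Under these assumptions, a page reaches the global erase indicator only after every older active generation has been de-activated, which preserves each remaining user's view.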
Once identified, the obsolete page can be reclaimed for reuse by the database. The obsolete pages can be found in the global erase indicator. Upon creating a new generation, the global erase indicator can be processed to remove the pages from the database.
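A minimal sketch of processing the global erase indicator when a new generation is created is given below. It assumes that reclaimed pages are placed on a free set and handed out again before the database is extended. The `PageStore` class and its methods are hypothetical names for the example, not part of the described system.

```python
from typing import List, Set


class PageStore:
    """Tracks obsolete and free pages (illustrative only)."""

    def __init__(self, next_page: int = 0) -> None:
        self.global_erase: int = 0        # global erase indicator (bitmap)
        self.free_pages: Set[int] = set()  # reclaimed pages available for reuse
        self.next_page: int = next_page    # first page number never yet allocated

    def reclaim(self) -> List[int]:
        """Move every page flagged obsolete onto the free set and clear the flags."""
        reclaimed = []
        page = 0
        bits = self.global_erase
        while bits:
            if bits & 1:
                reclaimed.append(page)
            bits >>= 1
            page += 1
        self.free_pages.update(reclaimed)
        self.global_erase = 0
        return reclaimed

    def allocate(self) -> int:
        """Reuse a reclaimed page if one exists, otherwise grow the database."""
        if self.free_pages:
            page = min(self.free_pages)
            self.free_pages.remove(page)
            return page
        page = self.next_page
        self.next_page += 1
        return page
```

Calling `reclaim()` at the start of generation creation lets the pages written for the new generation reuse obsolete space, so the database grows only when no reclaimed page is available.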
By utilizing generation erase indicators and a global erase indicator, obsolete pages of the database can be identified and removed from the database while users are attached to the database. As such, there is no longer a need to wait for a period of non-access before reclaiming obsolete pages. Run-time database growth can therefore be controlled while permitting read-repeatability of the database.