The present invention relates generally to databases, and more particularly to address registers used by databases to control access to data contained within a database.
Databases typically belong to one of two major classes: object-oriented and relational. In an object-oriented database, an object typically consists of a unique object identifier (OID), coupled with a variable-sized block of bytes. In relational databases, data is typically stored in blocks of fixed sizes. Regardless of the type of database, it is a critical function of the database to keep track of the physical location in the storage medium of all data in the database. Both relational and object-oriented databases employ data block IDs to identify the blocks of data to be tracked. Databases generally track the physical location of data using one of two schemes: logical address registers (logical ID maps) and physical addresses.
Logical address registers use tuples to provide a one-to-one mapping between a logical address of a block of data and the physical address of that data. The database "refers" to the block of data by its logical name, or "logical ID", which is used to look up the physical address of that block of data in the logical address register.
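The lookup described above can be sketched as follows. This is a minimal illustration only: the names `LogicalAddressRegister`, `read_block`, and the in-memory `storage` dictionary are assumptions made for the example, and a real register would reside on the storage media rather than in a Python dictionary.

```python
# Hypothetical sketch; every name here is illustrative and not taken from the source.

class LogicalAddressRegister:
    """One-to-one map from a logical ID to a physical address."""

    def __init__(self):
        self._map = {}  # logical ID -> physical address

    def register(self, logical_id, physical_address):
        self._map[logical_id] = physical_address

    def lookup(self, logical_id):
        return self._map[logical_id]


def read_block(register, storage, logical_id):
    """Resolve a logical ID to its block of data."""
    physical_address = register.lookup(logical_id)  # first access: the register
    return storage[physical_address]                # second access: the block itself


# Stand-in for the storage media: physical address -> block of data.
storage = {0x1000: b"block A", 0x2000: b"block B"}

register = LogicalAddressRegister()
register.register("A", 0x1000)
register.register("B", 0x2000)

block = read_block(register, storage, "A")
```

Each call to `read_block` performs two lookups, which mirrors the two storage accesses that a persistent logical address register entails.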
Because a logical address register must persist across database "opens" and "closes," using such a register requires two accesses of the storage media: one to consult the register and one to retrieve the block. The register is kept on the storage media because retaining it in volatile memory makes the database much less robust, and because, in order to avoid serious limitations on the number of data blocks a database can track, the logical address register must be permitted to grow larger than what can be stored in the volatile memory of typical hardware systems.
Accessing the storage media is one of the bottleneck functions of a database, especially in distributed databases. Logical operations and accesses of data held in volatile storage occur much more rapidly than accesses of data held in stable storage media, such as a hard drive.
Physical addresses as part of data blocks' IDs are therefore necessary for high-performance databases, in order to reduce the number of times the storage media must be accessed when a data block is referenced. Databases which use the physical address scheme for tracking data blocks use IDs which contain the actual physical address of the respective blocks (rather than the logical address), so that each reference to the block inherently contains the information necessary to physically locate the block on the storage media. In this way, the database can access the data block with only a single access of the storage media.
However, it is also necessary that a database be able to relocate blocks of data from one physical location on the storage media to another. For example, in object-oriented databases, the size of an object may outgrow the physical space available at its present location on the storage media. Also, a database's performance can be enhanced by strategic relocation of data blocks. For example, data blocks which are related to each other are preferably located together on the storage media so that they can be accessed as a group. Under the physical address scheme, however, every reference to a block within the database must be identified, and the ID in each reference amended to reflect the block's new location, concurrently with the relocation of the block. Including the physical address as part of the data block's ID therefore makes movement of data blocks from one location on the storage media to another time consuming, and expensive in terms of consumption of hardware resources.
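The relocation cost described above can be illustrated with a short sketch. It assumes a toy store in which each block carries a list of physical addresses it refers to; the `relocate` function and the `"refs"` field are hypothetical names chosen for the example.

```python
# Hypothetical sketch; the storage layout and names are assumptions for illustration.

def relocate(storage, old_addr, new_addr):
    """Move a block and rewrite every reference that embeds its old address.

    Returns the number of references that had to be rewritten.
    """
    storage[new_addr] = storage.pop(old_addr)
    rewritten = 0
    # With physical addresses embedded in IDs, every block must be scanned.
    for block in storage.values():
        for i, ref in enumerate(block["refs"]):
            if ref == old_addr:
                block["refs"][i] = new_addr
                rewritten += 1
    return rewritten


# Physical address -> block; "refs" holds the physical addresses of referenced blocks.
storage = {
    100: {"payload": "customer", "refs": []},
    200: {"payload": "order-1", "refs": [100]},
    300: {"payload": "order-2", "refs": [100]},
}

rewritten = relocate(storage, 100, 400)
```

In this sketch the work done by `relocate` grows with the total number of references in the store, not with the size of the block being moved, which is the expense the passage above describes.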
Thus, there is a need for a database which employs a data tracking scheme which does not always require two accesses of the storage media in order to access a block of data, but which is able to relocate data blocks within the storage media more easily than is possible for databases relying on physical addresses as a part of the data block IDs.
In accordance with the present invention, a data tracking scheme for a database is provided which employs a "last-known location" register as a part of a data block's ID. In certain object-oriented databases embodying the present invention, for example, when an object is created, it is assigned a physical address, which is then included as an extension of the OID, and which is recorded in a logical address register. When the object is moved, rather than identifying every reference to the object within the database, only the physical address in the logical address register is updated. When a reference to the object is encountered during the operation of the database, the last-known-location extension of the OID is consulted for a valid last-known location, that is, a valid physical address. If such a valid last-known location exists, that physical location is accessed in order to retrieve the object. If the last-known-location extension of the OID contains an invalid last-known location, or if the physical address indicated contains something other than the desired object, the logical address register is accessed and the correct physical address is found. At this point, the reference to the object may (but need not) update the last-known-location extension of the OID of the target object.
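The following is a minimal sketch of the last-known-location scheme for the object-oriented case, under stated assumptions: the classes `Ref` and `Store`, the `register_reads` counter, and the in-memory dictionaries standing in for the storage media are all hypothetical and are not drawn from any particular embodiment. Validity of a last-known location is checked here by comparing the OID found at that address with the OID in the reference, which is one possible test among others.

```python
# Hypothetical sketch; all names and the validity check are assumptions for illustration.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ref:
    """A reference: the OID plus a last-known-location extension."""
    oid: str
    last_known: Optional[int] = None


class Store:
    def __init__(self):
        self.blocks = {}         # physical address -> (oid, payload)
        self.register = {}       # logical address register: oid -> physical address
        self.register_reads = 0  # counts fallbacks to the register

    def create(self, oid, addr, payload):
        self.blocks[addr] = (oid, payload)
        self.register[oid] = addr
        return Ref(oid, last_known=addr)

    def move(self, oid, new_addr):
        old_addr = self.register[oid]
        self.blocks[new_addr] = self.blocks.pop(old_addr)
        self.register[oid] = new_addr  # only the register entry is updated

    def fetch(self, ref):
        # Try the last-known location first.
        if ref.last_known is not None:
            hit = self.blocks.get(ref.last_known)
            if hit is not None and hit[0] == ref.oid:
                return hit[1]
        # Stale or missing: fall back to the logical address register.
        self.register_reads += 1
        addr = self.register[ref.oid]
        ref.last_known = addr  # optional refresh of the extension
        return self.blocks[addr][1]


store = Store()
ref = store.create("obj-1", 10, "v1")
value_before_move = store.fetch(ref)   # served from the last-known location

store.move("obj-1", 20)                # existing references are not touched
store.create("obj-2", 10, "other")     # old address now holds a different object

value_after_move = store.fetch(ref)    # stale location detected, register consulted
value_again = store.fetch(ref)         # refreshed location, register not consulted
```

In this sketch `move` updates a single register entry regardless of how many references exist, and a stale reference pays for the register lookup only once, after which its extension is refreshed.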
In another form of the invention, the database embodying the present invention is a relational database. When a record in certain relational databases employing the data tracking scheme of the present invention includes a foreign key, for example, the record also includes a "hidden field" (that is, a field accessible only by the database engine) containing any last-known physical address of the record identified by the foreign key. When the database needs to access a record identified by a foreign key, the database attempts to locate the desired record without referring to the index of records containing the needed record by looking for a physical address in the hidden field, and, if it finds one, the database looks for the needed record at that address. If the hidden field does not contain a valid physical address, or if the physical address it contains is inaccurate, the database locates the needed record through the appropriate index, and the physical address in the hidden field can be updated.
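The relational form can be sketched in the same spirit. The table layout, the hidden field name `_customer_addr`, and the function `fetch_customer` are assumptions made for this example; an actual engine would keep the hidden field out of reach of user queries.

```python
# Hypothetical sketch; table layout and field names are assumptions for illustration.

def fetch_customer(order, heap, index, stats):
    """Follow the foreign key in `order`, trying the hidden field before the index."""
    addr = order["_customer_addr"]
    record = heap.get(addr) if addr is not None else None
    # The hidden address is inaccurate if nothing is there or the key does not match.
    if record is None or record["customer_id"] != order["customer_id"]:
        stats["index_lookups"] += 1
        addr = index[order["customer_id"]]
        order["_customer_addr"] = addr  # refresh the hidden field
        record = heap[addr]
    return record


customers_heap = {7: {"customer_id": "C1", "name": "Ada"}}  # physical address -> record
customers_index = {"C1": 7}                                 # primary key -> physical address
stats = {"index_lookups": 0}

# The hidden field starts empty, so the first access goes through the index.
order = {"order_id": "O1", "customer_id": "C1", "_customer_addr": None}

first = fetch_customer(order, customers_heap, customers_index, stats)
second = fetch_customer(order, customers_heap, customers_index, stats)
```

After the first call the hidden field holds address 7, so the second call in this sketch reaches the record without consulting the index.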
One object of the present invention is to provide a database capable of accessing data blocks in fewer than two accesses of the storage media, which is also capable of tracking the relocation of a data block by amending the physical address in only a single reference to the relocated data block. Other objects and advantages of the present invention will be apparent from the following description.