Recently, a trend has developed to extend database systems to handle nontraditional data types (e.g., images, text, and audio data). In particular, it has become important to provide database systems that handle user-defined "large objects" (LOBs). LOBs may be much larger than traditional data types. For example, a single LOB may include four gigabytes of data.
Because of their size, LOBs cannot be handled efficiently with the same techniques used to handle traditional data types. For example, a conventional database system consists of one or more clients ("database applications") and a server (a "database server"). When a client requires data, the client submits to the server a query that selects the data. The server retrieves the selected data from a database and returns copies of the selected data to the client that submitted the query. When the selected data items are LOBs, the amount of data that would be returned to the client could be enormous. Consequently, automatically sending an entire LOB would be inefficient and time-consuming, particularly when the client is interested in viewing or updating only a relatively small subset of the LOB.
The size of LOBs also results in space management difficulties within the database system. In typical database systems, it is important to be able to supply data items as they existed at a particular point in time. To do this, database systems typically either store data that allows data items to be reconstructed as they existed as of a given time, or store multiple versions of data items. In either case, the amount of data that would have to be stored to support LOBs could be enormous. The storage usage problems thus created can be mitigated by reclaiming space that is no longer required by LOBs. Consequently, it is clearly desirable to provide a mechanism for efficiently maintaining information about storage that can be re-used after the LOB data contained thereon is no longer needed.
LOB data may also be thought of as a file or a stream of characters or bytes. Applications are accustomed to storing and accessing large amounts of data in a file, and the same is expected of LOBs. As in file access, applications require both random and sequential piecewise access to LOB data. Also, file operations seldom copy the whole file, and the same behavior is expected of LOB operations.
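The file-like access pattern described above can be illustrated with a minimal sketch. It uses an in-memory `io.BytesIO` object as a stand-in for a LOB; the helper names `read_piece`, `write_piece`, and `iter_pieces` are hypothetical and chosen only for illustration, and no particular database interface is implied.

```python
import io

def read_piece(lob, offset, length):
    """Random access: return `length` bytes starting at `offset`."""
    lob.seek(offset)
    return lob.read(length)

def write_piece(lob, offset, data):
    """Overwrite bytes in place at `offset` without copying the whole object."""
    lob.seek(offset)
    lob.write(data)

def iter_pieces(lob, piece_size):
    """Sequential access: yield the object one piece at a time."""
    lob.seek(0)
    while True:
        piece = lob.read(piece_size)
        if not piece:
            break
        yield piece

# A 1000-byte stand-in for a (much larger) LOB.
lob = io.BytesIO(b"0123456789" * 100)

tail = read_piece(lob, 995, 5)      # read only the last five bytes
write_piece(lob, 0, b"AB")          # update only the first two bytes
head = read_piece(lob, 0, 4)
total = sum(len(p) for p in iter_pieces(lob, 64))
```

In this sketch, neither the read nor the write touches more than the requested piece, which is the behavior applications would expect to carry over from files to LOBs.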
One approach to handling LOBs would be to deliver to a client only a subset of the LOB. However, conventional retrieval mechanisms are designed to provide fast access to entire sets of data items, such as rows, and not to sub-portions of individual data items. Thus, even after a LOB is located, the time it would take to scan through the LOB to retrieve a particular subset of interest may be unacceptably long.
Another difficulty presented by the size of LOBs relates to how users are provided consistent views of a database that includes LOBs. Specifically, some database systems provide consistent views of the database to users by generating undo records when data items are updated. When applied to an updated data item, an undo record reconstructs the data item as it existed before the update. Consequently, a user can be shown the database as of a particular point in time by applying one or more undo records to those requested data items that have been updated since that point in time.
Unfortunately, as a general rule, the larger the updated data item, the larger the undo record that must be generated in order to undo the update. Consequently, generating undo records for LOBs is inefficient and impractical due to the amount of data that would have to be generated and stored in response to every update.
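The undo-record technique and its cost for large items can be illustrated with a minimal sketch. The sketch assumes a deliberately naive design in which each undo record stores the complete prior value of the updated item; the names `UndoRecord`, `Database`, `update`, and `read_as_of` are hypothetical and do not correspond to any particular database system.

```python
from dataclasses import dataclass

@dataclass
class UndoRecord:
    key: str
    time: int          # logical time at which the update was made
    old_value: object  # full value before the update (None if absent)

class Database:
    def __init__(self):
        self.data = {}
        self.undo_log = []
        self.clock = 0

    def update(self, key, value):
        """Apply an update and log an undo record holding the prior value."""
        self.clock += 1
        self.undo_log.append(UndoRecord(key, self.clock, self.data.get(key)))
        self.data[key] = value
        return self.clock

    def read_as_of(self, key, as_of):
        """Reconstruct `key` as it existed at logical time `as_of`."""
        value = self.data.get(key)
        # Walk the log newest-first, undoing every update made after `as_of`.
        for rec in reversed(self.undo_log):
            if rec.time <= as_of:
                break
            if rec.key == key:
                value = rec.old_value
        return value

db = Database()
t1 = db.update("row", "v1")
t2 = db.update("row", "v2")
t3 = db.update("row", "v3")
as_of_t1 = db.read_as_of("row", t1)   # view as of the first update

# A one-byte change to a large item still logs the entire prior value.
big = bytes(1_000_000)
db.update("lob", big)
db.update("lob", b"\x01" + big[1:])
undo_size = len(db.undo_log[-1].old_value)
```

Under this assumption, the last undo record is as large as the item itself even though only a single byte changed, which illustrates why the approach scales poorly to LOBs.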
Based on the foregoing, it is clearly desirable to provide a mechanism to efficiently access LOBs and desired portions within LOBs. It is further desirable to provide a mechanism for reconstructing a consistent view of a database that includes LOBs.