1. Field of the Invention
This invention relates in general to computer-implemented database systems, and, in particular, to retrieving and processing large object data by using a stored data length in a computer.
2. Description of Related Art
Databases are computerized information storage and retrieval systems. A Relational Database Management System (RDBMS) is a database management system (DBMS) which uses relational techniques for storing and retrieving data. Relational databases are organized into tables which consist of rows and columns of data. The rows are formally called tuples or records. A database will typically have many tables and each table will typically have multiple tuples and multiple columns. Tables are assigned to table spaces. A table space is associated with direct access storage devices (DASD), and, thus, tables are stored on DASD, such as magnetic or optical disk drives, for semi-permanent storage.
A table space can be a system managed space (e.g., an operating system file system) or a database managed space. Each table space is physically divided into equal units called pages. Each page, which may contain, for example, 4K bytes, holds one or more rows of a table and is the unit of input/output (I/O). The rows of a table are physically stored as records on a page. A record is always fully contained within a page and is limited by page size.
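The page constraint described above can be illustrated with a minimal Python sketch. It assumes a 4K page with no header or slot overhead and a simple sequential fill; the names PAGE_SIZE and pack_records are hypothetical and are not taken from any particular DBMS.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (the "4K" example above)


def pack_records(records, page_size=PAGE_SIZE):
    """Place records on pages in order; a record never spans two pages."""
    pages = [[]]
    free = page_size
    for rec in records:
        if len(rec) > page_size:
            # A record is limited by the page size, so this cannot be stored.
            raise ValueError("record of %d bytes exceeds page size" % len(rec))
        if len(rec) > free:
            # Not enough room left on the current page: start a new one.
            pages.append([])
            free = page_size
        pages[-1].append(rec)
        free -= len(rec)
    return pages
```

With records of 3000, 2000, 2000, and 100 bytes, this sketch uses three pages, because the second record does not fit in the space left on the first page and the fourth does not fit in the space left on the second.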
Traditionally, a DBMS stored simple data, such as numeric and text data. In a traditional RDBMS, the underlying storage management has been optimized for simple data. More specifically, the size of a record is limited by the size of a data page, which is a fixed number (e.g., 4K) defined by a computer developer. This restriction in turn poses a limitation on the length of columns of a table. To alleviate such a restriction, most computer developers today support a new built-in data type for storing large objects (LOBs). Large objects, such as image data, may take up a great deal of storage space. As a result, users frequently compress LOB data. Compressed LOB data takes up less storage space and fits within fewer pages.
DBMSs use a variety of models to retrieve compressed LOB data. The models typically force decompression of the compressed LOB data to determine where a particular byte or range of bytes is stored within the LOB table space. Such decompression may lengthen the time of processing LOB data, for example, when a built-in function is applied to the LOB data. The RDBMS has a number of built-in functions that simplify or automate some types of data processing. Typical built-in functions include column functions and scalar functions. A column function returns a single value as a result. An average (AVG) function is an example of a column function. The AVG function calculates the average of the values of a column across multiple rows. Like a column function, a scalar function produces a single value as a result. However, a column function operates on one column for multiple rows, whereas a scalar function operates on one column in a single row. A substring function is an example of a scalar function. The substring function enables a user to extract a portion of the LOB data.
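The distinction between a column function and a scalar function can be sketched in Python as follows. The functions avg and substr are illustrative stand-ins, not the RDBMS built-ins themselves, and the 1-based start position is an assumption modeled on common SQL substring conventions.

```python
def avg(rows, column):
    """Column function: reduces one column over many rows to a single value."""
    values = [row[column] for row in rows]
    return sum(values) / len(values)


def substr(value, start, length):
    """Scalar function: operates on one column value from a single row.

    `start` is treated as 1-based (an assumption of this sketch).
    """
    return value[start - 1:start - 1 + length]


rows = [
    {"price": 10.0, "note": "large object"},
    {"price": 20.0, "note": "small object"},
    {"price": 30.0, "note": "other"},
]
average_price = avg(rows, "price")         # one result for all rows
fragment = substr(rows[0]["note"], 7, 6)   # one result for one row
```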
Processing compressed LOB data is generally slow. Before processing can begin, the LOB data is decompressed to determine where a particular byte or range of bytes is stored within the LOB data. Decompression consumes a considerable amount of elapsed time. Therefore, there is a need in the art for an improved technique for accessing compressed data.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for retrieving data in a computer.
In accordance with the present invention, large object data is compressed until the large object data fits within one data page. An uncompressed large object data length is stored in a large object map, wherein the stored uncompressed large object data length is associated with the compressed large object data. A portion of the compressed large object data is located for performing a data processing function by using the stored uncompressed large object data length. The portion of the large object data is stored in the database.
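One way to picture the use of a stored uncompressed length is the following Python sketch. It is an illustration under stated assumptions, not the claimed implementation: it assumes the large object is compressed in independent chunks, that the map holds one (uncompressed length, compressed bytes) entry per chunk, and that zlib stands in for the compression method. The names CHUNK_SIZE, build_lob_map, and substring are hypothetical.

```python
import zlib
from bisect import bisect_right
from itertools import accumulate

CHUNK_SIZE = 4096  # assumed chunk size; not specified in the text above


def build_lob_map(data, chunk_size=CHUNK_SIZE):
    """Compress data chunk by chunk, recording each chunk's uncompressed length."""
    lob_map = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        lob_map.append((len(chunk), zlib.compress(chunk)))
    return lob_map


def substring(lob_map, offset, length):
    """Return (bytes, chunks_decompressed) for the range [offset, offset + length).

    The stored uncompressed lengths locate the chunks that hold the range,
    so only those chunks are decompressed.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    # Cumulative uncompressed end offset of each chunk.
    ends = list(accumulate(n for n, _ in lob_map))
    end = offset + length
    idx = bisect_right(ends, offset)  # first chunk that ends after `offset`
    out = bytearray()
    touched = 0
    while idx < len(lob_map) and length > 0:
        chunk_len, blob = lob_map[idx]
        chunk_start = ends[idx] - chunk_len
        if chunk_start >= end:
            break  # the requested range ends before this chunk begins
        chunk = zlib.decompress(blob)
        out += chunk[max(offset - chunk_start, 0):end - chunk_start]
        touched += 1
        idx += 1
    return bytes(out), touched
```

In this sketch, a request for 100 bytes starting at offset 5000 of a 16384-byte object decompresses only the second of four chunks, rather than the whole object.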