1. Field of Invention
The present invention relates to database systems. More specifically, the invention relates to enabling data in variable arrays (VARRAYs) to be efficiently stored, particularly when a size of a VARRAY column is not specified.
2. Description of the Related Art
The amount of memory space available in computing systems is typically limited, so it is crucial to efficiently allocate the memory space for usage. As such, the determination of where and how to store data within a computing system may be based upon how much data is to be stored. For example, in some instances, a relatively small amount of data may be stored as inline data in a column of a table, whereas a relatively large amount of data may be stored as an object that is referenced by the table.
A variable array or VARRAY is a variable-length ordered list of values of one type that may be specified by a user and created in a database, as for example a database available from Oracle, Inc. of Redwood Shores, Calif. That is, a VARRAY is an array, or an ordered set of data elements of the same type, of variable size. Each element in a VARRAY generally has an index, which is a number that corresponds to the position of the element in the VARRAY. The number of elements in a VARRAY is typically considered to be the size of the VARRAY. Typically, when a VARRAY type is declared, a maximum size for the VARRAY, or an array limit, is specified.
When creating a VARRAY, or a table of VARRAY columns, if there is no storage clause for the VARRAY, i.e., if a user declines to specify an array limit, a VARRAY column is created based on a maximum possible size for the VARRAY column. The maximum possible size is generally based on the element size of elements to be stored in the VARRAY, a maximum number of elements that may potentially be stored in the VARRAY, and a necessary amount of system control information.
FIG. 1 is a process flow diagram which illustrates one method of creating a VARRAY, or a table of VARRAY columns, and storing data associated with the VARRAY. A method 10 of creating and using a table of VARRAY columns begins at step 14 in which instructions to create a VARRAY type, VARRAY, are received. Once the instructions are received, it is determined whether the array limit for the VARRAY type, or a maximum number of elements which may be contained in a VARRAY, is specified in step 18.
If the determination in step 18 that the size of a VARRAY column is specified, then a VARRAY column is determined to be created based on the specified size in step 22. Alternatively, if it is determined in step 18 that the size of the VARRAY column is not specified, i.e., that there is no storage clause for the table of VARRAY columns, then a VARRAY column is determined to be created in step 26 based on a maximum possible array size. Typically, the maximum possible array size may be determined by a maximum number of elements multiplied by the maximum size of an element, plus bytes needed to store system control information.
From steps 22 and 26, process flow moves to step 30 in which it is determined if the maximum size of a VARRAY column is less than approximately 4K bytes. As will be appreciated by those skilled in the art, the maximum size of data that may generally be stored inline per row for a column is approximately 4K bytes. If it is determined that the maximum size of a column is less than approximately 4K bytes, data is stored in the VARRAY column as inline raw data in step 38, and the method of creating a table of VARRAY columns and storing data in the VARRAY columns is completed. Alternatively, if it is determined that the maximum size of a VARRAY column is greater than or approximately equal to 4K bytes, then data is effectively stored in the VARRAY column as a kernel internal large object (LOB), and the method of creating a table of VARRAY columns and storing data in the VARRAY columns is completed.
When an array limit, or a maximum number of elements, for a VARRAY type is not specified when the VARRAY type is created, a LOB is generally allocated for data storage if the maximum possible size of each VARRAY column is greater than approximately 4K bytes. With reference to FIG. 2, a LOB which is allocated for storage for a VARRAY column will be described. A VARRAY 200 that either has a maximum possible size of approximately 4K bytes or more when no array limit is specified, or has a specified array limit of approximately 4K bytes or more, is arranged to have data stored in an object space 204 as a LOB 208. When a LOB 208 is allocated for the storage of VARRAY data 216, LOB header information or overhead 212 adds at least another several bytes, as for approximately thirty bytes, to the size of LOB 208.
When LOB 208 is allocated because VARRAY 200 has a maximum possible size of approximately 4K bytes or more, and no array limit is specified, VARRAY data 216 stored in LOB 208 may be much less than approximately 4K bytes. That is, the size of actual VARRAY data 216 may be suitable for storage as a LOB locater with inline data, even though VARRAY data 216 is stored in LOB 208 because LOB 208 has been allocated. When VARRAY data 216 is significantly less than approximately 4K bytes, as for example less than approximately thirty bytes, but stored in LOB 208, the extra bytes associated with header overhead 212 may represent significant storage overhead. When header overhead 212 represents relatively significant storage overhead, performance issues may arise during data write and data read operations. Such performance issues may adversely affect the efficiency with which data write and data read operations may be performed.
Storing data in a LOB segment, i.e., out of line, often has significant performance impacts due to additional I/O operations which are used to fetch the stored data. In addition, storage in a LOB segment is typically allocated in discrete chunks, e.g., in some multiple of a database block size. Hence, even if an amount of data to be stored is significantly less than the size of the discrete chuck into which the data is to be stored, the entire chunk would be allocated, which results in significant overhead.
VARRAY types generally have an evolution feature which enables a user to alter, e.g., increase, the size of a VARRAY type. The size of a VARRAY type may be increased in response to an increase in the size of an element type stored in VARRAY columns, or generally in response to an increase in a limit size of a VARRAY type. When the limit size of a VARRAY type is increased and results in the maximum size of a VARRAY column increasing to approximately 4K bytes or greater, then VARRAY data that is previously stored in the VARRAY columns generally is no longer stored as inline raw data, and is instead stored in a LOB. a LOB
FIG. 3 is a diagrammatic representation of how VARRAY data is stored before and after a limit size of a VARRAY column is increased from less than approximately 4K bytes in size to at least approximately 4K bytes in size. A VARRAY column 300 that is less than approximately 4K bytes in size is arranged to have its data stored inline in a table space 330. When a limit size of VARRAY column 300 is increased, as for example to approximately 4K bytes or greater in size, VARRAY column 300′ which is approximately 4K bytes or greater results. In addition, a LOB 346 in an object space 342 is created to store data associated with VARRAY column 300′.
Existing VARRAY data, which is stored in table space 330′ as inline raw data, is effectively moved into LOB 346. In general, each row of existing VARRAY data of a column is stored in a separate LOB 346. A LOB locator 338 is then stored into table space 330′ to identify LOB 346.A LOB
The need to essentially modify a VARRAY image to be stored in LOB 346, and to insert LOB locator 338 into table space 330′ is often inefficient, as the process of storing previously inlined raw data into LOB 346, and inserting LOB locator 338 into table space 330′ is relatively slow. When there are a relatively large number of rows associated with VARRAY column 300′, the process of storing VARRAY data that was previously inlined as raw data in table space 330 may be particularly time consuming. Further, evolving a VARRAY type typically requires that existing data be modified, which is also time consuming and may utilize a significant amount of overhead.
Therefore, what is needed is a method and an apparatus which allows VARRAY types to be efficiently created, particularly when array limits are not specified, and also allows for less time-consuming evolution features associated with the VARRAY types. That is, what is needed is a system which substantially minimizes the occurrences of storing relatively small amounts of data in kernel internal LOBs, and also significantly reduces the need to effectively copy inline raw data into kernel internal LOBs when a VARRAY limit size is increased. typically time consuming operations relating to data stored in VARRAY columns