U.S. Pat. Nos. 6,128,621, 6,112,207 and 6,377,953 are each incorporated by reference herein in their entirety as background.
As stated in U.S. Pat. No. 6,128,621, computer programming languages commonly represent information structures using abstract data types (“ADTs”). A detailed discussion of ADTs is found in a book by N. Wirth, entitled “Algorithms+Data Structures=Programs” (Englewood Cliffs, N.J.: Prentice-Hall, 1976), also incorporated by reference herein in its entirety as background. ADTs may be specified in a “create type” command to a database, as described in sections entitled “CREATE TYPE” and “CREATE TYPE BODY” in ORACLE 9i SQL Reference, Release 9.2, Part Number A96540-02 also incorporated by reference herein in its entirety as background. See also an article entitled “Object SQL—A Language For The Design And Implementation of Object Databases” by Jurgen Annevelink et al. in a book entitled “Modern Database Systems” published 1995, pages 42-68 also incorporated by reference herein in its entirety as background. See also U.S. Pat. No. 6,782,394 granted to Landeck et al. on Aug. 24, 2004 and 6,470,348 granted to Govindarajan et al. on Oct. 22, 2002 both of which are incorporated by reference herein in their entirety.
An example that uses ADTs is illustrated in FIG. 1A. Specifically, as shown in FIG. 1A, a programmer 101 issues to computer 100 a command to create ADT 111 called “Address”, which contains four fields, each of which has a data type, such as “char” or “integer”, that is native to computer 100. In this example, all “char” fields (such as Street, City and State) store strings (for example, up to 80 characters long). The zip code field is stored as an integer value that is 32 bits long. Such ADTs, which are initially defined by programmer 101, are also referred to below as UPTs, and they can be used to derive a more specific type of ADT from a base type (in object-oriented programming terms). Specifically, a UPT is a pointer to an ADT where the ADT can be used to derive other types (that is, it can be polymorphic). In the database system “Oracle9i”, one would create this ADT using the “create type” SQL statement with a “NOT FINAL” clause at the end.
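By way of illustration only, the Address ADT described above can be sketched as a Python data class. The field names, the field order, the `MAX_CHARS` constant and the signed interpretation of the 32-bit zip code are assumptions made for this sketch; they are not taken from FIG. 1A or from any database system.

```python
from dataclasses import dataclass

MAX_CHARS = 80  # example limit for "char" fields, per the text


@dataclass
class Address:
    """Illustrative stand-in for the Address ADT of FIG. 1A."""
    street: str
    city: str
    zip_code: int  # stored as a 32-bit integer in the example
    state: str

    def __post_init__(self):
        # Enforce the example's string-length limit on every char field.
        for name in ("street", "city", "state"):
            if len(getattr(self, name)) > MAX_CHARS:
                raise ValueError(f"{name} exceeds {MAX_CHARS} characters")
        # Enforce the signed 32-bit range assumed here for the zip code.
        if not -2**31 <= self.zip_code < 2**31:
            raise ValueError("zip_code does not fit in 32 bits")
```

For instance, `Address("500 Oracle Parkway", "Redwood Shores", 94065, "CA")` constructs a value, whereas a street name longer than 80 characters is rejected.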
Commands from programmer 101 are first converted into dynamic (run-time) objects called Type Descriptor Objects (TDOs) that are temporarily held in volatile memory and eventually stored persistently in a dictionary of a database. A TDO contains an Attribute Definition in the form of information shown in FIG. 1C, including a pointer to a Type Descriptor Segment (TDS). Each TDO holds metadata about a corresponding ADT that is defined by programmer 101. In the example illustrated in FIGS. 1A and 1B, the reference numeral for a TDO is obtained by adding 10 to the reference numeral that identifies the corresponding command to create the ADT. In this particular example, a command 111 to create the Address ADT in FIG. 1A results in creation of Address TDO 121 in volatile memory 110 of FIG. 1B.
Referring to FIG. 1A, after issuing a command 111 to define the above-described Address ADT, programmer 101 may define one or more additional ADTs that use (i.e. inherit) the Address ADT. For example, FIG. 1A illustrates an ADT called “Person” which is created in response to a command 112 by programmer 101 to hold a character string field for a person's name and the just-described address ADT. Note that use of Address ADT in the definition of person ADT is accepted by computer 100 at this stage because ADT 111 was earlier defined by programmer 101. Note also that, if an ADT that is being processed (also called top-level ADT, i.e. ADT at the lowest depth) has multiple copies of an ADT embedded at different levels, then the corresponding TDO for the top-level ADT contains multiple copies of the same object. Specifically, each TDO of the prior art, as shown in FIG. 1B, holds all metadata required to interpret a data object of that ADT, including metadata for each embedded ADT. In the example, TDO 123 for an Employee ADT contains metadata not only for attributes of Person, but also for attributes of Address (which is an attribute of Person). Similarly, TDO 124 for the Manager ADT contains metadata for Person twice, once for the Person attribute in Manager, and another time for the Person ADT embedded within the Employee attribute in Manager. For this reason, when programmer 101 issues three commands 111, 112 and 113, prior art computer 100 prepares metadata describing the Address ADT three times, once in each of TDOs 121, 122 and 123.
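The duplication described above can be sketched, under stated assumptions, with a small Python model in which the metadata of every embedded ADT is copied in full into the metadata of the enclosing ADT. The `ADTS` catalogue, the attribute names, and the `flatten` and `count_copies` helpers are hypothetical and serve only to mirror the Address, Person, Employee and Manager example of the text; they do not reproduce the actual TDO layout.

```python
# Hypothetical catalogue mirroring the ADTs of FIG. 1A as described in the text.
ADTS = {
    "Address": [("street", "char"), ("city", "char"),
                ("zip", "integer"), ("state", "char")],
    "Person": [("name", "char"), ("address", "Address")],
    "Employee": [("person", "Person")],
    "Manager": [("person", "Person"), ("employee", "Employee")],
}


def flatten(type_name):
    """Return self-contained metadata; embedded ADTs are copied in full."""
    attrs = []
    for attr_name, attr_type in ADTS[type_name]:
        if attr_type in ADTS:
            # Embedded ADT: recursively copy its complete metadata.
            attrs.append((attr_name, attr_type, flatten(attr_type)))
        else:
            # Native type: no nested metadata is needed.
            attrs.append((attr_name, attr_type, None))
    return attrs


def count_copies(metadata, target):
    """Count how many embedded copies of `target` the metadata contains."""
    total = 0
    for _, attr_type, nested in metadata:
        if attr_type == target:
            total += 1
        if nested is not None:
            total += count_copies(nested, target)
    return total
```

In this sketch, `count_copies(flatten("Manager"), "Person")` evaluates to 2, which corresponds to the Person attribute of Manager and the Person ADT embedded within the Employee attribute of Manager.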
TDOs 121-125 held in memory 110 also contain a pointer to a Type Descriptor Segment (“TDS”) that allows one to determine the attributes within the TDO object by use of an opcode. A TDS is a description of metadata for use in interpreting the contents of object data; it has a length/data tuple describing each attribute. Note that an XML document that contains repeated definitions of object types may be parsed for representation in a database, with each XML tag or type represented as a TDO in Oracle. TDSs are described in U.S. Pat. No. 6,128,621 (incorporated by reference herein in its entirety). Specifically, U.S. Pat. No. 6,128,621 describes a “pickler” that receives a TDS as input and prepares a serialized description that can be either transmitted or written to disk. U.S. Pat. No. 6,128,621 states that preferably, the TDO is a table of a database, and each TDS is a record or row of the table. The TDS comprises fields that correspond to attributes of the ADT. Each ADT attribute, if native to computer 100, when described in-line in the TDS can be represented in such a database as a column in the TDO table. As noted above, if an attribute specified in a “create type” command is itself an ADT (such as a UPT that contains native data types or one or more ADTs embedded therein), then that attribute is “flattened” on conversion into a TDS. Note that the above-described ADTs (and hence the corresponding TDOs and TDSs) may support inheritance of the type found in an object-oriented language, such as C++. When inheritance is supported, an ADT in a TDO may inherit properties of a previously defined ADT.
As illustrated for an Address ADT in FIG. 1C, each TDO has several fields, which are described next. A version field in FIG. 1C identifies the version (e.g. 1.0) in which the TDO is created. A schema name in FIG. 1C identifies the user who created the type (e.g. as “user1”). A name of the type (i.e. the name of the ADT) is identified, e.g. as “address”. A version number in user-readable string form is also included in the TDO, for display to humans. A type code, which is the opcode for the ADT, indicates whether the type is an integer, character, varray, nested table, or an ADT. The TDO also includes a TDS pointer, which is the pointer to the corresponding TDS, such as Address TDS 162 (FIG. 1D), that can be used to pickle/unpickle the object. A flags field may indicate, for example, whether inheritance is supported. The TDO has two additional pointers. One pointer is to a two-byte status of data for embedded attributes, which indicates whether or not the data is null; if the data is null, the next pointer need not be used. The other pointer is to a list of TDOs; in this example, the first element in the list is a pointer to the Street TDO, the next element is a pointer to the City TDO, the next element is a pointer to the Zip code TDO, and the last element is a pointer to the State TDO.
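The TDO fields enumerated above may be pictured, purely as a sketch, by the following Python data class. The class name `TDO`, the attribute names, the string type codes and the helper `native_tdo` are assumptions chosen for readability; the actual in-memory layout of FIG. 1C is not reproduced here, and pointers are modeled simply as Python references.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TDO:
    """Illustrative model of the TDO fields described for FIG. 1C."""
    version: str                       # version in which the TDO is created
    schema_name: str                   # user who created the type
    type_name: str                     # name of the ADT
    version_string: str                # user-readable version
    type_code: str                     # opcode: integer, char, varray, ...
    tds: Optional[bytes] = None        # stands in for the TDS pointer
    flags: int = 0                     # e.g. whether inheritance is supported
    null_status: Optional[int] = None  # stands in for the two-byte null status
    attribute_tdos: List["TDO"] = field(default_factory=list)


def native_tdo(name: str, type_code: str) -> TDO:
    """Build a TDO for an attribute whose type is native (hypothetical helper)."""
    return TDO("1.0", "user1", name, "1.0", type_code)


# Address TDO whose attribute list follows the order given in the text.
address_tdo = TDO(
    version="1.0",
    schema_name="user1",
    type_name="address",
    version_string="1.0",
    type_code="ADT",
    null_status=0,
    attribute_tdos=[
        native_tdo("street", "char"),
        native_tdo("city", "char"),
        native_tdo("zip", "integer"),
        native_tdo("state", "char"),
    ],
)
```

Here the `attribute_tdos` list plays the role of the pointer to a list of TDOs, with Street first and State last.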
One type of ADT is a User Picklable Type (UPT). A UPT is a collection of information that allows an application to store, in an image form (e.g. binary), any data type that is not a natively-supported type. Examples of UPTs include non-final ADTs, nested tables and varrays. When an attribute of the ADT is a UPT, in one example, one of four special opcodes is used. The four opcodes indicate whether the attribute is a table to be pickled inline; a table to be pickled out-of-line; a varray to be pickled inline; or a varray to be pickled out-of-line. The opcode to be used for a UPT may be selected or declared by an application program at the time an ADT is declared. When an attribute of the ADT is complex, such as a UPT or a nested ADT that contains a UPT, a Collection opcode is stored in the metadata associated with the ADT. The Collection opcode indicates that a pickler should use collection images in writing the image as described in U.S. Pat. No. 6,128,621.
Note that Oracle 9i supports two flavors of ADTs: inline types and out-of-line types. Both of these are called user picklable types. Inlined types are created without the “NOT FINAL” clause meaning that they cannot be used to derive other types. Out-of-line ones are created using the “NOT FINAL” clause. Collection and nested tables are also user picklable types.
In response to “create type” command 112, a prior art computer 100 generates a Type Descriptor Segment “TDS” 162 for Person ADT as shown in FIG. 1D (note that another TDS 161 was previously generated in response to command 111 and a copy of it is included in TDS 162 as discussed next). As illustrated in FIG. 1D, a Type Descriptor Segment 161 for embedded ADT is provided contiguously within Type Descriptor Segment 162 just before index definition at the end of Type Descriptor Segment 162, with an offset 163 provided in a location 164 that is in-line within TDS 162, specifically at the same location where the attribute would have been present if the attribute were native to computer 100. Offset 163 of a prior art TDS has a value that identifies a location that is within current description 122, typically a location that occurs just before an index definition within current description 122. Such offsets that point to locations within the same TDS that contains the offset, are referred to herein as “internal” offsets. All offsets used in prior art known to the inventors are internal offsets, which was done to ensure that each TDS is self-contained, thereby allowing the TDS to be copied to another computer or saved to disk, without concern that pointers (if used) become invalid on doing so.
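The notion of an internal offset can be sketched with a toy model in Python. The cell encoding used below (tuples tagged `"native"`, `"offset"` and `"index"`), the `ADTS` catalogue and the function names are all hypothetical; the sketch does not reproduce the actual TDS byte format of U.S. Pat. No. 6,128,621. It only shows an embedded TDS being copied contiguously into the enclosing TDS, ahead of the index definition at the end, with an in-line offset that points within the same TDS.

```python
# Hypothetical catalogue mirroring the ADTs named in the text.
ADTS = {
    "Address": [("street", "char"), ("city", "char"),
                ("zip", "integer"), ("state", "char")],
    "Person": [("name", "char"), ("address", "Address")],
    "Employee": [("person", "Person")],
}


def build_tds(type_name):
    """Build a toy, self-contained TDS as a list of cells."""
    attrs = ADTS[type_name]
    inline = []    # one cell per attribute, in declaration order
    embedded = []  # full copies of embedded TDSs, placed after the inline cells
    for _, attr_type in attrs:
        if attr_type in ADTS:
            # The offset is relative to the start of the TDS that holds it
            # and points at the first cell of the embedded copy.
            inline.append(("offset", len(attrs) + len(embedded)))
            embedded.extend(build_tds(attr_type))
        else:
            inline.append(("native", attr_type))
    # The index definition closes the TDS, after any embedded copies.
    return inline + embedded + [("index", type_name)]


def offsets_are_internal(tds):
    """Check that every offset cell points to a location inside this TDS."""
    return all(0 <= cell[1] < len(tds) for cell in tds if cell[0] == "offset")
```

In this toy model the Person TDS carries, in the slot for its address attribute, an offset to a complete copy of the Address TDS held inside the Person TDS itself, so the Person TDS can be copied or saved without reference to anything outside it.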
When programmer 101 defines additional ADTs such as an ADT 113 called “Employee” that uses person ADT 112, the TDS that is automatically generated by computer 100 for storage in the database ORACLE is shown in FIG. 1E. Although ADT 113 is illustrated in FIG. 1A as being specified by programmer 101 to have only one attribute (i.e. the person ADT), the programmer may easily specify additional attributes such as Employee Identifier and/or Employee Salary (both of which may be integers).
In the example of FIG. 1A, programmer 101 also issues a command 114 for an ADT called “Manager” that uses person ADT twice, once to hold information about an individual who is a manager himself (or herself) and another time to hold information about employees that report to this manager. A TDS shown in FIG. 1F is automatically generated, e.g. by a type manager in the database system, from ADT create type command 114 (shown in FIG. 1A). Moreover, programmer 101 may define further ADTs such as ADT 115 called “CEO” that uses person ADT 112 three times as follows: once to hold information about the CEO himself (or herself) and two times as noted above to hold information about managers that report to the CEO. The TDS that is automatically generated from ADT 115 is shown in FIG. 1G.
The inventors (of this current patent application) note that there is redundancy in a prior art TDS that is generated by the above-described prior art method(s), as follows. The description of “Address UPT” is repeatedly embedded at three different levels, as shown in italics in FIG. 1G. The inventors note that the size of a TDS of an ADT that contains embedded ADTs can be reduced, if multiple embedded TDSs that are redundantly present at the different levels are eliminated, as discussed next.
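The redundancy noted above can be illustrated with a short Python sketch that lists the nesting depth of each embedded occurrence of a given ADT. The `ADTS` catalogue and the function `embedded_depths` are hypothetical. The attribute lists for Manager and CEO are assumptions inferred from the description of FIG. 1A (a Person attribute plus an Employee attribute for Manager, and a Person attribute plus a Manager attribute for CEO), and a direct attribute of the top-level ADT is counted as depth 1.

```python
# Hypothetical catalogue; the Manager and CEO attributes are inferred from the text.
ADTS = {
    "Address": [("street", "char"), ("city", "char"),
                ("zip", "integer"), ("state", "char")],
    "Person": [("name", "char"), ("address", "Address")],
    "Employee": [("person", "Person")],
    "Manager": [("person", "Person"), ("employee", "Employee")],
    "CEO": [("person", "Person"), ("manager", "Manager")],
}


def embedded_depths(type_name, target, depth=1):
    """Return the depth of every embedded occurrence of `target`."""
    depths = []
    for _, attr_type in ADTS[type_name]:
        if attr_type == target:
            depths.append(depth)
        if attr_type in ADTS:
            # Descend into the embedded ADT, one level deeper.
            depths.extend(embedded_depths(attr_type, target, depth + 1))
    return depths
```

Under these assumptions, `embedded_depths("CEO", "Address")` returns `[2, 3, 4]`, that is, three copies of the Address description at three different levels, and `embedded_depths("CEO", "Person")` returns three entries, matching the three uses of the Person ADT.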