1. Field of the Invention
The present invention is related to the storage of data within database systems. More particularly, the present invention is directed to the storage and access of object-oriented entities within a relational database management system.
2. Background
Many computer programming languages and applications utilize object-oriented structures to model real world information. Object-oriented languages and applications access and store data in the form of entities such as objects and attributes. For example, many conventional applications used for querying and maintaining directory information systems are modeled using aspects of object-oriented techniques and entities. Directory information systems provide a framework for the storage and retrieval of information that is used to identify and locate the details of individuals and organizations, such as telephone numbers, postal addresses, and email addresses.
One common type of object-oriented directory system is a directory based on the Lightweight Directory Access Protocol (“LDAP”). LDAP is a directory protocol that was developed at the University of Michigan, originally as a front end to access directory systems organized under the X.500 standard for open electronic directories (which was originally promulgated by the Comité Consultatif International Téléphonique et Télégraphique, “CCITT,” in 1988). Standalone LDAP server implementations are now commonly available to store and maintain directory information. Further details of the LDAP directory protocol can be located at the LDAP-devoted website maintained by the University of Michigan at http://www.umich.edu/~dirsvcs/ldap/, including the following documents (which are hereby incorporated by reference): RFC-1777 Lightweight Directory Access Protocol; RFC-1558 A String Representation of LDAP Search Filters; RFC-1778 The String Representation of Standard Attribute Syntaxes; RFC-1779 A String Representation of Distinguished Names; RFC-1798 Connectionless LDAP; RFC-1823 The LDAP Application Program Interface; and RFC-1959 An LDAP URL Format.
LDAP directory systems are normally organized in a hierarchical structure having entries (i.e., objects) organized in the form of a tree, which is referred to as a directory information tree (“DIT”). The DIT is often organized to reflect political, geographic, or organizational boundaries. A unique name or ID (which is commonly called a “distinguished name”) identifies each LDAP entry in the DIT. An LDAP entry is a collection of one or more entry attributes. Each entry attribute has a “type” and one or more “values.” Each entry belongs to one or more object classes. Entries that are members of the same object class share a common composition of possible entry attribute types.
FIG. 1 shows an example of a hierarchical tree of directory entities. Entry 96 is the topmost level of DIT 20 and is of object class “organization” having an attribute type “Org. Name” with an attribute value of “Oracle”. Entry 96 is the “parent” entry for three “child” entries (97, 98, and 99) directly beneath it in DIT 20. Entries 97, 98, and 99 are objects of object class “Department”, each having attributes “Dept. Name” and “State.” Entry 97 has an attribute type “Dept. Name” having a value of “Administration” and an attribute type “State” with the value “CA”. Entry 98 has an attribute type “Dept. Name” with the value “Sales” and an attribute type “State” with an attribute value “NY”. Entry 99 has an attribute type “Dept. Name” with an attribute value “RandD” and an attribute type “State” with a value of “CA”.
Entry 103 is a child entry of entry 97. Entry 103 represents an object of class “Person” having the following attribute type-value pairs: (1) attribute type “Last Name” with a value of “Founder”; (2) attribute type “First Name” with a value of “Larry”; (3) attribute type “Tel. No.” with a value of “555-4444”; and (4) attribute type “State” with a value of “CA”.
Entry 102 is a child entry of entry 98. Entry 102 represents an object of class “Person” having the following attribute type-value pairs: (1) attribute type “Last Name” with a value of “Jones”; (2) attribute type “First Name” with a value of “Joe”; (3) attribute type “Tel. No.” with a value of “555-3333”; (4) attribute type “Manager” having the value of “Jim Smith”; and (5) attribute type “State” having the value “CA”. Note that entries 102 and 103 are both members of object class Person, but entry 102 has more listed object attributes than entry 103. In many object-oriented systems, objects that are members of the same object class may share a common set of possible object attributes, but some members of the class may not necessarily have values for some of the possible attributes. In this example, entry 103 does not have a value for attribute type “Manager” while entry 102 does have a value for this attribute.
Entries 100 and 101 are child entries of entry 99. Entries 100 and 101 are both members of object class “Person.” Entry 100 is defined by the following attribute type-value pairs: (1) attribute type “Last Name” with a value of “Doe”; (2) attribute type “First Name” with a value of “John”; (3) attribute type “Tel. No.” with a value of “555-1111”; (4) attribute type “Manager” having the value of “Larry Founder”; and (5) attribute type “State” having the value “CA”. Entry 101 is defined by the following attribute type-value pairs: (1) attribute type “Last Name” with a value of “Smith”; (2) attribute type “First Name” with a value of “Jim”; (3) attribute type “Tel. No.” with a value of “555-2222”; (4) attribute type “Manager” having the value of “John Doe”; and (5) attribute type “State” having the value “NY”.
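The tree of FIG. 1 can be pictured as a nested data structure. The following Python sketch is illustrative only: the `Entry` class, the helper functions, and the use of the reference numerals as entry identifiers are assumptions made for this example and are not part of any LDAP API. Each attribute type maps to a list of values, reflecting that an attribute may carry one or more values.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical in-memory form of one directory entry."""
    entry_id: int                      # reference numeral from FIG. 1
    object_class: str
    attributes: dict[str, list[str]]   # attribute type -> one or more values
    children: list["Entry"] = field(default_factory=list)


def person(entry_id, last, first, tel, state, manager=None):
    """Build a Person entry; the Manager attribute is simply absent if unset."""
    attrs = {"Last Name": [last], "First Name": [first], "Tel. No.": [tel]}
    if manager is not None:
        attrs["Manager"] = [manager]
    attrs["State"] = [state]
    return Entry(entry_id, "Person", attrs)


def department(entry_id, name, state, children):
    """Build a Department entry with its child entries."""
    return Entry(entry_id, "Department",
                 {"Dept. Name": [name], "State": [state]}, children)


# DIT 20 as described for FIG. 1.
dit = Entry(96, "Organization", {"Org. Name": ["Oracle"]}, [
    department(97, "Administration", "CA", [
        person(103, "Founder", "Larry", "555-4444", "CA"),
    ]),
    department(98, "Sales", "NY", [
        person(102, "Jones", "Joe", "555-3333", "CA", "Jim Smith"),
    ]),
    department(99, "RandD", "CA", [
        person(100, "Doe", "John", "555-1111", "CA", "Larry Founder"),
        person(101, "Smith", "Jim", "555-2222", "NY", "John Doe"),
    ]),
])


def walk(entry):
    """Yield an entry and all of its descendants, parents first."""
    yield entry
    for child in entry.children:
        yield from walk(child)


people = [e for e in walk(dit) if e.object_class == "Person"]
without_manager = [e.entry_id for e in people if "Manager" not in e.attributes]
print(without_manager)  # [103]
```

In this form, an entry that lacks a value for an attribute type simply has no key for it, which is the property that the fixed-column relational layout discussed next does not share.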
One significant issue that has been faced by organizations seeking to develop an LDAP system is the selection of the type and configuration of a database system used to store the object-oriented LDAP data. A particularly desirable choice for many database configurations is to utilize a relational database management system (“RDBMS”). The relational database model provides many benefits when implementing a database application. For example, the relational database model has well-defined structures and entities (e.g., tables, views, indexes, etc.) to store or access the data of a database. RDBMS systems provide advanced database transaction, data consistency, recovery, and backup support. RDBMS systems also provide for clearly defined actions and operations to manipulate the data and structures of the database. Moreover, many RDBMS applications are designed to interoperate with standard database query languages (e.g., SQL) to access and modify data on the system.
The difficulty with implementing object-oriented applications, such as LDAP directory systems, in an RDBMS is that object-oriented data are based upon a fundamentally different data model than relational data. Object-oriented data are formed as entities which have specific object-oriented characteristics (e.g., objects and attributes). In contrast, the data in a relational database model are normally stored in database tables that are organized as an array of rows and columns. The values in the columns of a given row are typically associated with each other in some way. For example, a row may store a complete data record relating to a sales transaction, a person, or a project. Columns of the table define discrete portions of the rows that have the same general data format or data type. Thus, there are significant differences in structure between object-oriented data and relational data.
FIGS. 2A, 2B, and 2C depict one approach to storing object-oriented data, such as the entries from DIT 20 of FIG. 1, into an RDBMS. In this approach, a separate table is provided for each object class in the system. FIG. 2A shows an object class table 202 for the Organization class, which includes entry 96 from DIT 20 as a member of that class. FIG. 2B is an example of an object class table 204 for the object class Department, which includes entries 97, 98, and 99. FIG. 2C is an example of an object class table 206 for the object class Person, which includes entries 100, 101, 102, and 103 from DIT 20.
Each row of the object class table represents a single object of that corresponding object class. Thus, the Person class table 206 of FIG. 2C includes four rows, one row for each of the person class entries of DIT 20 (i.e., entries 100, 101, 102, and 103). Discrete columns within the object class table represent attributes of an object within the object class. A separate column must be provided for each possible attribute of an object class. The Person class table 206 of FIG. 2C includes five columns for object attributes “Last Name,” “First Name,” “Tel. No.,” “Manager,” and “State.” Similar rows and columns in FIGS. 2A and 2B describe the objects and attributes for the Department and Organization objects of DIT 20. Thus, the approach illustrated in FIGS. 2A, 2B, and 2C can be employed to represent object-oriented data in relational tables.
Referring to FIG. 2C, note that row 208 contains an empty space in the “Manager” column. This highlights one of the drawbacks of this approach. It is possible that some members of an object class may not have values for all possible attributes for that class. Entry 103 does not have a value for the “Manager” attribute, even though other members of the Person class possess a value for that attribute type. The problem is that in the approach illustrated by FIGS. 2A-C, a column must be defined for each of the possible attributes of an object class. For each row in the table, resources may be set aside to allow values for all of the defined columns, even if some rows do not actually have values for one or more of the columns. Under this approach, system resources are wasted if any members of the class do not have a value for all defined attributes for the object class. This problem is exacerbated in very large object class tables having a large number of members that do not have values for particular columns.
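The one-table-per-class layout of FIG. 2C, including the empty “Manager” cell, can be reproduced in a few lines. The sketch below uses Python's built-in sqlite3 module purely as a convenient stand-in for an RDBMS. The table name, the column names, and the `entry_id` key column are assumptions made for illustration, since the text does not say how rows of table 206 are keyed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One column per possible attribute of the Person class.
conn.execute("""
    CREATE TABLE person (
        entry_id   INTEGER PRIMARY KEY,  -- hypothetical key, not named in the text
        last_name  TEXT,
        first_name TEXT,
        tel_no     TEXT,
        manager    TEXT,
        state      TEXT
    )
""")

# One row per Person entry of DIT 20; entry 103 has no Manager value.
rows = [
    (100, "Doe", "John", "555-1111", "Larry Founder", "CA"),
    (101, "Smith", "Jim", "555-2222", "John Doe", "NY"),
    (102, "Jones", "Joe", "555-3333", "Jim Smith", "CA"),
    (103, "Founder", "Larry", "555-4444", None, "CA"),
]
conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?, ?, ?)", rows)

# The column exists for every row, whether or not the row has a value for it.
empty_manager_ids = [r[0] for r in conn.execute(
    "SELECT entry_id FROM person WHERE manager IS NULL ORDER BY entry_id")]
print(empty_manager_ids)  # [103]
```

How much storage an empty cell actually consumes depends on the particular database engine, so the sketch shows only the structural point: the column is defined for all rows of the class.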
Another drawback to this approach is that object class tables are not readily extensible, since the database schema itself has to be modified to allow changes to the definition of an object class. Such a change in definition occurs, for example, if an object attribute is being added to or deleted from an object class. For example, consider when object class Person (represented by object class table 206 in FIG. 2C) is to be modified to include a new object attribute type called “Email Address.” To implement this modification to the Person object class, the defining schema structure of the corresponding object class table must be modified to include a new column for the new attribute type. FIG. 3 depicts a revised Person class table 302 that includes a column for the new attribute type “Email Address.” In operation, this modification typically involves the issuance of numerous data definition language (“DDL”) statements to modify the base schema of the database.
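The schema change described above can be illustrated as follows. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in RDBMS, with assumed table and column names. The single `ALTER TABLE` statement shown here stands in for whatever set of DDL statements a production schema change would require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "last_name TEXT, first_name TEXT, tel_no TEXT, manager TEXT, state TEXT)"
)
conn.execute(
    "INSERT INTO person VALUES ('Doe', 'John', '555-1111', 'Larry Founder', 'CA')"
)


def column_names(connection, table):
    """Return the column names of a table from SQLite's table_info pragma."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


columns_before = column_names(conn, "person")

# Adding the "Email Address" attribute type means altering the table definition.
conn.execute("ALTER TABLE person ADD COLUMN email_address TEXT")

columns_after = column_names(conn, "person")
existing_row = conn.execute("SELECT * FROM person").fetchone()

print(columns_after)
print(existing_row)  # the pre-existing row gains an empty email_address cell
```

The attribute is added by changing the table's structure rather than its data, which is why the operation is typically restricted to users with schema-level privileges.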
Generally, modifying the database schema is not a trivial task, and is performed only by administrators having specialized privileges to access and modify the metadata and structural definitions of the system. Moreover, adding columns to an existing relational database table could result in database fragmentation. This occurs because the data for the new column may not be co-located with the existing table data on a disk drive. Thus, performance suffers because two disk locations are accessed to retrieve a single row from the database table. In addition, the method described with reference to FIGS. 2A-C suffers drawbacks when storing object types that have multiple attribute values for an attribute type, since only a single column is provided for each attribute type in an object class table.
Therefore, there is a need for an improved method and system for storing and maintaining object-oriented data in an RDBMS. In addition, there is a particular need for an improved system and method of storing and maintaining directory information objects, such as LDAP data, in an RDBMS.
A method and system for representing object-oriented data in a relational database is disclosed. An aspect of the invention is directed to the representation and storage of directory information objects, such as LDAP directory data, in a relational database system.
An aspect of the invention is directed to the generation of a database query language statement to query or manipulate directory information objects in a relational database. A feature of this aspect of the invention is the generation of a SQL statement for an LDAP search filter. Another aspect of the invention is directed to the hybrid use of Join operations with other types of aggregation operations in the generated SQL.
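To give a concrete sense of what generating a SQL statement for an LDAP search filter can involve, the following Python sketch translates a small subset of the RFC-1558 filter syntax (equality matches combined with `&`, `|`, and `!`) into a parameterized query. It is not the method of the invention, which is described in the detailed description. The `attr_store` table with one row per attribute value, the function names, and the use of the SQL set operators INTERSECT, UNION, and EXCEPT are all assumptions chosen to keep the example short.

```python
import sqlite3


def parse_filter(text):
    """Parse a filter such as (&(a=b)(!(c=d))) into nested tuples."""
    node, pos = _parse(text, 0)
    if pos != len(text):
        raise ValueError("trailing characters after filter")
    return node


def _parse(text, pos):
    if pos >= len(text) or text[pos] != "(":
        raise ValueError(f"expected '(' at position {pos}")
    pos += 1
    op = text[pos] if pos < len(text) else ""
    if op in ("&", "|"):
        pos += 1
        children = []
        while pos < len(text) and text[pos] == "(":
            child, pos = _parse(text, pos)
            children.append(child)
        if not children:
            raise ValueError("'&' and '|' need at least one sub-filter")
        node = (op, children)
    elif op == "!":
        child, pos = _parse(text, pos + 1)
        node = ("!", child)
    else:
        end = text.index(")", pos)  # raises ValueError if ')' is missing
        attr, sep, value = text[pos:end].partition("=")
        if not sep or not attr:
            raise ValueError("expected attr=value")
        node = ("=", attr, value)
        pos = end
    if pos >= len(text) or text[pos] != ")":
        raise ValueError(f"expected ')' at position {pos}")
    return node, pos + 1


def to_sql(node, params):
    """Return SQL selecting matching entry_ids; bind values are appended to params."""
    kind = node[0]
    if kind == "=":
        params.extend([node[1], node[2]])
        return ("SELECT entry_id FROM attr_store "
                "WHERE attr_type = ? AND attr_value = ?")
    if kind == "!":
        # Negation: every known entry except those matching the sub-filter.
        return ("SELECT entry_id FROM attr_store EXCEPT "
                f"SELECT entry_id FROM ({to_sql(node[1], params)})")
    set_op = " INTERSECT " if kind == "&" else " UNION "
    return set_op.join(
        f"SELECT entry_id FROM ({to_sql(child, params)})" for child in node[1])


# Hypothetical attribute table holding State and Manager for entries 100-103.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attr_store (entry_id INTEGER, attr_type TEXT, attr_value TEXT)")
conn.executemany("INSERT INTO attr_store VALUES (?, ?, ?)", [
    (100, "State", "CA"), (100, "Manager", "Larry Founder"),
    (101, "State", "NY"), (101, "Manager", "John Doe"),
    (102, "State", "CA"), (102, "Manager", "Jim Smith"),
    (103, "State", "CA"),
])

params = []
sql = to_sql(parse_filter("(&(State=CA)(!(Manager=Jim Smith)))"), params)
matches = sorted(row[0] for row in conn.execute(sql, params))
print(matches)  # [100, 103]
```

Values are passed as bind parameters rather than spliced into the statement text, so a filter value containing quote characters cannot alter the generated SQL. Substring, presence, and ordering matches, which the filter syntax also allows, are omitted from this sketch.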
Further details of aspects, objects, and advantages of the invention are described below in the detailed description, drawings, and claims. Both the foregoing general description and the following detailed description are exemplary and explanatory in nature, and serve to explain the principles of the invention.