1. Field of the Invention
This invention relates to mapping of data to objects in an object-oriented environment.
2. Background Art
In a database management system (DBMS), data is stored in rows of tables. Each row contains one or more fields or columns. Each column contains an item of data. For example, an employee table contains rows of employee records. Each row, or record, contains information regarding an employee. An employee record can contain, for example, a last name column that contains a last name of the employee.
Data stored in a column of a table can form the basis of a relationship with another table in the database that has a related column. A relationship can also be formed using more than one column per table. Using a relationship between columns of two tables, it is possible to join these two tables to provide a single table containing instances of rows from one table combined with related rows from the other table.
Data from two or more tables can also be joined using another capability provided in a DBMS known as a view. A view provides the ability to create a virtual table. That is, the table created using a view is not considered an actual table. Therefore, some DBMS operations, such as update, cannot be performed on a view.
Like a joined table, a view contains rows from one or more tables in the database. For example, a view can contain the rows from two tables in the database, an employee and department table. Such a view may include all or some subset of the total number of columns contained in each of these tables. For example, the employee table contains “employee identification”, “department identification”, “last name”, “first name”, “street address”, “city”, and “zip code” columns. The department table contains “department identification”, “description”, “number of employees”, and “budget” columns. All of the information contained in these two tables may not be pertinent or required to allow a user to be able to review employee information. For example, a department's budget figures are not pertinent to such a system. A view can be used to define a virtual table comprised of the columns in the employee table and the employee's department description from the department table. The “department identification” columns from the two tables can be used to join rows from the two tables to form the view.
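By way of illustration only, the employee and department view described above can be sketched with Python's standard sqlite3 module. The table names, column names, and sample rows are assumptions chosen for this sketch and do not come from any particular DBMS. The final statement attempts an update through the view; SQLite rejects it, which is consistent with the read-only behavior described above, although other DBMS products may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_id       INTEGER PRIMARY KEY,
    description   TEXT,
    num_employees INTEGER,
    budget        REAL
);
CREATE TABLE employee (
    emp_id         INTEGER PRIMARY KEY,
    dept_id        INTEGER REFERENCES department(dept_id),
    last_name      TEXT,
    first_name     TEXT,
    street_address TEXT,
    city           TEXT,
    zip_code       TEXT
);
INSERT INTO department VALUES (10, 'Engineering', 1, 500000.0);
INSERT INTO employee VALUES (1, 10, 'Smith', 'Ann', '1 Main St', 'Springfield', '00000');

-- The view exposes every employee column plus the department description;
-- the budget column is left out.
CREATE VIEW employee_info AS
SELECT e.emp_id, e.dept_id, e.last_name, e.first_name,
       e.street_address, e.city, e.zip_code,
       d.description AS dept_description
FROM employee AS e
JOIN department AS d ON e.dept_id = d.dept_id;
""")

# Read the virtual table exactly as if it were an ordinary table.
cursor = conn.execute("SELECT * FROM employee_info")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()

# Try to update through the view; SQLite refuses to modify a view.
try:
    conn.execute("UPDATE employee_info SET city = 'Shelbyville' WHERE emp_id = 1")
    view_is_read_only = False
except sqlite3.Error:
    view_is_read_only = True
```

Here the two “department identification” columns (dept_id) supply the join condition, and a user of employee_info never sees the budget column.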
Views are useful to simplify the database schema by creating subsets of the database for use with particular applications. Further, views can be used to provide security. In the above example, the exclusion of the “budget” column from the view limits accessibility or knowledge that such a column exists. Thus, a user is only made aware of the data that the user is authorized to access. One disadvantage of views is that they are read-only. Therefore, a view cannot be used to update the base tables that actually contain the data used to create a view.
Another disadvantage of views is that a DBMS restricts the operations that are required to create a view. That is, only someone with database administrator (DBA) privileges can create the virtual tables needed to map objects to the tables of the DBMS. Therefore, to develop an application including views, it is necessary to have someone with DBA privileges available throughout the development phase to make changes to existing views and create new views. Once an application that includes views is distributed to a user site, it is necessary to install the application at the user site. To install the application at the user site, someone with DBA privileges must create the views that are required by the application.
Applications are developed to provide a user with the ability to facilitate access and manipulation of the data contained in a DBMS. A DBMS includes a Data Manipulation Language (DML) such as Structured Query Language (SQL). A DML provides set-oriented relational operations for manipulating data in the DBMS. However, a DML requires a precise syntax that must be used to access and manipulate DBMS data. To use a DML, a user must understand and use the DML's syntax. Instead of requiring each user that wishes to modify a DBMS's data to learn the DML's syntax, applications are written that provide an interface between the user and a DBMS's DML.
Therefore, applications are developed that provide a user interface that allows a user to specify operations to be performed on DBMS data in a more user-friendly manner. These applications are written in a programming language such as C, Objective-C, or Smalltalk. SQL, or another database programming language, is embedded in these general-purpose programming languages. Once a user identifies a data operation, the application uses embedded SQL statements to perform the operations on the DBMS data as directed by the user.
Some general-purpose programming languages, such as Objective-C and Smalltalk, are referred to as object-oriented programming languages. Object-oriented programming languages define data and the operations that can be performed on the data. Such a definition is referred to as an object. To use data stored in a DBMS in an application written in an object-oriented language, it is necessary to read data stored in the DBMS as columns within rows of a table into objects. Conversely, object data must be read from the object and stored in tables in the DBMS.
A mapping must be performed to determine what DBMS data is stored in what object, or conversely, what object data is stored in what DBMS tables. There are several disadvantages with the current object-oriented systems' techniques for mapping DBMS data to objects. First, data-to-object mapping must be represented in the program code of an application. That is, an application developer must be aware of the DBMS structure or schema and how the schema is to be mapped onto the application's objects to develop an application. Further, an application must include code to define the mapping. Therefore, the DBMS-to-object mapping is not transparent to the user (e.g., the application developer). Further, the program code needed to implement this mapping increases the size and complexity of the application. The increased coding results in an increase in the amount of the effort needed to debug and maintain the program code. Further, the DBMS-to-object mapping is not dynamic. When a change is made to the DBMS schema, the application must be re-coded to reflect the schema change.
Another disadvantage relates to the restrictions that are placed on the DBMS schema and/or DBMS-to-object mapping that can be supported by the current object-oriented systems. Using current systems, there must be a one-to-one correspondence between an object and a table in the DBMS. Therefore, the schema chosen for the DBMS data is restricted by the object definitions, or vice versa.
Further, because there must be a one-to-one correspondence between a table and object, it is not possible to map multiple tables to a single object. Thus, in the example described above, it is not possible to map the columns included in the virtual table (i.e., columns from the employee table plus the employee's department description from the department table) to the properties of a single object.
The present invention creates a model that is used to transparently map object classes in an object-oriented environment to a data source. The model maps the relationship between properties of each object class and data of the data source. For example, the model provides a mapping of the relationship between properties of each object class and columns of DBMS tables. Other data sources that can be used with the present invention include a user interface, a file system, and an object-oriented database, for example.
Prior to model generation, an application's object classes and DBMS schema (when a DBMS is used as the data source) are designed. Each can be designed independently of the other since the model can be used to map one to the other. Thus, for example, a model can be used to map the object classes of an existing application to a new DBMS schema, or vice versa.
An object class definition includes properties and behavior. Properties are the data that is manipulated by the methods (behavior) of the object class. A DBMS schema specifies tables and the columns of the tables, for example. The DBMS schema specifies columns from the tables that can be used for join operations specified using a DBMS data manipulation language such as SQL.
A model is defined that maps the object classes to the DBMS schema. The mapping is performed transparently such that the object classes and DBMS schema are not necessarily aware of the mapping. For example, there is no need to implement a class to mirror or accommodate the data source's structure. Similarly, there is no need to design a data source structure based on object classes.
The model is comprised of entities, attributes and relationships. An entity represents the primary structure of the model. An entity maps to an object class and to one or more tables of the DBMS. An entity contains attributes and relationships. An attribute can be simple or derived. A simple attribute maps to a column of the DBMS. A derived attribute does not directly map to a column of the DBMS. A derived attribute can be, for example, a combination of simple attributes operated upon using a mathematical operation. Simple and derived attributes map to properties of an object class.
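As a minimal sketch of the entity and attribute structure just described, the following Python fragment models an entity that holds simple and derived attributes and uses them to populate an object. All class names, the MONTHLY_SALARY column, and the annual salary computation are hypothetical choices made for illustration; they are not the invention's actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class SimpleAttribute:
    name: str    # property name on the object class
    column: str  # DBMS column the property maps to


@dataclass
class DerivedAttribute:
    name: str                       # property name on the object class
    sources: List[str]              # names of the simple attributes it is built from
    compute: Callable[..., object]  # operation applied to the source values


@dataclass
class Entity:
    name: str
    object_class: type
    tables: List[str]
    attributes: Dict[str, Union[SimpleAttribute, DerivedAttribute]] = field(default_factory=dict)

    def add(self, attribute):
        self.attributes[attribute.name] = attribute
        return self

    def instantiate(self, row: Dict[str, object]):
        """Build an object from one row, given as a column -> value mapping."""
        props = {}
        # Simple attributes are copied straight from their columns.
        for attr in self.attributes.values():
            if isinstance(attr, SimpleAttribute):
                props[attr.name] = row[attr.column]
        # Derived attributes are computed from already-resolved simple attributes.
        for attr in self.attributes.values():
            if isinstance(attr, DerivedAttribute):
                props[attr.name] = attr.compute(*(props[s] for s in attr.sources))
        return self.object_class(**props)


@dataclass
class Employee:  # hypothetical object class, designed without reference to the schema
    last_name: str
    monthly_salary: int
    annual_salary: int


employee_entity = (
    Entity("Employee", Employee, tables=["EMPLOYEE"])
    .add(SimpleAttribute("last_name", "LAST_NAME"))
    .add(SimpleAttribute("monthly_salary", "MONTHLY_SALARY"))
    .add(DerivedAttribute("annual_salary", ["monthly_salary"], lambda m: m * 12))
)

employee = employee_entity.instantiate({"LAST_NAME": "Smith", "MONTHLY_SALARY": 1000})
```

The Employee class contains no reference to table or column names; the correspondence lives entirely in the entity definition.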
Relationships can be defined in the model. A relationship creates a link between at least two entities of the model. A relationship can be used to flatten an attribute or flatten a relationship. A flattened attribute is an attribute of one entity that is added to another entity. A flattened relationship is created by the elimination of an intermediate relationship between two other entities. For example, a first relationship exists between a first and second entity. A second relationship exists between the second entity and a third entity. The first and third entities are related to each other by virtue of their relationship with the second entity. A flattened relationship can be created between the first and third entities by eliminating the first and second relationships.
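The flattening described above can be sketched as a path of relationships that is traversed on behalf of the first entity. In the Python fragment below, the division entity, the in-memory rows, and all names are assumptions introduced for the example; a DBMS-backed data source would resolve the same path with join operations instead.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Row = Dict[str, object]


@dataclass(frozen=True)
class Relationship:
    name: str
    source: str
    destination: str
    source_key: str
    destination_key: str


@dataclass(frozen=True)
class FlattenedRelationship:
    name: str
    path: Tuple[Relationship, ...]  # intermediate relationships, in traversal order

    @property
    def source(self) -> str:
        return self.path[0].source

    @property
    def destination(self) -> str:
        return self.path[-1].destination


def follow(data: Dict[str, List[Row]], rows: List[Row], rel: Relationship) -> List[Row]:
    """Return destination rows whose key matches a source key in `rows`."""
    wanted = {row[rel.source_key] for row in rows}
    return [dest for dest in data[rel.destination] if dest[rel.destination_key] in wanted]


def resolve(data, row: Row, rel) -> List[Row]:
    """Traverse a plain or flattened relationship starting from one source row."""
    steps = rel.path if isinstance(rel, FlattenedRelationship) else (rel,)
    rows = [row]
    for step in steps:
        rows = follow(data, rows, step)
    return rows


# Hypothetical data: employee -> department -> division.
data = {
    "employee": [{"emp_id": 1, "dept_id": 10}],
    "department": [{"dept_id": 10, "div_id": 7, "description": "Engineering"}],
    "division": [{"div_id": 7, "name": "R&D"}],
}
to_department = Relationship("toDepartment", "employee", "department", "dept_id", "dept_id")
dept_to_division = Relationship("toDivision", "department", "division", "div_id", "div_id")

# Flattened relationship: employee -> division, hiding the department step.
to_division = FlattenedRelationship("toDivision", (to_department, dept_to_division))
divisions = resolve(data, data["employee"][0], to_division)

# Flattened attribute: the department description exposed on the employee entity.
department_description = resolve(data, data["employee"][0], to_department)[0]["description"]
```

From the point of view of the employee entity, to_division behaves like a direct relationship even though the second entity is still consulted during traversal.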
A relationship creates a path that is traversed to resolve the relationship. Neither the object classes nor the data source need to be aware of the traversal path. The path is traversed as needed during model definition and at runtime. During model definition, the path is traversed to resolve relationships to flatten attributes and relationships. During runtime, the path is traversed to resolve relationships to instantiate objects and synchronize objects and the DBMS.
For example, during runtime, a relationship is used to identify a join operation that must be performed in the DBMS. The relationship and the entity definitions are used to generate an SQL statement that joins the necessary tables using the tables' join columns. The result of the join is a virtual table (i.e., a subset of the tables involved in the join). Data can be extracted from virtual tables to instantiate objects and to update the actual table data using the relationship definitions defined in the model.
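One possible way to generate such a join statement from model definitions is sketched below using Python and its standard sqlite3 module. The dictionary layout, the build_join_sql helper, and the sample schema are assumptions made for this sketch. Table and column names are interpolated directly into the SQL text because they are taken from the trusted model, not from user input.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Employee:  # hypothetical object class whose properties span two tables
    last_name: str
    department_description: str


# Entity definition: object property -> (table, column).
EMPLOYEE_ATTRIBUTES = {
    "last_name": ("employee", "last_name"),
    "department_description": ("department", "description"),
}

# Relationship definition: which tables to join and on which columns.
TO_DEPARTMENT = {
    "source": "employee", "source_key": "dept_id",
    "destination": "department", "destination_key": "dept_id",
}


def build_join_sql(attributes, rel):
    """Generate a SELECT that joins the two tables on the relationship keys."""
    select_list = ", ".join(
        f"{table}.{column} AS {prop}" for prop, (table, column) in attributes.items()
    )
    return (
        f"SELECT {select_list} FROM {rel['source']} "
        f"JOIN {rel['destination']} "
        f"ON {rel['source']}.{rel['source_key']} = "
        f"{rel['destination']}.{rel['destination_key']}"
    )


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be read by column alias
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, description TEXT, budget REAL);
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, dept_id INTEGER, last_name TEXT);
INSERT INTO department VALUES (10, 'Engineering', 500000.0), (20, 'Sales', 250000.0);
INSERT INTO employee VALUES (1, 10, 'Smith'), (2, 20, 'Jones');
""")

sql = build_join_sql(EMPLOYEE_ATTRIBUTES, TO_DEPARTMENT)

# Each joined row populates the properties of a single object.
employees = [
    Employee(**dict(row)) for row in conn.execute(sql + " ORDER BY employee.emp_id")
]
```

Because the column aliases equal the property names, each row of the join result can be handed to the object class without any mapping code in the class itself.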
Relationships are unidirectional. A relationship's direction is used to resolve the relationship. A unidirectional relationship has a single traversal path that has a source entity and a destination entity. Relationship keys from the source and destination entities (known as the source key and the destination key, respectively) are used to traverse the path. The source entity and the join criteria determine which records are selected from the destination entity, based on the source and destination attributes.
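The traversal of a unidirectional relationship can be illustrated as a selection on the destination entity that is driven by the source key value. The Python sketch below uses the standard sqlite3 module; the Relationship class, the fetch_destination helper, and the sample tables are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    source: str
    destination: str
    source_key: str
    destination_key: str


def fetch_destination(conn, rel, source_row):
    """Select destination records whose destination key equals the source key value."""
    sql = f"SELECT * FROM {rel.destination} WHERE {rel.destination_key} = ?"
    rows = conn.execute(sql, (source_row[rel.source_key],)).fetchall()
    return [tuple(row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, dept_id INTEGER, last_name TEXT);
INSERT INTO department VALUES (10, 'Engineering'), (20, 'Sales');
INSERT INTO employee VALUES (1, 10, 'Smith');
""")

# Direction: employee (source) -> department (destination).
to_department = Relationship("employee", "department", "dept_id", "dept_id")

smith = conn.execute("SELECT * FROM employee WHERE emp_id = 1").fetchone()
smith_departments = fetch_destination(conn, to_department, smith)
```

Traversing in the opposite direction would require a second relationship with the source and destination roles exchanged.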
A pair of unidirectional relationships can be used to create a bi-directional relationship. A bi-directional relationship has two traversal paths. One path traverses from the source entity to the destination entity. A second path traverses from the destination entity to the source entity. A bi-directional relationship is created using an auxiliary entity.
A relationship is typically created between two different entities. However, a relationship can also be created using a single entity. This type of relationship is referred to as a reflexive relationship. A reflexive relationship uses the same entity as the source entity and the destination entity. One attribute of the source entity is defined as the source key while another attribute of the source entity is defined as the destination key.
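A common instance of a reflexive relationship is an employee entity whose rows refer to other rows of the same entity, for example an employee's manager. The sketch below assumes a hypothetical manager_id column as the source key and emp_id as the destination key, and resolves the relationship with a self-join in SQLite.

```python
import sqlite3


def reflexive_join_sql(entity, source_key, destination_key, column):
    """Join an entity's table to itself, aliasing the source and destination roles."""
    return (
        f"SELECT src.{column}, dst.{column} FROM {entity} AS src "
        f"JOIN {entity} AS dst ON src.{source_key} = dst.{destination_key} "
        f"ORDER BY src.{column}"
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    last_name  TEXT,
    manager_id INTEGER  -- hypothetical column: emp_id of this employee's manager
);
INSERT INTO employee VALUES (1, 'Smith', NULL), (2, 'Jones', 1), (3, 'Lee', 1);
""")

# Source and destination are the same entity; only the key attributes differ.
sql = reflexive_join_sql("employee", "manager_id", "emp_id", "last_name")
employee_manager_pairs = conn.execute(sql).fetchall()
```

The employee with no manager does not appear in the result, because an inner join drops rows whose source key is NULL.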
The model is used at runtime to create instances of an object class. Modifications made to the data by a method of an object are then propagated to the data source using the mapping provided by the model. Thus, the model is used to synchronize the data contained in an object instance and the data source.
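A minimal sketch of this synchronization is given below, assuming a snapshot-and-compare strategy that the text does not itself prescribe. The MAPPING dictionary, the instantiate and synchronize helpers, and the Employee class are hypothetical; the sketch uses Python's standard sqlite3 module and a single table for brevity.

```python
import sqlite3

# Model fragment: object property -> column of the employee table.
MAPPING = {"lastName": "last_name", "city": "city"}


class Employee:
    """Hypothetical object class; it contains no SQL and no column names."""

    def __init__(self, emp_id, lastName, city):
        self.emp_id = emp_id
        self.lastName = lastName
        self.city = city

    def relocate(self, new_city):
        self.city = new_city


def instantiate(conn, emp_id):
    """Create an object from the data source and remember the values it started with."""
    columns = ", ".join(MAPPING.values())
    row = conn.execute(
        f"SELECT {columns} FROM employee WHERE emp_id = ?", (emp_id,)
    ).fetchone()
    snapshot = dict(zip(MAPPING, row))
    return Employee(emp_id, **snapshot), snapshot


def synchronize(conn, obj, snapshot):
    """Write back only the properties that differ from the snapshot."""
    changed = {p: getattr(obj, p) for p in MAPPING if getattr(obj, p) != snapshot[p]}
    if not changed:
        return 0
    assignments = ", ".join(f"{MAPPING[p]} = ?" for p in changed)
    conn.execute(
        f"UPDATE employee SET {assignments} WHERE emp_id = ?",
        (*changed.values(), obj.emp_id),
    )
    snapshot.update(changed)
    return len(changed)


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, last_name TEXT, city TEXT);
INSERT INTO employee VALUES (1, 'Smith', 'Springfield');
""")

employee, snapshot = instantiate(conn, 1)
employee.relocate("Shelbyville")  # a method of the object modifies its data
columns_written = synchronize(conn, employee, snapshot)
stored_city = conn.execute("SELECT city FROM employee WHERE emp_id = 1").fetchone()[0]
```

Only the changed property is written back, and the object class remains unaware of the table it is stored in.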