1. Field of the Invention
This invention relates to object oriented systems and data store systems, and more particularly to mapping between object schema and data store schema.
2. Description of the Related Art
The data processing industry and its customers have made considerable investments in conventional data store technology, including relational databases, hierarchical databases, flat file databases, and network databases. Presently, the relational or entity-relationship model underlying relational databases is the predominant conventional method of storing data in databases.
Object oriented technology has also gained wide acceptance due to its strengths in real world modeling, modularity, reuse, distributed computing, client/server computing, and graphical user interfaces.
However, the object model underlying object oriented technology and the data model underlying conventional data stores are different, and a way is needed to provide the advantages of object oriented technology while preserving the substantial investment in conventional data store technology.
An object model captures the structure of a system by representing the objects in the system, the relationships between those objects, and the attributes and operations that characterize each class of objects. The purpose of object modeling is to describe objects, and an object is simply something that has a meaningful behavior in an application context. An object has data, the values of which represent the object's state. The behavior that an object exhibits is provided by operations on that data, and this behavior may be invoked by other objects sending messages. These operations are implemented as procedures called methods. All objects have identity and are distinguishable. The term identity means that an object is distinguishable by its inherent existence and not by descriptive properties that it may have. A unique object may be referred to as an object instance or an instance.
An object class describes a group of objects with similar properties (attributes), common behavior (operations), common relationships to other objects, and common semantics. Objects in a class have the same attributes and behavior patterns. The objects derive their individuality from differences in the attribute values and relationship to other objects. The class defines the object's data structure and methods to access that data structure. Methods and data structure are shared among objects of the same class. An object knows its class and the methods it possesses as a member of the class. Common definitions such as class name and attribute names are stored once per class, rather than once per object instance. Operations may be written once for a class so that all objects in the class benefit from code reuse.
An attribute is a data value, not an object, held by the objects in a class. Each attribute has a value for each object instance. Different object instances may have the same or different values for a given attribute. Each attribute name is unique within a class, as opposed to being unique across all classes.
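The notions of class, attribute, method, and identity described above may be illustrated by the following minimal sketch. The class name Employee, its attributes, and its method are hypothetical and are chosen only for illustration; they do not appear elsewhere in this description.

```python
class Employee:
    """Hypothetical class: the class name and attribute names are defined once here."""

    def __init__(self, name, salary):
        # Attributes are plain data values; each instance holds its own values.
        self.name = name
        self.salary = salary

    def raise_salary(self, amount):
        # An operation (method) written once and shared by every instance.
        self.salary += amount
        return self.salary


# Two instances with identical attribute values are still distinct objects.
a = Employee("Pat", 50000)
b = Employee("Pat", 50000)

same_state = (a.name, a.salary) == (b.name, b.salary)  # attribute values match
same_identity = a is b                                  # identity does not

# Invoking the method on one instance changes only that instance's state.
a.raise_salary(1000)
```

In this sketch the two instances have equal state but different identity, which reflects the statement that an object is distinguishable by its inherent existence rather than by its descriptive properties.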
A link is a relationship between object instances, a tuple or ordered list of object instances. A link is also an instance of an association. An association is a group of links with common structure and common semantics. All the links in an association connect objects from the same classes. An association describes a set of potential links in the same way that a class describes a set of potential objects. Associations are inherently bidirectional and can be traversed in either direction. Associations are often implemented in various object oriented programming languages as pointers from one object to another. A pointer is an attribute in one object that contains an explicit reference to another object.
As an attribute is a property of objects in a class, a link attribute is a property of the links in an association. Each link attribute has a value for each link. Many-to-many associations are the rationale for link attributes.
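One possible way to represent an association, its links, and a link attribute is sketched below. The names Person, Company, WorksFor, Association, and job_title are assumptions introduced for illustration. The sketch stores links in a separate collection rather than as pointers inside the objects, so that the association may be traversed in either direction.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False: instances compare by identity, not by content
class Person:
    name: str


@dataclass(eq=False)
class Company:
    name: str


@dataclass(eq=False)
class WorksFor:
    """One link of the hypothetical Person-Company association."""
    person: Person
    company: Company
    job_title: str  # link attribute: a property of the link, not of either object


class Association:
    """A group of links with common structure, traversable in both directions."""

    def __init__(self):
        self.links = []

    def add(self, person, company, job_title):
        link = WorksFor(person, company, job_title)
        self.links.append(link)
        return link

    def companies_of(self, person):
        # Traverse from a person to the linked companies.
        return [link.company for link in self.links if link.person is person]

    def people_at(self, company):
        # Traverse the same association in the opposite direction.
        return [link.person for link in self.links if link.company is company]


works_for = Association()
ann, bob = Person("Ann"), Person("Bob")
acme, zeta = Company("Acme"), Company("Zeta")
works_for.add(ann, acme, "engineer")
works_for.add(ann, zeta, "consultant")
works_for.add(bob, acme, "manager")
```

Because the association here is many-to-many, the job title cannot be attached to either the person or the company alone; it has a value for each link, which is the rationale for link attributes given above.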
Generalization and inheritance are powerful abstractions for sharing similarities among classes while preserving their differences. Generalization is the relationship between a class and one or more refined versions of it. The class being refined is called the superclass, and each refined version is called a subclass. Attributes and operations common to a group of subclasses are attached to the superclass and shared by each subclass. Each subclass is said to inherit the features of its superclass. Generalization and inheritance are transitive across an arbitrary number of levels. The terms ancestor and descendant refer to generalization of classes across multiple levels. An instance of a subclass is simultaneously an instance of all of its ancestor classes. The state of an instance includes a value for every attribute of every ancestor class. Any operation on any ancestor class can be applied to an instance.
Generalization and inheritance are fundamental concepts in object-oriented languages, and these concepts do not exist in conventional languages and databases. During conceptual modeling, generalization enables a developer to organize classes into a hierarchical structure based on their similarities and differences. During implementation, inheritance facilitates code reuse. Generalization refers to the relationship among classes; inheritance refers to the mechanism of obtaining attributes and operations using the generalization structure.
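The transitive nature of generalization and inheritance may be illustrated by the following sketch, which assumes a hypothetical three-level hierarchy of Shape, Polygon, and Square. The class and attribute names are illustrative only.

```python
class Shape:
    """Superclass: holds what is common to every shape."""

    def __init__(self, color):
        self.color = color

    def describe(self):
        # Written once here; inherited by every descendant class.
        return f"{self.color} {type(self).__name__}"


class Polygon(Shape):
    """Subclass of Shape: adds an attribute of its own."""

    def __init__(self, color, sides):
        super().__init__(color)
        self.sides = sides


class Square(Polygon):
    """Subclass of Polygon, and therefore a descendant of Shape."""

    def __init__(self, color, edge):
        super().__init__(color, sides=4)
        self.edge = edge

    def area(self):
        return self.edge ** 2


sq = Square("red", 3)

# The instance's state holds a value for every attribute of every ancestor.
state = vars(sq)

# An operation defined on an ancestor class can be applied to the instance.
description = sq.describe()
```

In this sketch the Square instance is simultaneously an instance of Polygon and of Shape, and the describe operation is obtained through the generalization structure rather than being rewritten in each subclass.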
The object schema may be viewed as consisting of a set of object classes, wherein each object class consists of a set of object instances, wherein each object instance contains a set of attributes, and wherein object classes and object instances may be linked in relationships.
In contrast to the object model described above, conventional data store technology is based on a data model (also known as an information model or conceptual model), which is designed by modeling the world that the application is to support. The data model is then transformed into a particular database design by applying one or more standard transforms, for example, normalization, which requires that the data of an entity belong to that entity only.
The data models offered by conventional database technology include flat files, indexed file systems, the network data model, the hierarchical data model, and the relational model.
The flat file model provides a simple means of storing data in records which may be accessed according to the data therein; however, it provides no independence between the data and applications thus requiring the applications to be modified if the flat file design is changed. A flat file data store schema consists of records composed of fields.
Indexed file systems provide fixed-length records composed of data fields of various types, and indexes to more quickly locate records satisfying constraints on field values. An indexed file system data store schema consists of records composed of fields wherein certain fields may be keys (indexes).
A network data model provides fixed-length records composed of data fields of various types and indexes similar to the indexed file systems. In addition, the network data model provides record identifiers and link fields which may be used to connect records together for fast direct access. The network data model also uses pointer structures to encode a network structure or relationship of records. A network data store schema consists of a set of network structures of records, wherein each record is composed of fields, wherein certain fields may be keys (indexes) and certain fields may be links to other records (link fields).
The hierarchical data model, similar to the network data model, provides fixed-length records composed of data fields of various types, indexes, record identifiers and link fields, and pointer structures. However, the hierarchical data model limits the structure used to represent the relationship of records to tree structures. A hierarchical data store schema consists of a set of tree structures of segments (each tree structure defined by a pointer structure known as a Program Communication Block or PCB), wherein each segment consists of fields, and wherein certain fields may be keys (indexes), and wherein certain segments may be links or pointers to other segments (pointer segments).
In the relational data model, the fundamental structure is the relation, which is a two-dimensional matrix consisting of columns and rows of data elements. A table is an instance of a relation in the relational database. Each table has a name. A table must consist only of atomic fields of data, i.e., each field is a simple, indivisible type of data. A field is the basic unit of data representing one data fact.
Each column has a label and contains atomic values of the same data type, wherein each atomic value is an attribute drawn from a set of possible values that is that column's domain. The order of columns is not significant and may be changed without changing the meaning of a tuple. Each column may be referred to by a unique pairing of the table name with the column label.
Each table consists of zero or more tuples, which are rows of attribute values. Each row represents one relationship, and a row's identity is determined by its unique content, not by its location. The order of rows is not significant. There are no duplicate rows. The domain for every attribute must consist of atomic values; there are no repeating groups. If the value for a particular field is unknown or does not apply, then the relational model assigns a null value.
Tables contain information about entities or the relationship between entities. Each tuple refers to a different entity, and each attribute value in the tuple supplies information about one characteristic of that entity. Each table must have a column or group of columns that serve to uniquely identify the tuple.
Each set of attributes that uniquely identifies each tuple is referred to as a candidate key. There may be multiple candidate keys in a relation, but one must be designated as the primary key. Foreign keys are used to define a link to another table. A foreign key is a key taken from another table to create a linking value to serve as a means of navigation from one table to the other table. A table may contain as many foreign keys as links it requires to relate it to other tables with which it has relationships.
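The roles of candidate keys, primary keys, and foreign keys may be illustrated by the following sketch, which uses the sqlite3 module of the Python standard library with an in-memory database. The table names department and employee and their columns are hypothetical. SQLite enforces foreign keys only when the foreign_keys pragma is enabled, and the sketch assumes a SQLite build with foreign key support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# dept_id and name are both candidate keys; dept_id is designated the primary key.
conn.execute(
    "CREATE TABLE department ("
    " dept_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL UNIQUE)"
)

# employee.dept_id is a foreign key: a key taken from department to link the tables.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)

conn.execute("INSERT INTO department VALUES (10, 'Research')")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 10)")

# Navigate from one table to the other by matching the foreign key value.
rows = conn.execute(
    "SELECT e.name, d.name FROM employee e"
    " JOIN department d ON e.dept_id = d.dept_id"
).fetchall()

# A foreign key value with no matching primary key is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Bob', 99)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
```

Unlike a pointer in an object, the foreign key in this sketch is an ordinary data value; the link between the tables exists only through the match between that value and a primary key value in the other table.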
The process of determining the correct location and function for each attribute to correctly formulate the relational schema is called normalization. Normalization decomposes incorrectly constructed relations into multiple correctly normalized relations.
The relational model requires that all tables must be in at least first normal form. To be in first normal form, a relation must have domains consisting only of atomic values for each attribute. Repeating sets of attributes and multi-valued attributes are not allowed. Optionally, the relational model may be subject to additional higher forms of normalization based on functional dependency, i.e., the reliance of an attribute or group of attributes on another attribute or group of attributes for its value.
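The effect of first normal form on a repeating group may be illustrated by the following sketch, in which a record holding a list of telephone numbers is decomposed into two relations whose fields are all atomic. The record layout, the field names, and the function to_first_normal_form are assumptions introduced for illustration and are not a general normalization procedure.

```python
# Unnormalized input: "phones" is a multi-valued attribute (a repeating group).
unnormalized = [
    {"emp_id": 1, "name": "Ann", "phones": ["555-0100", "555-0101"]},
    {"emp_id": 2, "name": "Bob", "phones": []},
]


def to_first_normal_form(records):
    """Split each record into atomic rows for two relations.

    Returns (employees, employee_phones), where every field of every row
    is a single indivisible value.
    """
    employees = []
    employee_phones = []
    for rec in records:
        # One row per entity, holding only atomic values.
        employees.append((rec["emp_id"], rec["name"]))
        # One row per repeated value, keyed back to the entity by emp_id.
        for phone in rec["phones"]:
            employee_phones.append((rec["emp_id"], phone))
    return employees, employee_phones


employees, employee_phones = to_first_normal_form(unnormalized)
```

In the result, the emp_id value carried in each row of the second relation acts as a foreign key to the first relation, so the original grouping can be recovered by matching key values.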
The relational schema may be viewed as consisting of a set of tables, wherein each table consists of columns and rows, and wherein relations between tables are specified by primary keys and foreign keys.
In view of the above differences between object oriented technology and data store technology, there is a need for a method of, and apparatus for, allowing a user to access a conventional data store from an object oriented application.
In view of the above differences between object schema and data store schema, there is a need for a method of, and apparatus for, allowing a user to map between conventional data store schema and object schema.
In view of the above, there is a need for a method of, and apparatus for, allowing a user to define a mapping between conventional data store schema and object schema.
In view of the above, there is a need for a method of, and apparatus for, allowing a user to represent such a definition of a mapping between conventional data store schema and object schema.
In view of the above differences among conventional data store schemas, there is a need for a data store independent method of, and apparatus for, meeting the above needs.
In view of the above, there is a need for an object oriented language independent method of, and apparatus for, meeting the above needs.
In view of the above, there is a need for a distributed client/server method of, and apparatus for, meeting the above needs.
In view of the above, there is a need for a method of, and apparatus for, providing a user an improved user interface to meet the above needs.