1. Field of the Invention
The present invention relates generally to object-oriented computing and relational data store systems and, more specifically, to mapping between object schema and relational schema in which legacy or preexisting relational data has been subclassed.
2. Description of the Related Art
Businesses commonly need to store and access large quantities of data relating to specific business matters, such as their financial accounts, inventory, customers, employees, and other matters. Businesses rely on computers to aid in this task and have invested billions of dollars in computer systems that store and access such business data. To minimize losses on this investment in computer systems, an important consideration in introducing new computer technology is adapting it to interface with existing computer technology.
A database is a structure in which a computer system may store a large quantity of data organized in a manner that facilitates efficient storage, search and retrieval. Physically, at the heart of any database is some suitable type of data store, such as magnetic disks, on which data may be recorded. Conceptually, however, computer scientists and other researchers have developed a number of different models under which databases may be constructed.
The most prevalent database model is the relational model. In a relational database the data are organized in tables, also referred to as relations. Each data element in a table is indexed by its row and column in the table. Each row, also known as a tuple, represents an entity that is useful or meaningful to the business or other database user, and each column in that row refers to a data element that defines a characteristic or attribute of that entity. For example, each row in a company's database of its employees may refer to a certain employee. One column may refer to an employee's name, another column to an employee's identification number, and another column to an employee's address. Certain columns may be designated as "keys" to uniquely identify each row. For example, the column referring to an employee's identification number may be defined as a key, because no two employees share the same number. Keys may include primary keys, which are used as the primary means to access the rows, and foreign keys, which are used to define links between tables. The programmer who creates the database has considerable latitude in specifying the tables, columns, keys, and other characteristics that define the schema of a relational database.
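The concepts of rows, columns, primary keys, and foreign keys described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample data are assumptions chosen only for illustration and do not come from any particular legacy database.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute(
    "CREATE TABLE department ("
    " dept_id INTEGER PRIMARY KEY,"
    " dept_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"  # primary key: uniquely identifies each row
    " name TEXT NOT NULL,"
    " address TEXT,"
    " dept_id INTEGER REFERENCES department(dept_id))"  # foreign key: link to another table
)

conn.execute("INSERT INTO department VALUES (10, 'Accounting')")
conn.execute("INSERT INTO employee VALUES (1, 'Pat Lee', '12 Elm St', 10)")
conn.execute("INSERT INTO employee VALUES (2, 'Sam Roe', '9 Oak Ave', 10)")

# The primary key selects exactly one row (one entity).
row = conn.execute(
    "SELECT name, address FROM employee WHERE emp_id = ?", (2,)
).fetchone()

# The foreign key links each employee row to a department row.
joined = conn.execute(
    "SELECT e.emp_id, d.dept_name"
    " FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id"
    " ORDER BY e.emp_id"
).fetchall()
```

In this sketch each row of the employee table is one entity, each column is one attribute, and the dept_id column plays the role of a foreign key linking the two tables.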
The above-described data model underlying relational databases was developed to facilitate the storage and retrieval of data under the control of programming languages of the type that were prevalent at the time, which were primarily those known as procedural or structured programming languages. Because procedural programming languages and relational databases were for many years being developed and improved upon contemporaneously with one another, procedural languages are, not surprisingly, well-suited to manipulating relational database data. For example, a feature of most procedural programming languages allows a programmer to access an element of a table by specifying its row and column. Although a program would not necessarily access a database element using that feature of the programming language, the point to note is that relational schema and procedural programming share common concepts and programming philosophies.
Another type of programming, known as object-oriented programming (OOP), is becoming increasingly popular and may eventually supplant procedural programming. A potential problem, however, is that OOP languages do not inherently interface smoothly with relational databases. For example, the concept of indexing a table of data elements by row and column is in itself somewhat at odds with the OOP philosophy of handling an object in accordance with what it represents rather than how it is represented in a rigid data structure.
The goal of OOP is to reduce the time and costs associated with developing complex software by creating small, reusable sections of program code that can be quickly and easily combined and re-used to create new programs. The code sections are known as objects. OOP languages, such as Smalltalk, C++, and Java, have been developed that allow programmers to approach their programming tasks in a way that is believed to be more natural and intuitive than that in which programmers traditionally approached tasks armed with only the tools of procedural programming languages. Using the unique tools or features of an OOP language, which are described below in further detail, a programmer can write code to define a software object that models something in the real world. The software object may model the attributes or characteristics of the real-world object and, in many cases, may also model its behavior. For example, a programmer whose task it is to create an employee database program can create an object that models an employee. An employee object may have certain attributes of a real employee, such as a name, an address, an employee number, and so forth. Exploiting the full capabilities of OOP, a programmer could use the employee object in a program in a manner that roughly corresponds to the way one would interact with a real employee. For example, the programmer could define the employee object to provide its address when the object is asked for that information or to provide its status, such as "on vacation," when asked for status information. It should be noted that accessing an element of a table by specifying a row and column is a concept foreign to object-oriented programmers and not in keeping with the OOP philosophy of modeling things in the real world in a natural, intuitive manner.
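The employee object described above might be sketched as follows. This is only an illustration of the general idea; the class name, attribute names, method names, and the "at work" status string are assumptions, not part of any referenced system.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Hypothetical software object modeling a real-world employee."""

    name: str
    address: str
    employee_number: int
    on_vacation: bool = False

    def get_address(self) -> str:
        # The object answers a request for its address; the caller never
        # needs to know how or where the address is stored.
        return self.address

    def get_status(self) -> str:
        # The object reports its own status when asked.
        return "on vacation" if self.on_vacation else "at work"


employee = Employee("Pat Lee", "12 Elm St", 1, on_vacation=True)
status = employee.get_status()
```

The calling code asks the object what it needs to know and never refers to a row or column position.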
Object-oriented databases (OODBs) that are specifically designed to facilitate storage and retrieval of objects have been developed. Objects that are stored in a data store are known as persistent objects because they "persist" after the program that created them ceases executing.
Although dedicated OODBs have recently been developed, businesses have invested billions of dollars over the years in their existing or legacy relational databases, and it would be extraordinarily uneconomical to transfer all legacy relational data into OODBs. Furthermore, relational databases are continuing to be developed and improved and remain widely commercially available. Therefore, software has been developed that interfaces object-oriented software to relational databases. Such software typically includes a development tool, sometimes referred to as a schema mapper, that allows a database programmer to map the object schema to the relational schema. The software also typically includes a call-level interface. The call-level interface acts as a translator between an object-oriented application program and a relational database. Thus, although the objects are ultimately stored in relational format, the storage format is transparent to the application program, which may access them in the same manner as it would a persistent object in a dedicated OODB. An example of such software is described in U.S. Pat. No. 5,627,979, titled "A SYSTEM AND METHOD FOR PROVIDING A GRAPHICAL USER INTERFACE FOR MAPPING AND ACCESSING OBJECTS IN DATA STORES," (IBM Docket ST9-94-017) incorporated herein by reference, and its related U.S. patent application Ser. No. 08/276,382, filed Jul. 18, 1994, titled "A SYSTEM AND METHOD FOR MAPPING AND ACCESSING OBJECTS IN DATA STORES" (IBM Docket ST9-94-016).
The present invention addresses the problems involved in mapping between an object-oriented schema and a legacy relational schema that includes a "tiebreaker" column. It is known that a relational database table may include a tiebreaker column, also known as a type column. A tiebreaker column is used in relational databases as a switch to select a legacy subclass or meaning for one or more other columns from among two or more possible legacy subclasses or meanings. For example, a tiebreaker column may have been included in a legacy database to indicate whether the data in one or more other columns of an employee's row pertain to an active employee or to a retired employee. The character "A" in the tiebreaker column may have been used to indicate an active employee, and the character "R" in the tiebreaker column may have been used to indicate a retired employee.
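The role of a tiebreaker column can be illustrated with the following minimal sketch. It is not the method of the present invention; it merely shows, under assumed names, how a value of "A" or "R" in a type column could select the object class and the meaning of another column. The row layout (emp_id, name, emp_type, amount), the class names, and the interpretation of the amount column as a salary or a pension are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    emp_id: int
    name: str


@dataclass
class ActiveEmployee(Employee):
    salary: float  # assumed meaning of the shared column when the tiebreaker is "A"


@dataclass
class RetiredEmployee(Employee):
    pension: float  # assumed meaning of the shared column when the tiebreaker is "R"


# Hypothetical mapping from tiebreaker value to object subclass.
SUBCLASS_BY_TIEBREAKER = {"A": ActiveEmployee, "R": RetiredEmployee}


def row_to_object(row):
    """Build an object from a legacy row laid out as (emp_id, name, emp_type, amount)."""
    emp_id, name, emp_type, amount = row
    try:
        cls = SUBCLASS_BY_TIEBREAKER[emp_type]
    except KeyError:
        raise ValueError(f"unknown tiebreaker value: {emp_type!r}") from None
    # The same column value is stored under a different attribute
    # depending on the subclass the tiebreaker selects.
    return cls(emp_id, name, amount)


legacy_rows = [
    (1, "Pat Lee", "A", 52000.0),
    (2, "Sam Roe", "R", 18000.0),
]
objects = [row_to_object(r) for r in legacy_rows]
```

Here the first row yields an ActiveEmployee whose amount is read as a salary, and the second yields a RetiredEmployee whose amount is read as a pension. An unrecognized tiebreaker value is rejected rather than guessed.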
It would be desirable to provide a schema mapping method and system that allows a programmer to use a tiebreaker column of a legacy relational database in an object-oriented application program. Furthermore, it would be desirable for the method and system to be sufficiently flexible to map different types of persistent objects. These needs are satisfied by the present invention in the manner described below.