Modern computer installations generate, manipulate, and store enormous quantities of data. Data base management systems have emerged as an indispensable component of such installations, serving to promote efficient data storage and program design, enhance file maintenance and modification, and eliminate data redundancy. The typical data base management system (DBMS) includes programs which interface with designers and users, accept and understand models or tables for subsequent use in organizing data, organize data according to the models or tables, store and retrieve the data in the actual data base contained in a computer storage subsystem, perform queries on the data, and generate reports based on the stored data.
A DBMS may be designed to store data according to any of a variety of data models, where the data model is the basic organizational concept for the underlying data base. These models, or schemes, for data base organization can be divided into several different classes, including hierarchical, network, relational, and entity-relationship. A detailed discussion of these types of data bases may be found in "The Database Book," by Mary E. S. Loomis, Macmillan Publishing Company, New York, N.Y. 10022 (1987). The present invention is applicable to all of the above data-base schemas. The preferred embodiment is described with particular reference to the entity-relationship modelling methodology provided by the Repository Manager/MVS product, which uses the DB2 relational DBMS as a back end to manage the storage and retrieval of data on the computer hardware. Entity-relationship data bases are discussed in "The Entity-Relationship Model--Towards a Unified View of Data," by Peter Chen, ACM Trans. on Data Base Systems, Vol. 1 (1976). Relational data bases, and DB2 in particular, are discussed in "IBM Database 2: General Information," IBM Publication GC26-4373 (1990). The Repository Manager/MVS product is discussed in "Repository Manager: Concepts and Facilities," IBM Publication SR21-3608 (1990).
A significant problem in maintaining any data base whose data entries represent objects, events, people, or relationships in the real world is that although those things may change over time, the typical DBMS maintains only a single version of any given entry, making it impossible to concurrently represent a thing in its past, present, and future states. A second significant problem, which arises in maintaining a data base that is shared among a plurality of users, involves tolerating concurrent but independent work on the same data entries by different users without sacrificing the semantic consistency of the data. Yet a third problem involves maintaining a record of the state of the data base itself as it existed at given times in the past. Such information is often needed for error recovery and for audit-trail purposes. Typical solutions to this third problem involve taking "snapshots" of the data base and logging change activity, so that if necessary the data base can be "reconstructed" as it existed at some point in the past. This reconstruction is usually a time-consuming batch procedure, and a system so constructed cannot allow the past and current data bases to be accessed concurrently.
A solution to all of these problems is to maintain versions of the data entries. These versions may correspond to the different states of the real-world things represented, or to work in progress by different users. Such an approach is called versioning, and in general requires that the DBMS control the creation of the versions and all access to them, both to assure the semantic consistency of the data in all its versions, and to free users from the need to deal with the additional complexity that such a versioning scheme requires. Users of non-versioned data base systems sometimes simulate versioning by giving qualified names to the data base entries. However, this approach is undesirable, because it conceals from the DBMS the true identity of the things represented, and makes it impossible for the DBMS to verify the semantic consistency of the data base.
The generally preferred approach to implementing versioning is to provide direct versioning of entries in the DBMS, with the versioning managed by the DBMS to preserve the semantic validity of the data in the system. Such a system provides both parallel and serial versioning, with the capability for the user to define a hierarchy of versions, and to direct the DBMS to move versions of data from one hierarchy level to another. It also provides historical versioning of the data base, allowing the user to view the data as it existed at any arbitrarily selected time in the past. Finally, it provides a simplified programming interface that allows a user tool to interact with the data as though it were not versioned, with the choice of which version the tool sees being specified outside the program.
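The hierarchy of versions and the version-unaware tool interface described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the implementation of any product named in this document: the class names `VersionedStore` and `VersionView`, the level names, and the part name are all hypothetical, and historical (time-based) versioning is omitted.

```python
class VersionedStore:
    """Toy store: one dictionary of parts per variant level (hypothetical design)."""

    def __init__(self, levels):
        # Levels are ordered from the leaf level up to the root level.
        self.levels = list(levels)
        self.data = {name: {} for name in self.levels}

    def write(self, level, part, value):
        self.data[level][part] = value

    def read(self, level, part):
        # Resolve a part by searching from the given level upward.
        start = self.levels.index(level)
        for name in self.levels[start:]:
            if part in self.data[name]:
                return self.data[name][part]
        raise KeyError(part)

    def promote(self, level, part):
        # Move the version of a part from one level to the next higher level.
        index = self.levels.index(level)
        if index + 1 >= len(self.levels):
            raise ValueError("cannot promote from the root level")
        parent = self.levels[index + 1]
        self.data[parent][part] = self.data[level].pop(part)


class VersionView:
    """What a tool sees: the level is chosen outside the tool's own logic."""

    def __init__(self, store, level):
        self._store = store
        self._level = level

    def get(self, part):
        return self._store.read(self._level, part)


store = VersionedStore(["dev", "test", "prod"])
store.write("prod", "invoice_layout", "v1")
store.write("dev", "invoice_layout", "v2")

tool_view = VersionView(store, "dev")           # level selected outside the tool
seen_by_tool = tool_view.get("invoice_layout")  # tool never names a version
test_before = store.read("test", "invoice_layout")

store.promote("dev", "invoice_layout")
test_after = store.read("test", "invoice_layout")
prod_after = store.read("prod", "invoice_layout")
```

In this sketch the tool calls only `get`, so it behaves as though the data were not versioned; which version it receives depends solely on the level bound to its view.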
In a versioned-data management system (VDMS) environment, all the changes associated with a given user task are tightly coupled in the sense that subsequent access applies either to all of the changes or to none of them. A consequence of this tight coupling is that all the changes associated with a given user task must be promoted at the same time. However, these changes typically involve more than one part in the VDMS and may have been performed by more than one tool. Since multiple tools may have been involved, having each tool maintain a list of the parts it changed would not be sufficient. The user would be forced to remember all the tools involved and then combine the lists. In addition, each tool may use a different means to maintain its list, thus complicating the job of combining the lists. To complicate matters further, a change to one part may cause the VDMS to trigger a change to other parts. Such a situation can occur in an entity-relationship VDMS, where the deletion of an entity results in the deletion of all relationships that pointed to the entity. In this case, expecting the tool to maintain a change list is untenable, because the tool cannot be aware that other parts were also changed and hence would fail to include them on its list of changed parts. Yet another complication arises where a user makes changes to parts, promotes them, and then makes a subsequent change. The affected parts will show up more than once in the promote group: once at the higher variant level, as the result of promoting the part to that level, and once more at the leaf variant level, as a result of the subsequent change made there. A tool maintaining the list of changed parts would have to be aware that multiple versions of a part exist at the same time. Tools known in the art are not aware that multiple versions exist, and modifying these tools to make them aware of multiple versions is costly and undesirable.
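The cascading-delete complication can be made concrete with a short sketch. It assumes a toy entity-relationship store whose names (`EntityRelationshipStore`, `change_log`, the entities, and the relationships) are all hypothetical. The store-side log is included only to expose what the tool's own list misses; it is not presented as the method of this document.

```python
class EntityRelationshipStore:
    """Toy entity-relationship store that records every change it makes."""

    def __init__(self):
        self.entities = set()
        self.relationships = set()  # (source, name, target) triples
        self.change_log = []        # hypothetical store-side record of changed parts

    def add_entity(self, name):
        self.entities.add(name)
        self.change_log.append(("entity", name))

    def add_relationship(self, source, name, target):
        if source not in self.entities or target not in self.entities:
            raise KeyError("both endpoints must exist")
        rel = (source, name, target)
        self.relationships.add(rel)
        self.change_log.append(("relationship", rel))

    def delete_entity(self, name):
        self.entities.remove(name)
        self.change_log.append(("entity", name))
        # Cascade: relationships touching the deleted entity are removed
        # by the store itself, without the calling tool being told.
        dangling = sorted(r for r in self.relationships if name in (r[0], r[2]))
        for rel in dangling:
            self.relationships.remove(rel)
            self.change_log.append(("relationship", rel))


store = EntityRelationshipStore()
for entity in ("CUSTOMER", "ORDER", "INVOICE"):
    store.add_entity(entity)
store.add_relationship("ORDER", "placed_by", "CUSTOMER")
store.add_relationship("INVOICE", "billed_to", "CUSTOMER")
store.change_log.clear()  # the user task of interest starts here

# The tool records only the change it explicitly requested.
tool_change_list = []
store.delete_entity("CUSTOMER")
tool_change_list.append(("entity", "CUSTOMER"))

# Parts changed by the store that the tool's own list does not contain.
missed = [change for change in store.change_log if change not in tool_change_list]
```

Here the tool's list holds a single entry, while `missed` holds the two relationships the store deleted as a side effect, which is exactly the gap that makes a tool-maintained list unreliable for assembling a promote group.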
Heretofore, the known method for solving the above problems has been to design tools that are aware not only of the parts themselves, but also of the different versions of those parts. However, this solution is itself undesirable because of the complex design requirements it places on every tool provided for the VDMS.