The present invention relates to complex data models, and more particularly to a method and device for semantic reconciling of complex data models.
In recent years, use of platform-independent and application-independent metadata has become more prevalent in digital computing. As known by those skilled in the art, metadata is a definition or description of data. Metadata provides a structure, or schema, for generating or validating data instances. Unlike traditional data specifications, metadata is expressed through the use of metalanguages such as the Standard Generalized Markup Language (SGML) or the Extensible Markup Language (XML), which permit a user to define lexical tags to describe a structure for data. Corresponding data instances may then employ these user-defined tags to describe content. Advantageously, a metadata schema transmitted with such data instances may be used with a generic compiler to validate or interpret the data instances. Accordingly, metadata can support effective sharing of data. As well, because metalanguages are ASCII-based, platform dependencies are minimized or eliminated.
Metadata schema and data instances are referred to as complex data models. Many complex data models suffer from a common problem, namely, the possibility of divergence or lack of coherence between versions. As data models are updated over time, copies of legacy models may remain for various reasons. The existence of multiple model versions may be attributable to a lack of version control, for example, or to confusion over which version is the most current. Alternatively, two or more developers may intentionally make distinct sets of changes to a data model in order to promote parallel development efficiencies.
Regardless of the cause of the divergence, in these situations one is faced with the task of reconciling two or more versions of a complex data model. Traditionally, reconciliation of divergent complex data models has involved a manipulation of the divergent versions in their source metalanguage form, i.e. in the complex data model domain, to effect a manual reconciliation of the differences. Thus, a reconciling individual (or “reconciler”) might use a standard text editor to edit divergent complex data model data files simultaneously. More specifically, the reconciler may perform a textual comparison of the versions and then manually merge them into a reconciled version of the model, for example by cutting and pasting metalanguage fragments (i.e. entities or attributes). Disadvantageously, however, this process can be difficult for a number of reasons.
First, because a reconciliation of this type is performed in the complex data model domain, in order to be able to effectively reconcile the versions, a reconciler must not only have a good understanding of the semantic domain, s/he must also be familiar with the low-level lexical and syntactic details of the associated complex data model. As a simple example, in the case where a person is responsible for reconciling two versions of a complex data model representing an instance of an integrated circuit design, the person would not only be required to be familiar with the microelectronic engineering principles governing the reconciliation (i.e. the semantic domain), but would also have to be familiar with the particular integrated circuit schema and lexical tags being used to express its design (i.e. the complex data model domain). This requirement for expertise in both the semantic and complex data model domains complicates the training necessary for an individual to become a qualified reconciler and correspondingly reduces the number of persons whose skill set is sufficiently broad to perform model reconciliation. Moreover, errors may be introduced during reconciliation in the event that a reconciler's knowledge of the complex data model is imperfect.
Second, because each complex data model version to be reconciled typically constitutes a complete copy of the model, the person responsible for reconciliation may be required to parse through virtually the entire model to make the requisite changes, even though much of the model may be irrelevant with respect to the particular reconciliation at hand. This can be a time consuming and tedious process, especially when the model is sizeable.
Third, because manual reconciliation of this type does not provide for the automatic enforcement of data abstractions or value dependencies which may exist in the complex data models to be reconciled, reconciliation may result in the introduction of errors into the complex data model. This is especially true in the case where the reconciler is unfamiliar with the model's data abstractions or value dependencies.
Fourth, manual reconciliation tools are not easily customized to a particular reconciliation task. Some reconciliation tasks warrant reconciliation of divergent complex data models only with respect to a subset of their divergent aspects for which reconciliation has been deemed important. A manual reconciliation tool provides no mechanism for identifying a divergent aspect within a complex data model as being “important” (requiring reconciliation) or “unimportant” (not requiring reconciliation).
A number of alternative approaches and reconciliation tools have been developed. One type of tool, which is a variation of the traditional approach, operates by displaying the textual metalanguage of the versions to be reconciled side-by-side along with visual cues (such as colored text for example) accentuating the differences to be resolved. The visual cues tend to focus the reconciling individual on the reconciliation task at hand and may thereby expedite the reconciliation process. As well, this approach may involve some automatic syntax-checking of the complex data model to ensure that syntax errors are not introduced during reconciliation.
The described type of tool does not, of course, alleviate all of the above-noted reconciliation difficulties. Fundamentally, the reconciling individual is still required to work in the complex data model domain, complete with its intricate lexicon and syntax rules. Thus, it is still necessary to employ a reconciler who has a good understanding of both the complex data model and the associated semantic domains. Moreover, because such tools typically present the complex data model versions to the reconciler in their entirety rather than just the aspects to be reconciled, the reconciler may still be required to scan through much information that is superfluous to his/her specific reconciliation duty. This can be time consuming as well as prone to error. Additionally, because such tools typically do not support the automatic enforcement of any data abstractions or value dependencies existing in the complex data models, erroneous implementation may occur. This is especially true when data abstractions or value dependencies with which the reconciler is unfamiliar are present in the model. Finally, reconciliation efficiency may suffer due to the fact that such tools are not easily customized to a particular reconciliation task and because no mechanism is provided to distinguish divergent aspects requiring reconciliation from divergent aspects not requiring reconciliation.
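The purely textual comparison on which such tools rely can be illustrated with a short sketch. It is only an illustration, not a description of any particular tool: the `circuit` and `resistor` tags, the `ohms` attribute and both model versions are hypothetical, and Python's standard `difflib` module stands in for the comparison engine.

```python
import difflib

# Two hypothetical versions of a small model, held as metalanguage text.
version_a = """<circuit>
  <resistor id="R1" ohms="100"/>
  <resistor id="R2" ohms="220"/>
</circuit>""".splitlines()

version_b = """<circuit>
  <resistor id="R1" ohms="150"/>
  <resistor id="R2" ohms="220"/>
</circuit>""".splitlines()

# A line-oriented comparison: differences are reported as raw
# metalanguage lines, not as semantic statements about the design.
diff = list(difflib.unified_diff(
    version_a, version_b,
    fromfile="version_a", tofile="version_b", lineterm=""))

for line in diff:
    print(line)
```

The reported difference is a pair of markup lines, so the reconciler must still read tags and attribute syntax to understand that only the value of one resistor has changed.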
Another known type of tool takes a more customized approach towards the reconciliation of complex data model versions. In this approach, the reconciliation tool is tailored exclusively to the complex data model and reconciliation task in question. The tool is capable of interpreting the lexicon, syntax, data abstractions and value dependencies of the complex data models to be reconciled and is programmed with sufficient information regarding the reconciliation task at hand to be capable of merging divergent aspects of the versions with little or no instruction from the reconciling individual. Such a tool typically has a custom user interface that is specific to the complex data model and reconciliation task being performed. Advantageously, divergent complex data model aspects are displayed semantically, allowing reconciliation to be performed in the semantic domain. Accordingly, the requirement for human parsing of a complex data model is reduced or eliminated. As well, because tools of this type are customized, they are capable of reconciling only certain “important” divergent aspects.
This second type of reconciliation tool is problematic, however, in one key aspect. Fundamentally, because the tool is customized exclusively to a particular type of complex data model to be reconciled as well as a particular reconciliation task to be performed, it has virtually no flexibility of application. In order to be used for a different type of complex data model or reconciliation task, a new tool must be designed, implemented and tested. This is a time-consuming, tedious and expensive process.
Hence, what is needed is a method and device for semantic differencing and merging of complex data models which addresses at least some of the above-noted difficulties.
A method and device for semantically reconciling complex data models is disclosed. A first transform is initially applied to received divergent complex data models in order to extract fundamental data representing selected divergent aspects of the complex data models that are to be reconciled. The extracted fundamental data are then semantically displayed in a manner suitable for both identifying differences between the aspects to be reconciled and for reconciling them. Input representative of a reconciliation of the fundamental data by a reconciling individual is received, and the fundamental data are reconciled accordingly to generate a single reconciled fundamental data set. The reconciled fundamental data set is then expanded into a corresponding reconciled complex data model by application of a second transform. The transforms are optionally capable of providing automatic enforcement of complex data model data abstractions and value dependencies during reconciliation.
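The sequence of steps summarized above may be easier to follow with a small sketch. The following is a minimal illustration under stated assumptions, not the disclosed implementation: the XML vocabulary (`circuit`, `resistor`, `ohms`), the function names `extract`, `differences`, `reconcile` and `expand`, and the rule that a resistance must be positive are all hypothetical, chosen only to show a first transform, a semantic difference view, a reconciliation driven by the reconciler's input, and a second transform.

```python
import xml.etree.ElementTree as ET

# Two hypothetical divergent versions of the same complex data model.
VERSION_A = '<circuit><resistor id="R1" ohms="100"/><resistor id="R2" ohms="220"/></circuit>'
VERSION_B = '<circuit><resistor id="R1" ohms="150"/><resistor id="R2" ohms="220"/></circuit>'

def extract(model_text):
    """First transform (assumed form): complex model -> fundamental data.

    Only the selected aspect (resistance per resistor) is extracted.
    """
    root = ET.fromstring(model_text)
    return {r.get("id"): int(r.get("ohms")) for r in root.iter("resistor")}

def differences(a, b):
    """Semantic view of the divergence: key -> (value in a, value in b)."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

def reconcile(a, b, choices):
    """Merge fundamental data; choices maps each divergent key to 'a' or 'b'."""
    merged = {k: v for k, v in a.items() if b.get(k) == v}
    for key, (value_a, value_b) in differences(a, b).items():
        chosen = value_a if choices[key] == "a" else value_b
        if chosen is not None:  # None means the key is absent in that version
            merged[key] = chosen
    return merged

def expand(data):
    """Second transform (assumed form): fundamental data -> complex model.

    Enforces one illustrative value dependency: resistance must be positive.
    """
    root = ET.Element("circuit")
    for rid in sorted(data):
        if data[rid] <= 0:
            raise ValueError(f"resistance of {rid} must be positive")
        ET.SubElement(root, "resistor", id=rid, ohms=str(data[rid]))
    return ET.tostring(root, encoding="unicode")

data_a, data_b = extract(VERSION_A), extract(VERSION_B)
divergent = differences(data_a, data_b)       # what the reconciler is shown
merged = reconcile(data_a, data_b, {"R1": "b"})  # the reconciler's input
reconciled_model = expand(merged)
print(divergent)
print(reconciled_model)
```

In this sketch the reconciler sees only that `R1` differs between the versions and never edits markup directly; the check inside `expand` stands in for the optional automatic enforcement of value dependencies.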