1. The Field of the Invention
The present invention relates to methods and systems utilizing eXtensible Markup Language (XML) documents and services and, more particularly, to methods and systems for determining type equivalence between different XML schemas.
2. Background and Relevant Art
XML schemas are well-known in the computing industry. They can be used, for example, to define XML documents in a structured format so that corresponding XML data can be shared and accessed by generic applications. XML schemas can be processed by various programs to generate code for accessing the XML data. For example, a program can analyze a schema and create classes that can be used for extracting and utilizing the XML data, as defined by the schema types. In other words, an XML schema can be thought of as input and the resulting code and classes that are used to extract and utilize the XML data can be thought of as the output.
With specific regard to the Internet, many applications expose multiple Web services that share a subset of schema types as defined in the services' Web Service Description Language (WSDL) contract. A consumer of such services will want to share the equivalent types between Web service proxies generated from the WSDL. Doing so requires determining the equivalence of types, as defined in the consumed Web services' contracts. This, however, is easier said than done because the XML grammar for defining schema types includes the ability to define defaults at various scopes, certain insensitivities to order, and permissible annotations. These variations can thereby cause two different schema type definitions to yield equivalent schema types. In addition, XML schema documents are typically serialized to XML 1.0, which introduces yet another set of variations due to the XML 1.0 serialization rules.
For example, FIG. 1 illustrates two examples 110 and 120 of equivalent schema type definitions for the same type, named Order. Initially, it will be noted that the syntax can be altered between equivalent schema types. For instance, example 110 includes a target namespace reflected in double quotations 130, while example 120 includes the same target namespace reflected in single quotations 140.
In the present examples 110 and 120, the indentation and spacing are also different. For instance, the indentation of definition lines 150 and 152 is more uniform and pronounced than the indentation of corresponding definition lines 160 and 162.
The presentation order of the type definitions 170 and 172 from the first example 110 is also inconsistent with the presentation order of the corresponding type definitions 180 and 182 for the second example 120.
Finally, FIG. 1 also illustrates how various components of the schema types are discretionary and can be included or omitted, such as, for example, the components found in lines 190, 192, 194 and 196.
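The kinds of variation described above can be sketched in a few lines of Python. The two schema strings below are hypothetical stand-ins and are not the text of examples 110 and 120; the `normalize` helper and the `urn:example:orders` namespace are likewise assumptions made for illustration. The sketch only shows that a literal text comparison fails where a crude normalization succeeds; it is not presented as a complete or correct equivalence test (for instance, it assumes both documents use the same namespace prefix in attribute values such as `xs:int`).

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: double quotes, uniform indentation, explicit defaults.
SCHEMA_A = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="urn:example:orders">
    <xs:complexType name="Order">
        <xs:sequence>
            <xs:element name="id" type="xs:int" minOccurs="1" maxOccurs="1"/>
        </xs:sequence>
    </xs:complexType>
    <xs:simpleType name="Code">
        <xs:restriction base="xs:string"/>
    </xs:simpleType>
</xs:schema>"""

# Hypothetical schema: single quotes, different spacing, reordered top-level
# definitions, reordered attributes, an added annotation, omitted defaults.
SCHEMA_B = """<xs:schema targetNamespace='urn:example:orders' xmlns:xs='http://www.w3.org/2001/XMLSchema'>
<xs:simpleType name='Code'><xs:restriction base='xs:string'/></xs:simpleType>
 <xs:complexType name='Order'>
  <xs:annotation><xs:documentation>An order.</xs:documentation></xs:annotation>
  <xs:sequence><xs:element type='xs:int' name='id'/></xs:sequence>
 </xs:complexType>
</xs:schema>"""

# Occurrence attributes whose stated value equals the XML Schema default.
DEFAULTS = {"minOccurs": "1", "maxOccurs": "1"}


def normalize(elem, top_level=False):
    """Reduce an element to a comparable tuple (illustrative only).

    Drops annotations, inter-element whitespace, and attributes that merely
    restate a default; sorts attributes; sorts children only at the schema
    root, where the order of definitions is not significant.
    """
    attrs = tuple(sorted(
        (k, v) for k, v in elem.attrib.items() if DEFAULTS.get(k) != v
    ))
    children = [normalize(c) for c in elem if c.tag != f"{{{XS}}}annotation"]
    if top_level:
        children.sort()
    return (elem.tag, attrs, tuple(children))


same_text = SCHEMA_A == SCHEMA_B
same_shape = (
    normalize(ET.fromstring(SCHEMA_A), top_level=True)
    == normalize(ET.fromstring(SCHEMA_B), top_level=True)
)
print(same_text, same_shape)
```

The child order inside `xs:sequence` is deliberately left unsorted, because order is significant there; only the order of top-level definitions is ignored.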
Accordingly, it has been shown how certain components of equivalent schema types can be presented differently. Because of this, it is important to identify the equivalent types so that only a single class is created for equivalent types within the shared schemas that are being consumed by the applications. Otherwise, the applications utilizing the redundant classes can become incompatible or fail to run properly. Similarly, it is important that different schema types are not identified by the same name, or else the applications relying on those types will also fail to run properly.
For example, consider a situation in which there are two different schemas, a payroll schema and a human resource schema. In this example, the schemas include employee types that are equivalent, but not identical, and that will ultimately be used to create corresponding employee classes for accessing employee data. However, because the employee schema types are not identical, two different classes will be created, instead of only one. This creation of duplicate classes not only represents wasted resources, it can also cause some programs to fail, depending on how the data is being accessed and used.
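The failure mode described above can be illustrated with a minimal Python sketch. The class names `PayrollEmployee` and `HrEmployee`, their fields, and the `print_badge` function are all hypothetical; they stand in for whatever classes a code generator might emit when run separately on the payroll schema and the human resource schema.

```python
from dataclasses import dataclass, fields


# Hypothetical class generated from the payroll schema's employee type.
@dataclass
class PayrollEmployee:
    name: str
    employee_id: int


# Hypothetical class generated from the human resource schema's employee type.
# It has the same members, but it is a distinct, unrelated class.
@dataclass
class HrEmployee:
    name: str
    employee_id: int


def print_badge(employee: PayrollEmployee) -> str:
    """Code written against the payroll class rejects the duplicate class."""
    if not isinstance(employee, PayrollEmployee):
        raise TypeError(f"expected PayrollEmployee, got {type(employee).__name__}")
    return f"{employee.employee_id}: {employee.name}"


# The two classes describe the same data...
same_members = [f.name for f in fields(PayrollEmployee)] == [
    f.name for f in fields(HrEmployee)
]

# ...yet an instance of one cannot be used where the other is expected.
try:
    print_badge(HrEmployee(name="Ada", employee_id=7))
    rejected = False
except TypeError:
    rejected = True

print(same_members, rejected)
```

Had the two employee types been recognized as equivalent before code generation, a single class could have served both consumers and the type check would not fail.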
These problems can become even more pronounced when considering that the schemas can change over time as customers customize their programs with new type definitions to accommodate new functionality and when applications are configured to consume or utilize additional schemas. Furthermore, when considering that the W3C permits the creation of custom schema types, it is also apparent that there is room for a large quantity of equivalent schema types to be created. In particular, although there are certain defaults for creating custom types, there is also a lot of flexibility for creating and defining equivalent types differently.
For at least the foregoing reasons, it should be apparent that it would be desirable to determine which schema types are equivalent prior to creating the classes from the schemas, so as to avoid creating multiple classes for equivalent schema types. Unfortunately, equivalence cannot be determined by merely looking at the names or definitions of the schema types because of the many different ways equivalent schema types can be represented.
Accordingly, it is currently necessary for a customer having problems resulting from the creation of multiple classes for equivalent schema types to edit the code created from the XML schema(s) so that only one type definition exists and to delete the redundant secondary class(es). This, however, is cumbersome and is analogous to putting out a fire only after you have been burned.
Accordingly, there is currently a need in the art for techniques to determine equivalence between schema types and to prevent the fire from even starting.