To ensure that data satisfy certain structural and non-structural constraints, it is common to use a schema, or data model, which provides a template for the data or document. One common representation for data is the Extensible Markup Language, or XML, which is a simplified subset of the Standard Generalized Markup Language, or SGML. Unlike other SGML-based languages such as the Hypertext Markup Language (HTML), XML permits users to define new element labels and to nest the resulting elements within one another. Generally, schemas are used to constrain what labeled elements may appear in an XML document and how they may be arranged; an XML document conforms to a schema if the structure of the document satisfies the constraints specified by the schema. A schema for an XML document is built up out of type definitions. Together, the type definitions specify constraints on the structure of elements in an XML document such as, for example, the attributes that elements in the document may contain, the mandatory or optional nature of the elements, the order in which the elements appear, and what other elements may be nested within an element.
One basic schema specification standard for XML is the DTD (Document Type Definition). In many XML applications, there is a DTD that specifies the XML format and one or more XML documents that conform to the DTD. Another common formalism for specifying the format of XML documents and data is the XML Schema. An XML Schema definition sets forth the layout format of documents that conform to the schema. This layout format includes which elements appear in each document and the data type for each element (such as whether it is numeric, binary, character, image, etc.). In addition, the XML Schema definition or DTD may include relational information that specifies how the various elements in conforming documents are related to each other. For example, for data that has a hierarchical structure, parent and child relationships will be described in the schema.
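As an illustration of what it means for a document to conform to a schema, the following is a minimal sketch in Python. It is not a DTD or XML Schema processor; the dictionary-based `SCHEMA`, the `customer`, `name`, and `phone` labels, and the `conforms` function are all hypothetical and stand in for a real schema language. The sketch checks only the kinds of constraints described above: which attributes an element must carry, which child elements are mandatory or optional, and the order in which they appear.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy schema (an assumption for illustration): for each element
# label, the required attributes and the ordered child elements, given as
# (label, mandatory) pairs. Each child may appear at most once.
SCHEMA = {
    "customer": {"attrs": {"id"}, "children": [("name", True), ("phone", False)]},
    "name": {"attrs": set(), "children": []},
    "phone": {"attrs": set(), "children": []},
}


def conforms(elem, schema):
    """Return True if elem and all of its descendants satisfy the toy schema."""
    rule = schema.get(elem.tag)
    if rule is None:
        return False  # element label is not declared in the schema
    if not rule["attrs"] <= set(elem.attrib):
        return False  # a required attribute is missing
    expected = iter(rule["children"])
    for child in elem:
        # Advance through the expected children until this label is found.
        for label, mandatory in expected:
            if label == child.tag:
                break
            if mandatory:
                return False  # a mandatory child was skipped
        else:
            return False  # child is not allowed here, repeated, or out of order
        if not conforms(child, schema):
            return False
    # Any expected children that never appeared must be optional.
    return all(not mandatory for _, mandatory in expected)


good = ET.fromstring('<customer id="7"><name>Ada</name><phone>555</phone></customer>')
bad = ET.fromstring('<customer id="7"><phone>555</phone></customer>')
```

Under these assumptions, `conforms(good, SCHEMA)` holds, while `bad` is rejected because the mandatory `name` child is absent.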
More generally, a schema may be any of a DTD, an XML Schema, or a string specification schema (such as a regular expression, a grammar or a finite state automaton), and a document may be either an XML document or a string.
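For the string case, a regular expression can play the role of the schema and an ordinary string the role of the document. The short Python sketch below assumes a hypothetical date format as the schema; the names `DATE_SCHEMA` and `string_conforms` are illustrative only.

```python
import re

# Hypothetical string schema (an assumption): four digits, a hyphen,
# two digits, a hyphen, two digits.
DATE_SCHEMA = re.compile(r"\d{4}-\d{2}-\d{2}")


def string_conforms(schema, document):
    """Return True if the entire string matches the regular-expression schema."""
    return schema.fullmatch(document) is not None
```

Here `string_conforms(DATE_SCHEMA, "2004-01-15")` holds, whereas a string such as `"15 Jan 2004"` does not conform.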
Often, documents or data objects that conform to a particular schema need to be verified as conforming with (i.e., recast into) another schema. For example, a business may have been saving and processing its customer records in accordance with a particular schema. However, the business may now desire to store its records in accordance with a new schema. In order to ensure compatibility between its old and new records, the business may desire to recast the prior records into the new schema. As a further example, a program that processes documents typically expects to receive the documents in a particular format. If a business desires to use the program to process documents that are structured in accordance with a different schema, it may be necessary to recast the documents into the appropriate schema.
Unfortunately, it is sometimes impossible to cast a particular document from one schema into another schema. For instance, the new schema may require a nonzero value for a particular element that is not present in the document under the first schema. Thus, in order to cast a document into a new schema, the document that conforms to the first schema must also be valid in the second schema. The prior art method of validating a document in a schema is to examine each element that is to be cast into the schema to determine whether it is valid in that schema. Since businesses often have voluminous records, examining each individual element of each document in a particular schema to determine whether it will be valid in a second schema can be a very time-consuming process. Therefore, what is needed is an improved method of determining whether or not a document is valid with respect to a particular schema given that it conforms to another schema.
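The cost of the element-by-element approach can be seen in the following Python sketch. It is only an illustration under stated assumptions: the two schemas are hypothetical dictionaries mapping each element label to its mandatory child labels, `NEW_SCHEMA` differs from `OLD_SCHEMA` only in requiring an `email` child, and `validate_exhaustively` is an illustrative name. The function visits every element of a record, so the work grows with the total number of elements across all records, even though each record is already known to conform to the old schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical schemas (assumptions): element label -> mandatory child labels.
OLD_SCHEMA = {"customer": {"name"}, "name": set(), "email": set()}
NEW_SCHEMA = {"customer": {"name", "email"}, "name": set(), "email": set()}


def validate_exhaustively(root, schema):
    """Examine every element; return (valid, number_of_elements_examined)."""
    examined = 0
    valid = True
    for elem in root.iter():  # the root itself and all of its descendants
        examined += 1
        required = schema.get(elem.tag)
        present = {child.tag for child in elem}
        if required is None or not required <= present:
            valid = False  # undeclared label or missing mandatory child
    return valid, examined


records = [
    ET.fromstring("<customer><name>Ada</name><email>ada@example.com</email></customer>"),
    ET.fromstring("<customer><name>Bob</name></customer>"),
]

# Re-validate each old record against the new schema, element by element.
results = [validate_exhaustively(record, NEW_SCHEMA) for record in records]
```

In this sketch both records conform to `OLD_SCHEMA`, but the second cannot be cast into `NEW_SCHEMA` because it lacks the `email` element, and discovering this requires examining every element of every record.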