The extensible markup language (XML) has recently emerged as a powerful language for describing and communicating data. In particular, XML is an open, text-based markup language that adds structural and semantic information to data. XML is a subset of the popular Standard Generalized Markup Language (SGML), and has become widely used on the Internet.
An XML document includes a root element and, possibly, a number of child elements. Each element consists of an opening tag and a closing tag. The elements of a document must be properly nested: the closing tag for a child element must precede the closing tag for its parent. In this manner, an XML document follows a tree structure in which the elements have parent-child relationships. For example, the following pseudocode illustrates the format of a conventional XML document:
<ROOT>
  <CHILD_A>
    <CHILD_A1> DATA </CHILD_A1>
  </CHILD_A>
  <CHILD_B> DATA </CHILD_B>
</ROOT>
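The tree structure and nesting rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard library's xml.etree.ElementTree module; the element names are written with underscores because XML names cannot contain spaces, and the helper names walk and is_well_formed are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative document only; underscores replace spaces in the element names.
DOCUMENT = """
<ROOT>
  <CHILD_A>
    <CHILD_A1>DATA</CHILD_A1>
  </CHILD_A>
  <CHILD_B>DATA</CHILD_B>
</ROOT>
""".strip()


def walk(element, depth=0):
    """Return (depth, tag) pairs for an element and its descendants, in document order."""
    pairs = [(depth, element.tag)]
    for child in element:
        pairs.extend(walk(child, depth + 1))
    return pairs


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it reports a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(DOCUMENT)
tree_pairs = walk(root)

# Print the tree with one level of indentation per parent-child step.
for depth, tag in tree_pairs:
    print("  " * depth + tag)

# A child closed after its parent violates the nesting rule and is rejected.
badly_nested = "<ROOT><CHILD_A></ROOT></CHILD_A>"
print(is_well_formed(DOCUMENT), is_well_formed(badly_nested))
```

The recursion follows the parent-child relationships directly: each element contributes its own tag, then the tags of its children one level deeper.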
An XML schema is used to define and describe a class of XML documents. More specifically, an XML schema uses schema components to define the meaning, usage and relationships of the elements that may be used within a class of XML documents, as well as permissible content, attributes, and values for the elements. The World Wide Web Consortium (W3C) XML Schema Definition Language, for example, is an XML language for describing and constraining the content of XML documents. Other example schema definition languages include Document Content Description for XML (DCD), Schema for Object-Oriented (SOX), Document Definition Markup Language (DDML), also referred to as XSchema, Regular Language description for XML Core (RELAX), Tree Regular Expressions for XML (TREX), Schematron (SCH), and Examplotron (EG).
The following pseudocode illustrates the format of a conventional XML schema:
<XSD:SCHEMA>
  <XSD:ELEMENT NAME="USED-CAR" MINOCCURS="0" MAXOCCURS="UNBOUNDED">
    <XSD:ELEMENT NAME="MODEL" TYPE="XSD:STRING" USE="REQUIRED"/>
    <XSD:ELEMENT NAME="YEAR" USE="REQUIRED">
      <XSD:ATTRIBUTE NAME="VALUE" TYPE="XSD:INTEGER"/>
    </XSD:ELEMENT>
  </XSD:ELEMENT>
</XSD:SCHEMA>
The above pseudocode illustrates some of the basic concepts supported by schema languages. For example, various elements can be defined, such as the elements USED-CAR, MODEL and YEAR defined above. In addition, basic constraints for the elements can be defined, such as whether the element or an attribute of the element is required, and a range for the number of occurrences for the element.
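To make these constraints concrete, the following Python sketch checks a document by hand against the rules expressed in the pseudocode schema. It is an illustration under stated assumptions, not a schema processor: the INVENTORY root element, the sample documents, and the functions check_used_car and check_document are hypothetical, and the checks are hard-coded rather than read from a schema.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the pseudocode schema (an assumption for
# illustration): every USED-CAR must contain MODEL and YEAR elements.
REQUIRED_CHILDREN = ("MODEL", "YEAR")


def check_used_car(car):
    """Return a list of constraint violations for one USED-CAR element."""
    errors = []
    for name in REQUIRED_CHILDREN:
        if car.find(name) is None:
            errors.append(f"missing required element {name}")
    for year in car.findall("YEAR"):
        value = year.get("VALUE")
        if value is not None:
            try:
                int(value)  # VALUE is declared as an integer
            except ValueError:
                errors.append(f"YEAR VALUE is not an integer: {value!r}")
    return errors


def check_document(xml_text):
    """Check every USED-CAR in the document; zero occurrences is allowed."""
    root = ET.fromstring(xml_text)
    errors = []
    for index, car in enumerate(root.iter("USED-CAR")):
        for message in check_used_car(car):
            errors.append(f"USED-CAR[{index}]: {message}")
    return errors


# Hypothetical sample documents wrapped in an assumed INVENTORY root.
valid_doc = (
    "<INVENTORY>"
    "<USED-CAR><MODEL>Sedan</MODEL><YEAR VALUE='2004'/></USED-CAR>"
    "<USED-CAR><MODEL>Coupe</MODEL><YEAR VALUE='1999'/></USED-CAR>"
    "</INVENTORY>"
)
invalid_doc = "<INVENTORY><USED-CAR><YEAR VALUE='new'/></USED-CAR></INVENTORY>"

valid_errors = check_document(valid_doc)
invalid_errors = check_document(invalid_doc)
print(valid_errors)
print(invalid_errors)
```

In practice a schema processor would derive such checks from the schema itself; the point here is only that each constraint is tied to a specifically named element.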
However, using schema languages to constrain the structure and content of XML documents can lead to very complex schemas that contain a specific definition for each permissible element. For example, schema languages tend to require that specific elements be defined within the schema in order to define constraints on those elements. This approach tends to cause compliant XML documents to lose normalization. In other words, this approach can result in XML documents in which the names and attributes of the elements differ significantly.