1. The Field of the Invention
The present invention relates to networking technologies, and more specifically to mechanisms for using a group identifier to represent mappings of each of a number of abbreviated namespace identifiers to a hierarchical namespace used to uniquely identify an element (such as an XML element) of a hierarchical document (such as an XML document), and also more specifically to mechanisms for developing overlapping namespaces.
2. Background and Relevant Art
Computing systems have revolutionized the way people work and play. Never before have so many had so much access to so much information. All one needs to access digital information and communicate throughout the globe is an Internet-enabled computing system such as a desktop computer, a laptop computer, a Personal Digital Assistant (PDA), a mobile telephone, or the like.
When two computing systems communicate, it is helpful if the two computing systems not only communicate the data itself, but also understand the underlying meaning of the data so that intelligent decisions can be made based on that data. For example, suppose that a computing system receives the number “84111”. It would be helpful for the computing system to determine what is being communicated, not just that the number was communicated. For example, it would be helpful to know whether the number is a residence street number, a postal zip code, a number of widgets ordered, a product serial number, or some other meaning. Appropriate action taken by the computing system depends on what the number represents.
In order to allow the meaning of data, rather than just the data itself, to be communicated, a technology called “schemas” has been developed. Schemas define how a particular item of data is structured. The Extensible Markup Language (XML) has been widely adopted as a language in which structured data may be represented. For example, consider the following example XML element:
<Address><Street>34 West Ninth Street</Street><City>Ideaville</City><State>Kansas</State><Country>United States</Country><PostalCode>54321</PostalCode></Address>
The XML element is of a type “address” or in other words is an address XML element. The address element has a number of subelements including a street element, a city element, a state element, a country element, and a postal code element. Human readers have the intuition and experience to understand that this element represents a physical United States postal address of 34 West Ninth Street, Ideaville, Kansas 54321.
Although this is obvious to human readers, computing systems do not have the same experience and intuitive reasoning ability as does the complex human brain. Accordingly, computing systems need some understanding of the structure of the XML element in order to make decisions based on the understanding that the XML element indeed represents a United States physical postal address.
Schemas provide precisely that structural understanding. One technology that enables the defining of schemas is called the XML Schema Definition (XSD) document. XSD documents are XML documents themselves and define elements, their associated subelements, and what the meaning of the elements, subelements, and associated attributes is. XSD documents may also define how many times (zero or more) a subelement may appear at a particular location in an XML element. For example, a schema that defines a structure for contact information may have defined the address XML element provided above.
XML has been so widely adopted as a means for communicating structured data that it is not unusual for different schemas to use the same name for differently structured XML elements. For example, consider the following XML element also of the type “address” and also being an address XML element:
<Address>123.45.67.8</Address>
This address XML element has a much simpler structure. A human reader can clearly see that this address XML element does not represent a United States physical postal address at all. One of ordinary skill in the art would also likely be able to recognize the XML element as a network Internet Protocol (IP) address. A schema may also be used to define this address XML element.
With the widespread implementation of XML, one can easily envision that there could be a wide variety of other address XML elements that follow different structures. For example, there may be many different address XML elements of different structures that define a United States postal address. Some XML elements may provide the street number as a separate field, instead of in the same field as the street name. Some may list the country first before the street address. Also, there may be different address XML elements to represent the different address formats recognized throughout the globe. Also, there may be address XML elements that represent computer addresses or the like.
Accordingly, when reading an XML address element, it would be very difficult for a computing system to understand the structure of the XML element since there may be many different schema documents that define different and inconsistent structures of an address XML element. In order to allow computing systems to resolve this kind of ambiguity and thus uniquely identify the structure of an XML element, even when the XML element types having that same “address” type are numerous, a two-part naming mechanism is in widespread use.
One part of the XML naming mechanism is the type of the element. For example, the type of the above XML elements is “address”. A second part of the XML naming mechanism is called a “hierarchical namespace”. Typically, this namespace is represented in the XML document in a similar manner as an attribute of the element and may take the form of a Uniform Resource Identifier (URI). Together, the element type and the namespace uniquely identify the XML element. The schema description document itself defines a corresponding namespace that is to be associated with the XML element. Accordingly, the computing system can uniquely identify and validate the structure of an XML element based on its type and namespace, even if there are numerous schemas that define XML elements having the same type, but with different namespaces.
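The two-part naming mechanism described above can be illustrated with a minimal Python sketch using the standard library's xml.etree.ElementTree module. The namespace URIs (urn:example:postal and urn:example:network), the KNOWN_STRUCTURES table, and the helper names are hypothetical and chosen only for illustration; they do not come from any actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used here only for illustration.
POSTAL_NS = "urn:example:postal"
NETWORK_NS = "urn:example:network"

def qualified_name(element):
    """Split ElementTree's '{namespace}type' tag into (namespace, type)."""
    tag = element.tag
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag

# Two elements share the type "Address" but belong to different namespaces.
postal = ET.fromstring(
    '<Address xmlns="urn:example:postal">'
    "<Street>34 West Ninth Street</Street></Address>"
)
network = ET.fromstring(
    '<Address xmlns="urn:example:network">123.45.67.8</Address>'
)

# Hypothetical lookup table keyed on the (namespace, type) pair.
KNOWN_STRUCTURES = {
    (POSTAL_NS, "Address"): "postal address",
    (NETWORK_NS, "Address"): "IP address",
}

def classify(element):
    """Identify an element's structure from its namespace and type together."""
    return KNOWN_STRUCTURES.get(qualified_name(element), "unknown")

print(classify(postal))
print(classify(network))
```

Because the lookup key combines the namespace with the element type, the two address elements are never confused even though their types are identical.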
As previously mentioned, namespaces may take the form of a URI. These URIs can include a long sequence of characters ranging from about 10 characters to 50 or more characters. An XML document may contain a number of different XML elements associated with a number of different namespaces. Such an XML document typically includes at least one express recitation of each of the URIs corresponding to the different namespaces, even though a namespace declared on one element is inherited by its subelements unless expressly overridden in those subelements. The express recitation of these namespace URIs can significantly increase the size of an XML document, especially when the XML document includes elements associated with different namespaces.
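The inheritance behavior mentioned above can be observed with a short Python sketch, again using xml.etree.ElementTree. The document content and the namespace URIs are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the namespace declared on <Contact> is inherited
# by its subelements, except where a subelement declares its own.
DOCUMENT = """<Contact xmlns="urn:example:contacts">
  <Address><City>Ideaville</City></Address>
  <Address xmlns="urn:example:network">123.45.67.8</Address>
</Contact>"""

root = ET.fromstring(DOCUMENT)

# ElementTree reports each tag as '{namespace}type', so the inherited or
# overriding namespace is visible on every element.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The first Address element and its City subelement carry the namespace declared on Contact, while the second Address element carries the namespace it declares itself. Each URI is recited once in the document, yet applies to every element that inherits it.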
One conventional method for reducing the size of such XML documents is to provide an association between an abbreviated namespace identifier and a corresponding namespace URI when the namespace is first declared for an element. This association is often called herein a “namespace declaration.” Should the namespace URI be required to be associated with another XML element, the abbreviated namespace identifier is used instead of the full namespace URI in order to provide the association between the namespace and the XML element. The following is an example of an XML element in the form of a Simple Object Access Protocol (SOAP) envelope in which line numbers are added for purposes of clarity. Note the use of abbreviated namespace identifiers:
 1. <S:Envelope xmlns:S="soap uri">
 2.   <S:Header>
 3.     <p:policy xmlns:p="policy uri"> . . . </p:policy>
 4.     <s:security xmlns:s="security uri"> . . . </s:security>
 5.     <t:timestamp xmlns:t="timestamp uri"> . . . </t:timestamp>
 6.     <q:session xmlns:q="session uri"> . . . </q:session>
 7.     <r:reliability xmlns:r="reliability uri"> . . . </r:reliability>
 8.   </S:Header>
 9.   <S:Body>
10.     <x:myElement xmlns:x="x uri">
11.       . . .
12.     </x:myElement>
13.   </S:Body>
14. </S:Envelope>
Line 1 identifies the XML element from line 1 to line 14 as a SOAP envelope. The text “xmlns:S=‘soap uri’” is called a namespace declaration in which the abbreviated namespace identifier “S” is associated with the “soap uri”. Note that a lengthy namespace URI would replace the term “soap uri” in line 1. Similarly, namespace declarations for the abbreviated namespace identifiers “p”, “s”, “t”, “q”, “r” and “x” are included in lines 3, 4, 5, 6, 7 and 10, respectively, each providing an association between a namespace URI and an XML element. Although the use of abbreviated namespace identifiers does not reduce the size of the above-listed SOAP envelope example, should the SOAP envelope have provided further XML elements that followed one of the declared namespaces, the namespace abbreviator could have been used, rather than reciting the entire namespace URI.
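As a minimal sketch of how a computing system might resolve such abbreviated namespace identifiers, the following Python example collects the namespace declarations of a shortened version of the envelope and then uses them to locate an element by its abbreviated name. The urn:example:* URIs are hypothetical stand-ins for the placeholder terms such as “soap uri”, and the function name is likewise an assumption made for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Shortened envelope; the urn:example:* URIs are hypothetical placeholders.
ENVELOPE = b"""<S:Envelope xmlns:S="urn:example:soap">
  <S:Header>
    <p:policy xmlns:p="urn:example:policy"/>
    <s:security xmlns:s="urn:example:security"/>
    <t:timestamp xmlns:t="urn:example:timestamp"/>
  </S:Header>
  <S:Body>
    <x:myElement xmlns:x="urn:example:x"/>
  </S:Body>
</S:Envelope>"""

def namespace_declarations(xml_bytes):
    """Map each abbreviated namespace identifier to its declared URI."""
    declarations = {}
    # The "start-ns" event yields a (prefix, uri) pair per declaration.
    for _event, (prefix, uri) in ET.iterparse(
        io.BytesIO(xml_bytes), events=("start-ns",)
    ):
        declarations[prefix] = uri
    return declarations

declarations = namespace_declarations(ENVELOPE)

# The collected mapping lets an abbreviated name such as "S:Header"
# be expanded to its full namespace-qualified form during a search.
root = ET.fromstring(ENVELOPE)
header = root.find("S:Header", declarations)
print(header.tag)
```

Note that the identifiers “S” and “s” are distinct because XML names are case-sensitive, so each maps to its own namespace URI.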
Even though abbreviated namespace identifiers are used, XML document drafters may elect to recite the entire namespace URI instead of just the abbreviator. For example, if the XML element were to be signed, one might want to redeclare the namespace using the full namespace URI rather than using the abbreviator as a namespace prefix in order to ensure that the namespace association survives through any subsequent signing operation.
Even when abbreviated namespace identifiers are used, the express namespace URI is included at least once, if not many more times, throughout the XML document. Accordingly, what is desired are methods, systems, computer program products and data structures that reduce the size of a hierarchical document such as an XML document while preserving namespace associations.
Also, in conventional schema description documents, each defined XML element schema is assigned to only one namespace. This restricts the configuration of namespaces and does not allow for efficient overlapping or nesting of namespaces in the manner described in the below-included detailed description of the preferred embodiments.