1. Technical Field
The present disclosure relates to XML and, more specifically, to storage of XML in a directory.
2. Description of the Related Art
Extensible Markup Language (XML) is a computer language for structured documents. Structured documents include documents that may contain content as well as descriptions and/or classifications of the content. For example, a structured document may be a book that includes a novel (the content) as well as a description of the novel (the title).
XML is a human-readable computer language. A human-readable computer language is a computer language where the source code may be viewed as standard text and a reader may be able to interpret the significance of the source code by examining the text. As a human-readable computer language, XML may be interpreted by a wide variety of computer platforms. This feature makes XML an excellent standard for data that is communicated between diverse programs, operating systems and computers.
XML documents may be made up of elements. Each element may be assigned any number of attributes. Attributes may have values. FIG. 1 is an example of an XML document. In this example, the XML document 11 may have an element called “Book” 12 with an attribute called “Title” 13 where the value of the Title attribute is “A Tale of Two Cities” 14.
XML content may be described by using tags. Tags are descriptive captions that appear before and after content and may include attributes and values. Unlike some other computer languages where tags are predefined, such as hypertext markup language (HTML), XML tags may be user defined. XML can therefore use any number of tags to describe any form of content.
Each element in an XML document may be delineated by a start-tag 15 that is presented immediately before the element 12, and an end-tag 16 that is presented immediately after the element 12. XML documents may be hierarchical. Hierarchical XML documents may contain one or more child elements 17 and 18 within a parent element 12. An element is a child element if its tags are situated between the start-tag and end-tag of another element. An element is a parent element if the tags of another element are situated between the element's start-tag and end-tag. An element may be both a parent element and a child element as XML documents may have any number of hierarchical generations.
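The element, attribute, and parent/child relationships described above can be illustrated with a short sketch using Python's standard `xml.etree.ElementTree` module. The document text is only a stand-in for FIG. 1: the Book element and its Title attribute come from the description, while the child element names (`Author`, `Chapter`), the `Number` attribute, and the helper `describe` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for FIG. 1. Only the Book element and its Title
# attribute are taken from the description; the children are assumed.
XML_TEXT = """<Book Title="A Tale of Two Cities">
  <Author>Charles Dickens</Author>
  <Chapter Number="1">The Period</Chapter>
</Book>"""


def describe(element, depth=0):
    """Return one line per element, indented by its hierarchical depth."""
    lines = ["  " * depth + f"{element.tag} {dict(element.attrib)}"]
    for child in element:  # iterating an element yields its direct children
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(XML_TEXT)   # the parent element (Book)
outline = describe(root)
print("\n".join(outline))
```

Here `root` plays the role of the parent element, and the elements found between its start-tag and end-tag are its children.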
It is often desirable to store XML documents in a manner that allows for easy searching and retrieval of the XML document. The emergence of web services has increased the need to store a great number of XML documents.
Web services present a new way for computers to communicate with each other. Web services are software systems that can be identified by Uniform Resource Identifiers (URIs), analogous to the identification of websites by Uniform Resource Locators (URLs). Web services generally perform specialized functions or provide access to information. Web services generally contain public interfaces and bindings that enable other software systems (such as other web services) to seamlessly utilize their functionality. In this way, web services are transforming the way modern enterprises interact and share information.
Web services commonly communicate by exchanging data in the form of XML documents. Therefore as the popularity of web services increases, so does the desire to store and retrieve large numbers of XML documents.
One technique that has developed for the storage and retrieval of XML documents includes the use of directories. Directories are specialized databases that are primarily used for allowing a large number of users to quickly look up information. A directory is not intended to be primarily used as a tool for the organization and storage of data and is therefore optimized for information retrieval and not necessarily information storage. A directory service is a computer application that allows for access to a directory. While some directory services are local and only allow for use on a particular computer network, other directory services are global and allow for general access over a global computer network such as the internet.
Global directory services may spread information across multiple computer servers all of which cooperate to provide directory service. Such directory services are known as distributed directory services. The Internet Domain Name System (DNS) is an example of a globally distributed directory service. The DNS allows computers connected to the internet to look up the numeric internet address from the corresponding internet domain name.
LDAP, or the Lightweight Directory Access Protocol, is a protocol for quickly and easily accessing directory services. LDAP servers communicate using TCP/IP transfer services or similar transfer services, making LDAP servers well suited for use over the internet or private company intranets.
An LDAP directory is made up of entries. Each entry may include attributes. Each attribute may have a value.
LDAP directories can be hierarchically arranged for more efficient searching. Hierarchical entries are commonly referred to as parent entries and child entries depending on their relationship to one another. For example, an entry representing a printer may be the child of an entry representing a computer in the case where the printer is connected to the computer.
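As a minimal sketch of how hierarchically arranged entries relate to one another, the following models the computer and printer example in plain Python, without contacting any directory server. The `Entry` class, the helper `distinguished_names`, the entry names, and the `dc=example,dc=com` suffix are all hypothetical; the convention of writing a child's name in front of its parent's name follows the usual LDAP distinguished-name style.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A simplified directory entry: a relative name, attributes, children."""
    rdn: str                                          # e.g. "cn=printer1"
    attributes: dict = field(default_factory=dict)    # attribute -> values
    children: list = field(default_factory=list)      # child entries

    def add_child(self, child):
        self.children.append(child)
        return child


def distinguished_names(entry, parent_dn=""):
    """Yield the full name of each entry, parents before their children."""
    dn = entry.rdn if not parent_dn else f"{entry.rdn},{parent_dn}"
    yield dn
    for child in entry.children:
        yield from distinguished_names(child, dn)


# Hypothetical entries: a printer that is the child of a computer.
computer = Entry("cn=workstation1", {"objectClass": ["device"]})
computer.add_child(Entry("cn=printer1", {"objectClass": ["device"]}))

names = list(distinguished_names(computer, "dc=example,dc=com"))
print("\n".join(names))
```

Because each child's full name ends with its parent's full name, a search can be confined to one branch of the hierarchy.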
Because both XML documents and directories may be hierarchically arranged, it is common to directly map an XML document into a directory. Mapping XML documents to directories may include creating a directory entry for each XML element and creating a directory entry attribute for every XML element attribute. This structural mapping is almost entirely 1-to-1 and may be relatively simple to perform. Some minor deviations from 1-to-1 structural mapping might be utilized. For example, XML elements that are unnamed might be assigned a numeric ID.
FIG. 2 is a directory entry 21 that has been mapped from the XML document shown in FIG. 1 using a 1-to-1 structural mapping. The Book element 12 from the XML document is mapped to a Book entry 22 within the directory. The Title attribute 13 belonging to the Book element 12 from the XML document is mapped to a Title attribute 23 belonging to the Book entry 22 within the directory. The value 14 of the Title attribute 13 belonging to the Book element 12 from the XML document is mapped to a value 24 of the Title attribute 23 belonging to the Book entry 22 within the directory. The child elements 17 and 18 of the Book element 12 from the XML document are mapped to child entries 25 and 26 of the Book entry 22 within the directory. Because the Book element 12 from the XML document is unnamed, the corresponding Book directory entry 22 may be assigned an arbitrary numeric ID 27, here 193 as an example, where necessary to conform to directory protocol.
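The 1-to-1 structural mapping can be sketched as a recursive walk over the XML tree. This is an illustration under stated assumptions rather than a definitive implementation: the entries are plain dictionaries instead of records in a real directory, the function names `map_element` and `count_entries` are hypothetical, the child elements in the sample document are assumed, and the numbering simply starts at 193 to echo the example ID.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical stand-in for FIG. 1; the child elements are assumed.
XML_TEXT = """<Book Title="A Tale of Two Cities">
  <Author/>
  <Chapter Number="1"/>
</Book>"""


def map_element(element, ids):
    """Map one XML element, and recursively its children, to an entry."""
    return {
        "id": next(ids),                      # numeric ID for an unnamed element
        "name": element.tag,                  # XML element -> directory entry
        "attributes": dict(element.attrib),   # XML attribute -> entry attribute
        "children": [map_element(child, ids) for child in element],
    }


def count_entries(entry):
    """Count the entries produced, one per XML element."""
    return 1 + sum(count_entries(child) for child in entry["children"])


ids = itertools.count(193)  # arbitrary starting ID, as in the example
book_entry = map_element(ET.fromstring(XML_TEXT), ids)
print(book_entry["id"], book_entry["name"], book_entry["attributes"])
print("entries created:", count_entries(book_entry))
```

One entry is created per element, so the number of entries to be written grows with the size of the document.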
This structural mapping allows the XML document to be stored to the directory while preserving the hierarchical structure. However, this method for storing XML documents to a directory has a number of drawbacks. For example, parsing the XML document so completely and storing such a large number of directory entries can be very time consuming. Additionally, searching for a desired XML document in a directory can be very time consuming when stored in this fashion and may require multiple directory queries whose results must be combined by the client application used to facilitate the query.
When a desired XML document is found within the directory, the document is retrieved entry by entry, each entry is converted back to an XML element, and the elements are reassembled while maintaining the hierarchy to reproduce the original XML document. This process can be complicated and very time consuming.
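The reassembly step can be sketched as the reverse walk: each stored entry is converted back to an element and attached under its parent. The nested dictionary below is a hypothetical in-memory stand-in for entries already read from a directory, and the function name `rebuild` and the child entries are assumptions; against a real directory, each level would typically require its own read operation.

```python
import xml.etree.ElementTree as ET

# Hypothetical entries, as they might look after being read back from a
# directory that was populated by a 1-to-1 structural mapping.
stored = {
    "id": 193,
    "name": "Book",
    "attributes": {"Title": "A Tale of Two Cities"},
    "children": [
        {"id": 194, "name": "Author", "attributes": {}, "children": []},
        {"id": 195, "name": "Chapter", "attributes": {"Number": "1"},
         "children": []},
    ],
}


def rebuild(entry, parent=None):
    """Convert one entry back to an XML element, preserving the hierarchy."""
    if parent is None:
        element = ET.Element(entry["name"], entry["attributes"])
    else:
        element = ET.SubElement(parent, entry["name"], entry["attributes"])
    for child in entry["children"]:
        rebuild(child, element)  # children are attached under this element
    return element


root = rebuild(stored)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The numeric IDs are dropped during reassembly because they exist only to satisfy the directory and are not part of the original document.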
Therefore, structural mapping of XML documents to directories can be a slow and inefficient process, and may in turn lead to slow and inefficient searching for the XML document within the directory and slow and inefficient restoration of the directory entries back to the original XML document. It is therefore desirable to utilize a fast and efficient method for storing XML documents to directories that can be followed by fast and efficient searching and restoring.