A markup language represents the hierarchical structure of a document by marking up its content with “tags”. For example, the HyperText Markup Language (HTML) focuses on expressing the presentation of a document (i.e. its formatting and display), such as sections, subsections, sidebars and images, by attaching corresponding tags to individual words, paragraphs etc. Markup languages allow documents to be exchanged among applications. For example, HTML documents are widely used in the request-response protocol of what is called the World Wide Web (i.e. the Internet with the HyperText Transfer Protocol (HTTP) suite). Normally, when an HTTP server in the Web receives an HTTP request, it sends an HTTP response with an embedded HTML document to the HTTP client (“browser”) from which the request originates. The browser processes the HTML document and displays it on a graphical user interface according to the presentation information contained in the document.
HTML has proven quite limiting because it focuses on the presentation of documents rather than on representing their internal structure, and because it has a fixed set of tags. Extensible markup languages can solve this problem of HTML's limited flexibility. Under the auspices of the World Wide Web Consortium (W3C), a consortium that creates de facto standards for the Web, a new markup language called XML was defined in 1996 (see: “Extensible Markup Language (XML) 1.0”, W3C Recommendation, Feb. 10, 1998, http://www.w3.org/TR/REC-xml). With XML, a user can define his own set of tags. The definition of these tags can be contained in a Document Type Definition, or DTD. One of the validity constraints specified in the XML 1.0 Recommendation is that all tags of a document must be defined in the DTD that the document references.
In the beginning, XML was thought of mainly as a language for metacontent. Metacontent is information about a document's contents, such as its title, author, revision history, keywords, and so on. Metacontent can be used, for example, for searching, information filtering and document management. Another interesting application relates to databases. If data is delivered as a document that preserves the original information, such as column names and data types, as is the case with XML, it can be used for purposes other than just display on the screen, for example for computation. A further application area of XML is messaging, i.e. the exchange of messages between organizations (B2B messaging) or between application systems within an organization. These applications are described, for example, in H. Maruyama et al.: XML and Java, 1999, pages 13 to 30.
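The database case can be illustrated with a short Python sketch. It assumes a hypothetical document layout in which each element name carries the column name and a `type` attribute carries the data type; the element names, the attribute, and the sample values are illustrative only and are not taken from Maruyama et al. Because the names and types survive in the document, the receiving program can compute with the values instead of merely displaying them.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical document: column names are element names, data types are
# given by a "type" attribute. None of this layout comes from the source.
DATA = """<table name="orders">
  <row>
    <item type="string">widget</item>
    <quantity type="integer">3</quantity>
    <price type="decimal">2.50</price>
  </row>
  <row>
    <item type="string">gadget</item>
    <quantity type="integer">2</quantity>
    <price type="decimal">10.00</price>
  </row>
</table>"""

# Map each declared data type to a Python constructor.
CONVERTERS = {"string": str, "integer": int, "decimal": Decimal}


def load_rows(text):
    """Return one dict per <row>, keyed by column name, with typed values."""
    root = ET.fromstring(text)
    rows = []
    for row in root.findall("row"):
        rows.append(
            {cell.tag: CONVERTERS[cell.get("type")](cell.text) for cell in row}
        )
    return rows


rows = load_rows(DATA)
# The preserved types make arithmetic possible without guessing formats.
total = sum(r["quantity"] * r["price"] for r in rows)
print(total)
```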
It has also been proposed that the information contained in an XML document be used as processing instructions for certain procedures. For instance, Maruyama et al. give an example of how XML can be used to interface with a database:
An SQL (Structured Query Language) query is embedded in an XML document. When the XML document is parsed by an XML processor, a DOM (Document Object Model) representation of the XML document is generated in the memory of the processing computer system. DOM is an application programming interface (API) for XML documents (see J. Robie (Ed.): What is the Document Object Model?, REC-DOM-Level-1-19981001, http://www.w3.org/TR/REC-DOM-Level-1/introduction.html and M. Champion et al. (Eds.): Document Object Model (Core) Level 1, REC-DOM-Level-1-19981001, http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html). When using DOM, the tag structure of an XML document is converted into a tree-like memory structure, called a “DOM tree”. After the parsing of the XML document has been completed and the corresponding DOM tree has been generated in memory, the DOM tree is processed by visiting all nodes of the tree. When the node containing the SQL query is visited, the query is executed, i.e. the database is accessed (H. Maruyama et al., pages 97-141, 185-228).
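The two-phase procedure described above (first parse the whole document into a DOM tree, then visit every node and execute the query when its node is reached) can be sketched in Python as follows. This is a minimal illustration, not the implementation given by Maruyama et al.: the element names `report` and `query`, the `employee` table, and the use of an in-memory SQLite database are all assumptions made for the example.

```python
import sqlite3
import xml.dom.minidom

# Hypothetical document: the element names <report> and <query> are
# illustrative and are not taken from Maruyama et al.
DOCUMENT = """<report>
  <title>Employees</title>
  <query>SELECT name FROM employee ORDER BY name</query>
</report>"""


def walk(node, connection, results):
    """Visit every node of the DOM tree depth-first, in document order."""
    if node.nodeType == node.ELEMENT_NODE and node.tagName == "query":
        # Collect the text children of <query> to obtain the SQL string.
        sql = "".join(
            child.data
            for child in node.childNodes
            if child.nodeType == child.TEXT_NODE
        )
        # Executing the query is the point at which the database is accessed.
        results.extend(connection.execute(sql.strip()).fetchall())
    for child in node.childNodes:
        walk(child, connection, results)


def run(document_text):
    # An in-memory SQLite database stands in for the real database.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE employee (name TEXT)")
    connection.executemany(
        "INSERT INTO employee (name) VALUES (?)", [("Bob",), ("Alice",)]
    )
    # Phase 1: parse the complete document into a DOM tree in memory.
    dom = xml.dom.minidom.parseString(document_text)
    # Phase 2: only after parsing has finished, traverse the tree.
    results = []
    walk(dom.documentElement, connection, results)
    connection.close()
    return results


rows = run(DOCUMENT)
print(rows)
```

Note that nothing is executed until the entire document has been parsed and the full tree is held in memory, which is the characteristic of the approach that the text describes.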
A similar example of how XML documents are processed is the JavaServer Pages technology by Sun Microsystems, Inc. (see E. Pelegrí-Llopart et al.: JavaServer Pages Specification, version 1.1, Nov. 30, 1999). (JavaServer Pages is a trademark of Sun Microsystems, Inc.) The JavaServer Pages (JSP) technology enables the authoring of Web pages that create dynamic content. A JSP page is a text-based document that describes how to process a request to create a response. JSP pages are compiled to what are called servlets, which respond to HTTP requests. An input received from HTTP POST or QUERY arguments can be in the form of an XML document. Such an input is first parsed by an XML parser, and a corresponding DOM tree is generated in memory. Then, JavaBeans components, Enterprise JavaBeans components, or custom actions can be invoked (see Pelegrí-Llopart et al., page 31). This corresponds in principle to the method for interfacing with databases proposed by Maruyama et al.