Networks and networked applications have grown dramatically in number, size and complexity over the past decade. While the Internet is the most prominent example, internal LANs (Intranets) and distributed computing are also part of this growth. By definition, all networked applications need to send and receive information over a network, often communicating with other applications. The great variety of formats in existence makes integration of applications and data sources a difficult and expensive problem. Current data encoding standards are constantly replaced by newer technologies, further complicating the problem of providing connectivity between network nodes. From bit-encodings of low-level network transport protocols to HTML and XML, the problem of data and protocol translation is a complex and difficult one, because of the need to provide both high flexibility and high performance.
One of the more recent data encoding formats enjoying wide adoption, especially on the Internet, has been XML, a part of the SGML family of document description languages.
The World Wide Web (“Web”), a proliferation of interconnected sites or domains, was initially developed largely using the document description language known as HyperText Markup Language (HTML). HTML was used predominantly to specify the structure of Web documents or “pages” using logical terms. HTML, however, has inherent limitations for Web document design, primarily resulting from the limited, predefined tags available in HTML to define elements in a document or page. Nonetheless, HTML-defined documents continue to exist in significant quantities on the Web.
eXtensible Markup Language (XML) was developed as a document format protocol or language for the Web that is more flexible than HTML. XML allows the tags used to define elements of a page or document to be flexibly defined by the developer of the page. Thus Web pages can be designed to effectively function like database records with selectively defined tags representing data items for specific applications (e.g., product code, department, price in the context of a purchase order or invoice document or page).
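As a rough illustration of developer-defined tags acting like database fields, the following sketch parses a small purchase-order document with Python's standard `xml.etree.ElementTree` module. The element names (`purchaseOrder`, `item`, `productCode`, `department`, `price`) and the sample values are hypothetical and chosen only to mirror the example above; they do not come from any particular e-business vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase-order document; tag names are invented for illustration.
ORDER_XML = """
<purchaseOrder>
  <item>
    <productCode>A-1001</productCode>
    <department>Hardware</department>
    <price currency="USD">19.95</price>
  </item>
  <item>
    <productCode>B-2002</productCode>
    <department>Garden</department>
    <price currency="USD">5.50</price>
  </item>
</purchaseOrder>
""".strip()


def read_items(xml_text):
    """Return each <item> as a dict, treating the child tags as field names."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall("item"):
        items.append({
            "productCode": item.findtext("productCode"),
            "department": item.findtext("department"),
            "price": float(item.findtext("price")),
        })
    return items


if __name__ == "__main__":
    for record in read_items(ORDER_XML):
        print(record)
```

Because the tags name the data items directly, the receiving application can address each field by name rather than by position in the document.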
In the world of Web content, the use of XML is growing as it becomes the preferred data format in both business-to-business (B2B) and business-to-consumer (B2C) Web commerce sectors (e-business). The tremendous and continuing growth of XML in B2B applications has led to a great number of different XML e-business vocabularies and schemas. There are standardization efforts driven by industry associations, consortia, governments, academia and even the United Nations. Merely storing or transmitting e-business data “in XML” is not a guarantee of interoperability between e-business commercial entities or sites. Even the method of specifying a particular structure for an XML document has not been agreed upon, with several incompatible methods in wide use. It is therefore necessary to perform conversions between different XML formats to achieve server-to-server transfer of invoices, purchase orders and other business data in the e-business context. The problem of interoperability is exacerbated by the commingling of XML and HTML e-business sites on the Web.
Successful B2B and B2C sites are being called upon to support a greater variety of clients and client protocols. That is, sites must be accessible by different browsers running on clients, e.g. Netscape or Internet Explorer, and by different versions of these (and other) browsers. Additionally, the nature of clients and client protocols is changing and adding to the problem of interoperability. Different clients, in the form of Personal Digital Assistants (PDAs) and WAP (Wireless Application Protocol) enabled cellular phones, process XML content but need to convert it to different versions of HTML and WAP to ensure a broad and seamless reach across all kinds of web clients, from phones to powerful Unix workstations. As the diversity of web-connected devices grows, so grows the need to provide dynamic conversion, such as XML-to-HTML and XML-to-WAP, for e-business applications.
The World Wide Web Consortium has defined eXtensible Stylesheet Language (XSL) as a standard method for addressing both XML-HTML and XML-XML conversions. There are several freely available and commercial XSL processor implementations for Java and C/C++ e-business applications. However, standards-compliance, stability and performance vary widely across implementations. Additionally, even the fastest current implementations are much slower than necessary to meet the throughput requirements for either B2C or B2B applications. The great flexibility provided by XML encoding generally means that such conversions are complex and time-consuming.
The XSL World Wide Web Consortium Recommendation, which addresses the need to transform data from one XML format into another or from an XML format into an HTML or other “output” format, as currently specified includes three major components in an XSL processor: an XSL transformation engine (XSLT), a node selection and query module (XPath), and a formatting and end-user presentation layer specification (Formatting Objects). XML-to-XML data translation is primarily concerned with the first two modules, while the Formatting Objects are most important for XML-to-HTML or XML-to-PDF document rendering. A typical XSL implementation comprises a parser for the transform, a parser for the source data, and an output stream generator, which are three distinct processes. Known XSL transformation engines (XSLT) typically rely on recursive processing of trees of nodes, where every XML element, attribute or text segment is represented as a node. Because of this, implementations suggested in the prior art can only optimize the transformation algorithms, and their performance necessarily remains limited.
An XSL stylesheet is itself an XML file that contains a number of template-based processing instructions. The XSLT processor parses the stylesheet file and applies any templates that match the input data. It operates by conditionally selecting nodes in an input tree, transforming them in a number of ways, copying them to an output tree and/or creating new nodes in the output tree. Known XSLT implementations suffer from severe performance limitations. While suitable for Java applets or small-scale projects, they are not yet fit to become part of the infrastructure. Benchmarks of the most popular XSLT processors show that throughput of 10-150 kilobytes/second is typical. This is 10 times slower than an average diskette drive and roughly equivalent to a 128 Kbit/s ISDN line. Many websites today have sustained bandwidths at or above T1 speeds (1500 Kbit/s) and the largest ones require 100 Mbit/s or faster connections to the Internet backbone. Clearly, unless XSLT processing is to become the chief performance barrier in B2C and B2B operations, its performance has to improve by orders of magnitude.
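To make the size of this gap concrete, the cited figures can be put into the same unit. The short calculation below is only a back-of-the-envelope sketch: it assumes 8 bits per byte and decimal prefixes (1 Mbit = 1000 Kbit), and the function names are hypothetical.

```python
def kbytes_to_kbits(kbytes_per_s):
    """Convert kilobytes/second to kilobits/second (assumes 8 bits per byte)."""
    return kbytes_per_s * 8


def shortfall(link_kbit_per_s, xslt_kbytes_per_s):
    """Return how many times faster the link is than the XSLT processor."""
    return link_kbit_per_s / kbytes_to_kbits(xslt_kbytes_per_s)


# Figures quoted in the text above.
XSLT_RANGE_KB = (10, 150)      # typical XSLT throughput, kilobytes/second
T1_KBIT = 1500                 # T1 line, Kbit/s
BACKBONE_KBIT = 100 * 1000     # 100 Mbit/s, assuming 1 Mbit = 1000 Kbit

if __name__ == "__main__":
    for kb in XSLT_RANGE_KB:
        print(kb, "KB/s =", kbytes_to_kbits(kb), "Kbit/s;",
              "T1 is", shortfall(T1_KBIT, kb), "times faster;",
              "100 Mbit/s is", shortfall(BACKBONE_KBIT, kb), "times faster")
```

Under these assumptions the quoted range corresponds to 80 to 1200 Kbit/s, which falls short of a T1 line by a factor of 1.25 to 18.75 and of a 100 Mbit/s connection by a factor of roughly 83 to 1250.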
There are a number of reasons for such poor performance. To transform one XML vocabulary to another, the processor must parse the transform, parse the source data, walk the two parse trees to apply the transform and finally output the data into a stream. Some of the better implementations allow the transform to be parsed as a separate step, thereby avoiding the need to repeat that step for every document or data record to be processed by the same transform. However, the transformation step is extremely expensive and consumes an overwhelming portion of processing time. Because XSLT relies on recursive processing of trees of nodes, where every XML element, attribute or text segment is represented as a node, merely optimizing the implementation of the algorithms cannot attain the necessary results. Thus current state-of-the-art XSLT implementations have to sacrifice performance in order to maintain the flexibility that is the very essence of XSLT and XML itself. So while XML and XSLT offer greater flexibility than older data interchange systems through the use of direct translation, self-describing data and dynamic transformation stylesheets, this flexibility comes with a great performance penalty.
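The parse, walk and output steps described above can be sketched in a few lines. The following is not an XSLT processor; it is a deliberately simplified stand-in, written with Python's standard `xml.etree.ElementTree`, in which a "template" is merely a hypothetical mapping from a source tag name to an output tag name. It is meant only to show the recursive, node-by-node construction of an output tree that the text identifies as the expensive step.

```python
import xml.etree.ElementTree as ET

# Hypothetical "templates": source tag name -> output tag name.
# Real XSLT templates are far richer; this mapping is an illustrative assumption.
TEMPLATES = {"order": "invoice", "sku": "productCode", "cost": "price"}


def transform(node, templates):
    """Recursively build an output node for `node` and all of its descendants."""
    out = ET.Element(templates.get(node.tag, node.tag), dict(node.attrib))
    out.text = node.text
    out.tail = node.tail
    for child in node:
        # One recursive call per element in the source tree.
        out.append(transform(child, templates))
    return out


def run(source_text, templates):
    source_tree = ET.fromstring(source_text)              # parse the source data
    output_tree = transform(source_tree, templates)       # walk and transform
    return ET.tostring(output_tree, encoding="unicode")   # write the output stream


if __name__ == "__main__":
    print(run("<order><sku>A-1</sku><cost>3</cost></order>", TEMPLATES))
```

Even in this toy form, every node of the source is visited and a corresponding output node is allocated, so the cost grows with the size of the whole tree regardless of how little of it the transform actually changes.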
Other known transformation or translation solutions implement “middleware” translation mechanisms. As represented in FIG. 1, in the middleware solution of the prior art, a large number of different platforms A-F, 101, 107 each may be arranged to communicate with each other. Each platform implements a format translator 103 to convert data streams between the local platform 101 and an agreed or common middleware format Z. The data stream in format Z can then be exchanged with any other node in the network. Each receiving node 107 then uses its own platform specific translator 105 to convert the data streams into a format preferred by the receiving node. Disadvantageously, such solutions require platform specific static drivers for each format. Conversion is laboriously performed by converting from the first platform format or protocol (A) to the common middleware format (Z) and then converting from the middleware format to the second platform protocol. In addition to the deficiencies in terms of time to effect such conversions, if formats change there is a need to stop or interrupt platform operations and install modified drivers in accordance with the format change(s). So while performance is often better than that of XML/XSLT solutions, flexibility is almost non-existent; performance is also considerably worse than that possible by using direct translation operating on the same formats.
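The two-hop conversion through a common format can be sketched as follows. The concrete formats are assumptions made purely for illustration: platform A is taken to emit comma-separated `code,price` lines, platform B to expect `key=value` pairs, and the common format Z is modeled as a Python dict. None of these details come from FIG. 1.

```python
def a_to_z(line):
    """Hypothetical platform A driver: 'code,price' text -> common format Z."""
    code, price = line.split(",")
    return {"code": code, "price": float(price)}


def z_to_b(record):
    """Hypothetical platform B driver: common format Z -> 'key=value' text."""
    return "code={code};price={price:.2f}".format(**record)


# Static, platform-specific drivers installed on each node.
TO_Z = {"A": a_to_z}
FROM_Z = {"B": z_to_b}


def translate(data, source, target):
    """Convert via the common format: source -> Z, then Z -> target."""
    common = TO_Z[source](data)      # first conversion
    return FROM_Z[target](common)    # second conversion


if __name__ == "__main__":
    print(translate("A-1001,19.95", "A", "B"))
```

Every message is converted twice, and the drivers are fixed code: if platform A's format changes, `a_to_z` has to be rewritten and reinstalled, which is the inflexibility described above.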
Direct translation between two different formats or, more generally, two different protocols is the oldest method of achieving data interchange. By writing custom computer source code that is later compiled and installed on the target platform, it is possible to achieve interoperability between two different data formats. If the source code is carefully tuned by someone very skilled in the art, the resulting translator will be a high-performance one. However, it will not work if any change in data format or protocol occurs, and will require additional programming and installation effort to adapt to any such change. Direct translation can offer excellent performance, but it is even less flexible than the static adapters used by “middleware” systems.
Instead of a static adapter or custom-coded direct translator, it is the use of some kind of data or protocol description that can offer greater flexibility and, thereby, connectivity. U.S. Pat. No. 5,826,017 to Holzmann (the Holzmann implementation) generically describes a known apparatus and method for communicating data between elements of a distributed system using a general protocol. The apparatus and method employ protocol descriptions written in a device-independent protocol description language. A protocol interpretation means or protocol description language interpreter executes a protocol to interpret the protocol description. Each entity in a network must include a protocol apparatus that enables communication via a general protocol for any protocol for which there is a protocol description. The general protocol includes a first general protocol message which includes a protocol description for a specific protocol. The protocol apparatus at a respective entity or node in a network which receives the first protocol message employs a protocol description language interpreter to interpret the included protocol description and thereby execute the specific protocol.
Again, disadvantageously, the Holzmann implementation requires a protocol apparatus at each networked entity to interpret the protocol description. That is, the implementation is “node-centric” in that each node requires and depends on a respective translation function to a predetermined and fixed target format. Clearly, if one has the ability to equip every node in the network with a protocol interpreter such as the one described, one could conceivably equip every node in the network with a much simpler standard protocol stack to enable communication. On vast global networks, such as the Internet, it is practically impossible to change all network nodes over to a new protocol or data format—and this in turn drives the need for data interchange methods and devices.
Additionally, the implementation involves interpretation of protocol descriptions, which is a very resource-consuming process. The trade-off of Holzmann is quite similar to that made by XML/XSLT: by using self-describing data packets and a generalized interpreter, the implementation sacrifices a great deal of performance to achieve better flexibility and interoperability. Also, Holzmann does not address the needs of next-generation Layer 6 and Layer 7 protocols (such as those based on XML-encoded data) for protocol translation, dealing instead with lower-level (Layer 3) protocols only.
The existing solutions to the general problem of data exchange between disparate systems and enabling connectivity between networked applications provide either performance or flexibility, but never both.
Further disadvantages of the existing solutions include the fact that their performance is limited by the requirements of static interpretation between limited sets of static constructs. The higher the performance of the typical interpreter, the less flexibility its designers permit in the specifications of the formats. Also, even where the prior art has made provisions for adapting a format specification to changes, only one side of a specification can be changed while the other remains fixed. However, this generates a further disadvantage since it creates a “node-centric” system requiring all nodes to be changed in order to accommodate each new format specification. In addition, the typical data translators that operate as interpreters are relegated to the more stable protocols in the lower layers of the OSI model, thus severely limiting their usefulness in a rapidly changing environment.
Other disadvantages occur from the potential size of XML encoded data, a limitation that applies to all types of markup language encoded data including SGML, HTML and their derivatives. In the case of XML transforms, for example in XSLT/XQuery/XPath, the process takes as input one XML document and translates it into another. Although it is designed to work with XML, in effect it is a generic tree-based transformation description.
There are two general techniques for processing XML data. The first technique, often called the DOM (Document Object Model) method, reads all of the XML data and forms the corresponding tree in memory (commonly referred to as a DOM tree). After building the tree, the application program processes the data. The processing will consist of one or more traversals of the data in arbitrary order. As it performs these traversals, the application program generates the output in the form described by the XSLT/XQuery/XPath data map. Typical examples of output are XML data, HTML documents, or text.
The DOM model has the advantage of generality and speed. However, it suffers from the requirement that the entire XML file must be consumed and stored in memory as a DOM tree before output can start being produced (alternatively, before schema validation can occur). XML encoded data is already bulky, and the need to consume an entire file puts great strains on memory consumption. As an example, a data file with XML encoded yellow page entries that is to be converted into HTML or SGML format would require a processor to consume and store in memory the entire XML data file before any output could be generated, even though the DOM tree may not have a very large number of levels.
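A minimal sketch of the DOM approach applied to the yellow-page example follows, using Python's standard `xml.dom.minidom` module. The element names (`listings`, `entry`, `name`, `phone`) and the sample data are hypothetical. Note that the tree here is only two levels deep, yet the entire document must be parsed before the first line of output can be produced.

```python
from xml.dom import minidom

# Hypothetical yellow-page data; element names are invented for illustration.
LISTINGS_XML = (
    "<listings>"
    "<entry><name>Ace Plumbing</name><phone>555-0100</phone></entry>"
    "<entry><name>Best Bakery</name><phone>555-0101</phone></entry>"
    "</listings>"
)


def field(entry, tag):
    """Return the text of the first <tag> child element of `entry`."""
    return entry.getElementsByTagName(tag)[0].firstChild.data


def listings_to_html_dom(xml_text):
    # The whole document is read and held in memory as a DOM tree here,
    # before any output is generated.
    doc = minidom.parseString(xml_text)
    rows = []
    for entry in doc.getElementsByTagName("entry"):
        rows.append("<li>{} ({})</li>".format(field(entry, "name"),
                                              field(entry, "phone")))
    return "<ul>" + "".join(rows) + "</ul>"


if __name__ == "__main__":
    print(listings_to_html_dom(LISTINGS_XML))
```

With two entries the memory cost is negligible, but the same code applied to a file with millions of entries would hold every one of them in memory at once.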
The alternative method, called the SAX model, consists of combining the reading of the data with the processing. This is called streaming. The basic idea is that, instead of reading the entire input into memory before beginning the transform, the input is processed as it is parsed. As each node is read, the processor processes the data and then typically deletes the storage for that data. In other words, the tree is not built. Instead, the data is processed on-the-fly. When possible, this technique has the advantages of using less memory and lower latency. The memory savings come from the fact that one need not store the entire tree at any one time, but only pieces. The improved latency is due to the fact that output can be produced immediately rather than waiting for the entire tree to be parsed. The XML file may be huge, yet processing it uses only a limited amount of memory.
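The streaming idea can be sketched with Python's standard `xml.sax` module. As before, the element names (`listings`, `entry`, `name`, `phone`) and the `emit` callback are hypothetical choices for illustration. The handler keeps only the entry currently being read and hands each output line to `emit` as soon as that entry closes.

```python
import xml.sax

# Hypothetical yellow-page data; element names are invented for illustration.
LISTINGS_XML = (
    "<listings>"
    "<entry><name>Ace Plumbing</name><phone>555-0100</phone></entry>"
    "<entry><name>Best Bakery</name><phone>555-0101</phone></entry>"
    "</listings>"
)


class EntryHandler(xml.sax.ContentHandler):
    """Emit one HTML list item per <entry> without building a tree."""

    def __init__(self, emit):
        super().__init__()
        self.emit = emit      # callback that receives each finished output line
        self.field = None     # name of the field currently being read, if any
        self.buffer = []      # text chunks of the current field
        self.current = {}     # fields of the entry currently being read

    def startElement(self, name, attrs):
        if name in ("name", "phone"):
            self.field = name
            self.buffer = []

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate them.
        if self.field is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name == self.field:
            self.current[name] = "".join(self.buffer)
            self.field = None
        elif name == "entry":
            self.emit("<li>{} ({})</li>".format(self.current["name"],
                                                self.current["phone"]))
            self.current = {}  # discard the entry's storage once it is output


def stream_listings(xml_text, emit):
    xml.sax.parseString(xml_text.encode("utf-8"), EntryHandler(emit))


if __name__ == "__main__":
    stream_listings(LISTINGS_XML, print)
```

The cost of this approach is visible in the handler itself: the transformation logic is hand-written state tracking in a general-purpose language rather than a declarative description of the desired output.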
However, at the present time there are not many ways to produce a streaming XML transform. Generally it is done by handwriting code in a general purpose programming language such as Java, using an API similar to SAX. This is tedious and error prone; the problems of writing XML transformations in general purpose programming languages are what led to XSLT's widespread adoption in the first place. There have been some efforts to design transform descriptions similar to XSLT that are specially crafted for writing streaming programs. STX is one example. However, these languages are unfinished and none has acquired a widespread user base as of yet. Most significantly, they require network administrators or application developers to rewrite their existing applications in a different language to take advantage of possible benefits of a streaming processing model.
Accordingly, there is a need to allow streaming XML transforms without losing the advantages offered by the wide adoption and the convenient tree model used by XSLT/XPath/XQuery and related languages.