The current business environment is very different from what it was just a few years ago. Today's organizations embrace the global marketplace, and this dictates a need to operate efficiently at all times. Customers are now more sophisticated, which translates into an accelerated pace of business and decision-making processes. Further, business relationships have become highly dynamic, and customers expect businesses to adapt quickly.
Technical and operational challenges abound as well. There is a need to support multiple applications on a variety of platforms, and to integrate with other companies using the Internet, extranets, business-to-business (B2B) exchanges, and other resources.
Businesses have typically used a variety of mechanisms to control and analyze business operations such as accounting, payroll, human resources, employee tracking, customer relations tracking, etc. Tools which provide these functions are often implemented using computer software. For example, a software package may manage business accounting, another software package might be responsible for receiving new orders, yet another software package will track warehouse inventory and still another package may handle order fulfillment and shipment. In another example, a business software package operated by one business will need to exchange data with a software package operated by another business to allow a business-to-business transaction to occur.
When business tools are implemented in software, it is not unusual for proprietary software packages to be responsible for each individual business task. However, this implementation is cumbersome and requires the same data to be entered in differing formats among the various business applications. In order to improve efficiency, integration applications have been developed which are used to integrate various elements of one business application with elements of another business application.
For example, if a software package used to obtain new orders includes data fields (or “entries”) referred to as CustomerNameFirst and CustomerNameLast, it is a relatively straightforward process to map those entries to an accounting software program having the data fields BillingAddressFirst and BillingAddressLast, respectively. In such an integration system, the relationship between entities in one system (i.e., computer system or application) and entities in another system can be stored in tables. A system administrator can configure entity mapping between the systems by selecting between the various entities of the two systems.
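The table-driven mapping described above might be sketched as follows. This is only an illustration: the dictionary `FIELD_MAP`, the helper `map_record`, and the sample record are hypothetical and are not taken from any particular integration product.

```python
# Hypothetical mapping table: order-entry field name -> accounting field name.
FIELD_MAP = {
    "CustomerNameFirst": "BillingAddressFirst",
    "CustomerNameLast": "BillingAddressLast",
}


def map_record(record, field_map):
    """Rename the keys of `record` according to `field_map`.

    Fields with no entry in the mapping table are dropped.
    """
    return {field_map[key]: value for key, value in record.items() if key in field_map}


# Sample record from the (assumed) order-entry system.
order = {"CustomerNameFirst": "Ada", "CustomerNameLast": "Lovelace", "OrderId": 17}
billing = map_record(order, FIELD_MAP)
```

Because the mapping lives in data rather than in code, an administrator could change it without modifying either application, which is the property the paragraph above relies on.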
An integration server needs to deal with various types of messages. One type of message is based on the standard XML format. XML messages typically have to be transformed into a format that line-of-business (LOB) applications can understand and thus process. As a transformation standard recommended by the World Wide Web Consortium (W3C), the Extensible Stylesheet Language for Transformations (XSLT) plays an integral role in integration scenarios whose input is XML messages. The XML Path Language (XPath), also a W3C standard, provides a syntax to describe and retrieve parts of XML messages.
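As a small illustration of how an XPath-style expression retrieves parts of an XML message, the sketch below uses Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath and no XSLT. The message content and element names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical order message; the element and attribute names are assumptions.
MESSAGE = """<order id="17">
  <customer><first>Ada</first><last>Lovelace</last></customer>
  <item sku="A1" qty="2"/>
  <item sku="B7" qty="1"/>
</order>"""

root = ET.fromstring(MESSAGE)

# Path expression: select the text of the <last> child of <customer>.
last_name = root.findtext("./customer/last")

# Predicate expression: select every <item> that carries a qty attribute.
skus = [item.get("sku") for item in root.findall("./item[@qty]")]
```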
Any platform that supports XML may also require an XSLT transformation and XPath node-set selection engine. An XPath node is one of the components that form an XML document. One typical way of handling an incoming XML message is to load it in its entirety into memory, maintaining the relationships between the nodes in the message, and then to process the in-memory representation. However, large messages cannot necessarily be loaded into memory all at once. Messages with a large memory footprint are likely to degrade the performance of the system, if not halt it. This may result in an end user seeing an out-of-memory error or in messages not being processed correctly.
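One common way to avoid holding a whole message in memory is to parse it incrementally. The sketch below uses `xml.etree.ElementTree.iterparse` from the Python standard library as a stand-in for such a streaming reader; the function name `count_items_streaming` and the synthetic document are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET


def count_items_streaming(stream):
    """Count <item> elements while parsing incrementally instead of loading a full tree."""
    count = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "item":
            count += 1
            # Release the element's children, attributes, and text once it is handled.
            # A fuller implementation would also detach it from its parent.
            elem.clear()
    return count


# Synthetic message with 1000 items, built only for this illustration.
doc = b"<orders>" + b"".join(b'<item sku="%d"/>' % i for i in range(1000)) + b"</orders>"
total = count_items_streaming(io.BytesIO(doc))
```

Forward-only processing of this kind keeps memory use low, but it gives up the ability to move freely between nodes.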
XPath queries, however, fundamentally require navigation capabilities inside the source document (e.g., from a node, move to the parent node, to a sibling node, or to a child node). These operations are easily implemented in memory, where the respective nodes are linked using memory pointers. In serialized XML source documents, by contrast, such navigation is very inefficient because records do not have fixed sizes, and nodes in the serialized XML messages do not contain pointers to their parents or siblings.
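The navigation steps mentioned above (to a parent, a sibling, or a child) can be illustrated on an in-memory tree. In this sketch, Python's `xml.etree.ElementTree` keeps child references only, so the parent links are rebuilt as an explicit dictionary; the names `parent_of` and the sample message are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical message, fully loaded into memory.
root = ET.fromstring(
    "<order><customer><first>Ada</first><last>Lovelace</last></customer>"
    "<item sku='A1'/></order>"
)

# Build child -> parent links by walking every element once.
parent_of = {child: parent for parent in root.iter() for child in parent}

first = root.find("./customer/first")  # move down to a descendant node
parent = parent_of[first]              # move up to the parent node
siblings = list(parent)                # children of the parent, in document order
next_sibling = siblings[siblings.index(first) + 1]  # move to the following sibling
```

Each step is a dictionary lookup or list access here. In a serialized byte stream, the same moves would require rescanning the text, since node boundaries are not known in advance.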