So that a client process can access and manipulate data, the client may request that a program (henceforth called a Data Access Service or DAS) convert the data into a hierarchical (or other graph) structure. WO0221291 discusses conversion of certain HTML files into tree structures such that the information contained within the tree structures may be used by other application programs. WO2004068320 discusses the conversion of HTML source into a tree structure such that the tree structure can be manipulated and transformed into a simplified HTML document.
The conversion of other data formats (such as XML) into hierarchical format for the purpose of data manipulation is also known. U.S. Pat. No. 6,785,685, for example, describes the parsing of XML data in order to build a DOM tree from which a dynamic data object (DDO) or extended data object (XDO) can be constructed. US2003041077 is another example of a patent that discusses the conversion of a source document into hierarchical form in order that the data contained within can be referenced. Other graph structures are known, such as SDO and Microsoft's ADO (Microsoft is a registered trademark of Microsoft Corporation in the United States, other countries, or both).
A simplified overview of the processing required to construct and access a graph is illustrated by FIG. 1a. Client 10 requests that DAS 20 make a graph 30 out of data “data”. In the case of XML data, this may be achieved by a “Simple API for XML” (SAX) parser parsing the XML data to create events which DAS 20 can then use to build the graph 30.
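By way of illustration only, the eager construction of FIG. 1a might be sketched as follows using the SAX parser in the Python standard library. The names Node, GraphBuilder and build_graph are hypothetical and do not appear in the cited documents; the sketch assumes well-formed XML input and ignores attributes.

```python
import xml.sax


class Node:
    """One node of the graph: a name, text content, and child nodes."""

    def __init__(self, name):
        self.name = name
        self.text = ""
        self.children = []


class GraphBuilder(xml.sax.ContentHandler):
    """Hypothetical stand-in for the DAS: turns SAX events into Nodes."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # elements that are currently open

    def startElement(self, name, attrs):
        node = Node(name)
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node  # first event describes the root
        self._stack.append(node)

    def characters(self, content):
        if self._stack:
            self._stack[-1].text += content

    def endElement(self, name):
        self._stack.pop()


def build_graph(data):
    """Eagerly build the complete graph from XML text."""
    builder = GraphBuilder()
    xml.sax.parseString(data.encode("utf-8"), builder)
    return builder.root


graph = build_graph("<a><b><c>hello</c></b><d/></a>")
```

Every node is created before build_graph returns, whether or not the client later accesses it.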
As shown in FIG. 1b, when client 10 wishes to access a node within the graph 30, client 10 makes such a request (via an operation) to the graph itself. The operation traverses the existing graph according to the supplied path until the requested node is identified, and the node is then sent back to client 10 for manipulation. US2004193575 discusses modelling of an XML document as a tree of nodes and navigating the tree of nodes to address parts of the XML document, where a destination node is identified as a result of a path expression. The reader is also referred to the discussion of XPath at http://www.w3.org/TR/xpath.
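The traversal of FIG. 1b can be sketched, under simplifying assumptions, as a walk along a slash-separated path. This is not XPath; the Node class, the get_node function and the path syntax are hypothetical and are chosen only to show the principle of following a supplied path through an already built graph.

```python
class Node:
    """A graph node with a name and a list of child nodes."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def child(self, name):
        """Return the first child with the given name, or None."""
        for candidate in self.children:
            if candidate.name == name:
                return candidate
        return None


def get_node(root, path):
    """Follow a path such as "b/c" downwards from the root node."""
    node = root
    for step in path.split("/"):
        node = node.child(step)
        if node is None:
            return None  # the path does not exist in the graph
    return node


# A graph that has already been built in full (the eager case).
graph = Node("a", [Node("b", [Node("c")]), Node("d")])
found = get_node(graph, "b/c")
```

Here found refers to the node named “c”, which would be returned to the client for manipulation.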
FIGS. 1a and 1b address the situation where the complete graph is built immediately or “eagerly” when requested by the client. (For more detail, the reader should refer to ftp://www6.software.ibm.com/software/developer/library/j-commonj-sdowmt/ComSDO monj-SDO-Specification-v1.0.pdf.) This can, however, be processor intensive, and the effort is wasted when the client never accesses every node in the graph.
FIGS. 2a and 2b show a “lazy” solution. As before, client 50 requests that a DAS 60 convert some data “data” into a hierarchical format. A parser within the DAS parses the data to create an event pertaining to the root of the graph 70. The graph then builds the root node from this event and creates an instance of a store 80 containing a buffer 90. The graph's root node points to this store. The graph then adds the “data” into buffer 90.
Nodes are only built when a client specifically requests them. For example, FIG. 2b shows that client 50 issues a request for node “b/c”. This request is received by graph 70 which points to store 80 containing buffer 90. Store 80 parses the buffer to produce the events required by the graph in order to build the nodes in the path to the requested node. In the present case this produces graph 70′. Once the requested node has been created by the graph, this is then sent back to the client 50.
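A minimal sketch of this lazy arrangement is given below, assuming XML data and the incremental iterparse function of the Python standard library. The classes Store and LazyGraph, the nodes_built counter, and the choice to re-read the buffer from its start on every request are assumptions made for brevity; sibling elements are also assumed to have distinct names. None of this is taken from the EMF or Xerces implementations.

```python
import io
import xml.etree.ElementTree as ET


class Node:
    def __init__(self, name):
        self.name = name
        self.children = {}  # holds only the children built so far


class Store:
    """Holds the unparsed data in a buffer and yields parse events."""

    def __init__(self, data):
        self.buffer = data

    def events(self):
        source = io.BytesIO(self.buffer.encode("utf-8"))
        # iterparse yields events incrementally as the buffer is read.
        return iter(ET.iterparse(source, events=("start", "end")))


class LazyGraph:
    def __init__(self, data):
        self.store = Store(data)
        # Build only the root node, from the first parse event.
        _, element = next(self.store.events())
        self.root = Node(element.tag)
        self.nodes_built = 1

    def get(self, path):
        """Build the nodes on the path to the requested node, on demand."""
        wanted = [self.root.name] + path.split("/")
        open_tags = []
        for event, element in self.store.events():
            if event == "end":
                open_tags.pop()
                continue
            open_tags.append(element.tag)
            if open_tags == wanted:
                return self._materialise(wanted)  # stop parsing here
        return None

    def _materialise(self, names):
        node = self.root
        for name in names[1:]:
            if name not in node.children:
                node.children[name] = Node(name)
                self.nodes_built += 1
            node = node.children[name]
        return node


graph = LazyGraph("<a><b><c>x</c></b><d><e/></d></a>")
node = graph.get("b/c")
```

After the request for “b/c”, only the root, “b” and “c” exist as nodes; “d” and “e” remain unparsed text in the buffer.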
Thus better performance can be achieved by building the graph on demand rather than by expending processing power up front.
Use of a store to build a graph on demand is described in the EMF javadoc found at http://eclipse.org/emf/. The base technology is also described at http://xml.apache.org/xerces2-j/xni-config.html.
In certain circumstances, a client may require the data to be in a different format to that in which it is currently stored. Numerous patents/patent applications discuss the concept of data transformation. See for example US2002073119, WO0073941 and US2004025117.
Transformation of data can be achieved by a transformation engine. There are two logical operations a transformation engine might perform. The first is “transcription” (i.e. transcoding), in which the same logical information is expressed in a different “wire format”. In general a client would do this when it intends to forward the message to another agent. An example would be translating from English to French, or from XML to a legacy (or “cherished”) application format. The second is a logical transformation, which changes the logical meaning of the graph; for instance, it might involve changing routing information in a message.
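The distinction between the two operations can be illustrated with a small sketch in which a message is held as nested dictionaries. The message layout, the field names “routing” and “destination”, and the functions to_json, to_xml and reroute are all hypothetical, and JSON and XML merely stand in for two arbitrary wire formats.

```python
import copy
import json
import xml.etree.ElementTree as ET

# Hypothetical message graph: routing information plus a body.
message = {"routing": {"destination": "queueA"}, "body": {"text": "hello"}}


def to_json(graph):
    """Transcription: the same logical information in a JSON wire format."""
    return json.dumps(graph, sort_keys=True)


def to_xml(graph, root_name="message"):
    """Transcription: the same logical information in an XML wire format."""

    def build(parent, value):
        for key, child in value.items():
            element = ET.SubElement(parent, key)
            if isinstance(child, dict):
                build(element, child)
            else:
                element.text = str(child)

    root = ET.Element(root_name)
    build(root, graph)
    return ET.tostring(root, encoding="unicode")


def reroute(graph, destination):
    """Logical transformation: the meaning changes (new routing target)."""
    changed = copy.deepcopy(graph)
    changed["routing"]["destination"] = destination
    return changed


as_json = to_json(message)
as_xml = to_xml(message)
rerouted = reroute(message, "queueB")
```

The two transcriptions carry identical information in different encodings, whereas rerouted differs from message in what it means, independently of how either is later encoded.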
The present invention is particularly concerned with the process for achieving data transformation when the data to be transformed is constructed lazily.