1. Field of the Invention
The present invention relates to markup languages, and more specifically to a method and apparatus for transforming source data in a source markup language to target data in a target markup language.
2. Related Art
A markup language is a notation for representing text intermingled with markup instructions (commonly known as tags) that indicate the role of the text, for example, the text's structure (what the text signifies) or its presentation. The text whose role is specified by a tag is conveniently referred to as the content of the tag. A commonly used example of a markup language is the Extensible Markup Language (XML).
There are several markup languages, any of which may be used to represent the same information. Different markup languages provide different views of the same data/information by adding meaning to the way the information is coded and processed. Multiple markup languages exist for reasons such as historical evolution and the lack of common standards.
There is often a need to transform data (“source data”) in one markup language to data (“target data”) in another markup language. Such a need arises, for example, when an application requires data in a specific markup language. Accordingly, if the source data is present in a different markup language, the target data needs to be generated in a target markup language consistent with the requirements of the application designed to process the information.
Typically, a set of transformation rules is specified for mapping the source data in a source markup language to target data in a target markup language. A processor executes a set of instructions by which the source data is transformed into the target data based on the set of transformation rules. For example, Extensible Stylesheet Language (XSL) is one of several languages used to specify transformation rules to transform source XML to target XML or HTML.
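As an illustration only, the notion of a rule set that maps source tags to target tags can be sketched in a few lines of Python. The rule table, the tag names, and the function name below are hypothetical and are not taken from XSL or from any approach described herein; the sketch merely renames elements according to a lookup table.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set: source tag name -> target tag name.
RULES = {"book": "div", "title": "h1", "para": "p"}


def transform(source_xml: str, rules: dict) -> str:
    """Rename every element per the rule table; unlisted tags pass through."""
    root = ET.fromstring(source_xml)
    for elem in root.iter():  # iter() visits the root and all descendants
        elem.tag = rules.get(elem.tag, elem.tag)
    return ET.tostring(root, encoding="unicode")


SOURCE = "<book><title>Markup</title><para>Hello</para></book>"
TARGET = transform(SOURCE, RULES)
print(TARGET)  # <div><h1>Markup</h1><p>Hello</p></div>
```

A practical rule language such as XSL supports far richer rules (pattern matching, reordering, generated content); the table above stands in only for the simplest one-to-one mapping.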
Several prior approaches are used for transformation of source data to target data based on such transformation rules. In one prior approach, a processor generates a hierarchy of memory objects representing the entire source data sought to be transformed, and applies the set of transformation rules on the data in the memory objects to generate the target data. The memory objects are stored in a random access memory (RAM) and the hierarchy is often viewed as a Document Object Model (DOM), as is well known in the relevant arts.
One disadvantage with such an approach is that the RAM size requirement may be proportionate to the size of the source data (since the entire data is represented in the hierarchy), and thus the approach may not scale to transform source data of large size, particularly when the transformation needs to be performed quickly.
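The object-hierarchy approach described above may be sketched as follows, using the `xml.dom.minidom` module of the Python standard library. The rule table and function names are hypothetical, and attributes are ignored for brevity. The point of the sketch is that `parseString` builds the complete hierarchy in memory before any rule is applied.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape

# Hypothetical rule set: source tag name -> target tag name.
RULES = {"book": "div", "title": "h1", "para": "p"}


def render(node, rules):
    """Recursively emit target markup for one node of the in-memory hierarchy."""
    if node.nodeType == node.TEXT_NODE:
        return escape(node.data)
    if node.nodeType == node.ELEMENT_NODE:
        name = rules.get(node.tagName, node.tagName)
        inner = "".join(render(child, rules) for child in node.childNodes)
        return f"<{name}>{inner}</{name}>"
    return ""  # comments and other node types are dropped in this sketch


def transform_dom(source_xml, rules):
    # The entire source document is parsed into memory objects first.
    doc = minidom.parseString(source_xml)
    return render(doc.documentElement, rules)


DOM_RESULT = transform_dom(
    "<book><title>Markup</title><para>Hello</para></book>", RULES
)
print(DOM_RESULT)  # <div><h1>Markup</h1><p>Hello</p></div>
```

Because every element and text node of the source is held as an object at the same time, memory use grows with the size of the source data, which is the scaling concern noted above.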
In another prior approach, a processor reads the tags in the entire source data in a sequential manner (e.g., using Simple API for XML (SAX), described in further detail in the book entitled “SAX2” by David Brownell, published by O'Reilly with ISBN 0-596-00237-8) and applies the set of transformation rules on the tags. The memory requirements are reduced due to the sequential processing of the tags. However, the overall computational complexity (number of computations required) may be increased due to the sequential processing of the source tags, as is also well known in the relevant arts.
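The sequential approach may be sketched, again purely for illustration, with the `xml.sax` module of the Python standard library. The handler class, rule table, and function name are hypothetical, and attributes are ignored for brevity. Each tag is processed as it is encountered, and no hierarchy of the whole document is retained.

```python
import io
import xml.sax
from xml.sax.saxutils import escape

# Hypothetical rule set: source tag name -> target tag name.
RULES = {"book": "div", "title": "h1", "para": "p"}


class RenamingHandler(xml.sax.ContentHandler):
    """Writes target markup event by event; keeps no tree of the source."""

    def __init__(self, rules, out):
        super().__init__()
        self.rules = rules
        self.out = out

    def startElement(self, name, attrs):
        self.out.write(f"<{self.rules.get(name, name)}>")

    def endElement(self, name):
        self.out.write(f"</{self.rules.get(name, name)}>")

    def characters(self, content):
        self.out.write(escape(content))


def transform_sax(source_xml, rules):
    out = io.StringIO()
    xml.sax.parseString(source_xml.encode("utf-8"), RenamingHandler(rules, out))
    return out.getvalue()


SAX_RESULT = transform_sax(
    "<book><title>Markup</title><para>Hello</para></book>", RULES
)
print(SAX_RESULT)  # <div><h1>Markup</h1><p>Hello</p></div>
```

The handler sees only one event at a time, so a rule that depends on content appearing elsewhere in the source would require additional bookkeeping or repeated passes, which is one way the sequential approach can add computation.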
What is therefore needed is an approach that addresses one or more of the problems/requirements described above.
In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. The drawing in which an element first appears is indicated by the leftmost digit(s) in the corresponding reference number.