Typically, XML and XML-based markup languages are extensible. In some systems, the extensibility mechanism is based on the concept of known and unknown namespaces, which any agent that processes the markup can implement. Unknown namespaces can contain arbitrary extensions that a particular agent may or may not understand. When multiple agents are chained together to form a pipeline in which a stream of markup passes through each agent in turn, and the agents modify the stream in some way, this extensibility mechanism presents a challenging problem known as “markup preservation”: preserving the markup that any one particular agent may not understand.
In some systems, the sets of known and unknown namespaces can differ between the individual agents in a pipeline. In such a system, it becomes important for each agent to preserve the markup from any namespace it does not know, because a subsequent agent that understands that namespace may want to process its content.
Traditional processing agents for XML-based markup languages often choose to implement a document object model (DOM) tree. A DOM tree is a weakly typed structure that contains one node for each element tag found in the markup. Because every node is weakly typed, it is fairly straightforward for an agent to create nodes of unspecified type that hold markup for unknown namespaces, skip these nodes during processing, and then serialize them back to markup that is passed to the next agent.
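The weakly typed approach described above can be illustrated with a minimal sketch. It uses Python's standard xml.etree.ElementTree module as a stand-in for a DOM tree; the namespace URIs, the run_agent function, and the uppercasing step are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

KNOWN_NS = "urn:example:known"  # hypothetical namespace this agent understands


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


def run_agent(markup, known_namespaces):
    """Uppercase the text of elements in known namespaces; leave the rest alone."""
    root = ET.fromstring(markup)
    for element in root.iter():
        if namespace_of(element) not in known_namespaces:
            continue  # unknown node: skipped during processing, kept in the tree
        if element.text and element.text.strip():
            element.text = element.text.upper()
    # Every node, known or unknown, is serialized back for the next agent.
    return ET.tostring(root, encoding="unicode")


SOURCE = (
    '<k:doc xmlns:k="urn:example:known" xmlns:e="urn:example:ext">'
    "<k:title>hello</k:title>"
    "<e:note>keep me</e:note>"
    "</k:doc>"
)

OUTPUT = run_agent(SOURCE, {KNOWN_NS})
```

In this sketch the element from the urn:example:ext namespace survives the agent unchanged because the tree holds it as a generic node. ElementTree may rewrite namespace prefixes when serializing, but the namespace URIs and the element content are retained.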
In strongly typed environments, processing unknown namespaces is not as straightforward. More specifically, in strongly typed environments, it is difficult if not impossible to preserve markup associated with unknown namespaces because no definitive type can be assigned to that markup. A further complication can exist when, for example, XML parsers further process the markup by translating it into a different intermediate form, such as a binary form. In such situations, agents using the markup typically cannot modify this translation step.
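The difficulty can be illustrated with a minimal sketch in which every element must map to a fixed class. Python dataclasses stand in here for the types of a strongly typed environment; the Title and Document classes, the load and save functions, and the namespace URIs are hypothetical and are not taken from any particular system.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

TYPED_NS = "urn:example:known"  # hypothetical namespace with defined types


@dataclass
class Title:
    text: str


@dataclass
class Document:
    titles: List[Title] = field(default_factory=list)


def load(markup: str) -> Document:
    """Map markup onto typed objects; only known element types have a class."""
    root = ET.fromstring(markup)
    doc = Document()
    for child in root:
        if child.tag == f"{{{TYPED_NS}}}title":
            doc.titles.append(Title(child.text or ""))
        # Any other element has no type to hold it, so it is dropped here.
    return doc


def save(doc: Document) -> str:
    """Serialize the typed objects back to markup."""
    root = ET.Element(f"{{{TYPED_NS}}}doc")
    for title in doc.titles:
        ET.SubElement(root, f"{{{TYPED_NS}}}title").text = title.text
    return ET.tostring(root, encoding="unicode")


TYPED_SOURCE = (
    '<k:doc xmlns:k="urn:example:known" xmlns:e="urn:example:ext">'
    "<k:title>hello</k:title>"
    "<e:note>keep me</e:note>"
    "</k:doc>"
)

ROUND_TRIPPED = save(load(TYPED_SOURCE))
```

Because the typed model has no class for the element in the urn:example:ext namespace, that element is absent from the round-tripped markup, so a later agent that understands the namespace never sees it.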
Accordingly, this invention arose out of concerns associated with providing solutions to the markup preservation problem in strongly typed environments and/or those environments that involve an intermediate translation stage.