The eXtensible Markup Language, otherwise known as XML, has become a standard for inter-application communication. XML messages passing between applications contain tags with self-describing text. This self-describing text allows messages to be understandable not only to the applications, but to humans reading an XML document as well. XML is currently used to define standards for exchanging information in various industries. These document standards are available in various forms.
Several XML-based communication protocols exist, such as the Simple Object Access Protocol (SOAP) and the ebXML protocol. The ebXML protocol is an open XML-based infrastructure that enables the global use of electronic business information. SOAP is a lightweight XML protocol, which can provide both synchronous and asynchronous mechanisms for sending requests between applications. The transport of these XML documents is usually over a lower level network standard, such as TCP/IP.
XML documents are expected to be well-formed and, where a schema applies, valid. An XML document is considered “well-formed” if it conforms to the syntax rules of the XML specification, for example by having a single root element and properly nested, matching tags. An XML document is considered valid if it is well-formed and also complies with a particular schema or document type definition. At the core of XML processing is an XML parser, which checks that a document is well-formed and/or valid.
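The distinction between the two checks can be illustrated with a minimal Java sketch that uses only the standard JAXP APIs. The class name `WellFormedCheck`, its method names, and the small `order` schema are hypothetical and chosen for illustration only; they are not taken from any particular standard.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Hypothetical schema: an <order> containing one positive-integer <quantity>.
    static final String ORDER_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"order\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"quantity\" type=\"xs:positiveInteger\"/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Well-formedness: the parser only checks XML syntax, no schema is involved.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Validity: the document must also satisfy the constraints of the schema.
    static boolean isValid(String xml, String xsd) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String doc = "<order><quantity>-1</quantity></order>";
        System.out.println("well-formed: " + isWellFormed(doc));
        System.out.println("valid: " + isValid(doc, ORDER_XSD));
    }
}
```

In this sketch, a document with a negative quantity passes the well-formedness check but fails validation, because only the schema knows that the quantity must be a positive integer.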
The processing of XML has become a standard function in many computing environments. When parsing XML, it is necessary to extract data from the XML file and transform that data so that it can be handled by a Java application or other application running the parser. Efficient XML processing is fundamental to the server. As more and more documents become XML based, more and more traffic on the server will be in XML. The latest push into web services (with SOAP as the messaging protocol) has also highlighted the fundamental need for fast XML processing. Web services use XML over HTTP as the transport for remote procedure calls. These calls cannot be completed in a timely manner if the XML parser is slow. There are primarily two standard approaches for processing XML: (1) SAX, or Simple API for XML, which reports the contents of a document as a stream of events, and (2) DOM, or Document Object Model, which builds an in-memory tree of the entire document. Each API has its benefits and drawbacks, although SAX presently has more momentum as an XML processing API.
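The difference between the two approaches can be sketched in Java using the standard JAXP parsers. The class name `SaxVsDom`, its method names, and the `quantity` element are hypothetical names used only for illustration. Both methods compute the same sum, one from streamed events and one from an in-memory tree.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxVsDom {
    // SAX: the parser pushes events to a handler; no document tree is kept in memory.
    static int sumWithSax(String xml) {
        final int[] total = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    private StringBuilder text; // non-null only while inside <quantity>

                    @Override
                    public void startElement(String uri, String localName, String qName, Attributes attrs) {
                        if ("quantity".equals(qName)) text = new StringBuilder();
                    }

                    @Override
                    public void characters(char[] ch, int start, int length) {
                        // Text may arrive in several chunks, so accumulate it.
                        if (text != null) text.append(ch, start, length);
                    }

                    @Override
                    public void endElement(String uri, String localName, String qName) {
                        if ("quantity".equals(qName)) {
                            total[0] += Integer.parseInt(text.toString().trim());
                            text = null;
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return total[0];
    }

    // DOM: the whole document is parsed into a tree, which is then queried.
    static int sumWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("quantity");
            int total = 0;
            for (int i = 0; i < nodes.getLength(); i++) {
                total += Integer.parseInt(nodes.item(i).getTextContent().trim());
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<order><line><quantity>2</quantity></line>"
            + "<line><quantity>5</quantity></line></order>";
        System.out.println(sumWithSax(xml) + " " + sumWithDom(xml));
    }
}
```

The SAX version requires the application to track its own state between events, while the DOM version is shorter but holds the entire document in memory, which illustrates the trade-off between the two APIs.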
XML data binding is a process whereby XML documents can be bound to objects that are designed especially for the data in those documents. Data binding allows applications to manipulate data that has been serialized as XML in a way that can be more natural than DOM. Data binding can also have many cross-system dependencies. Web services and XML parsing are examples of clients or applications that can utilize data binding.
One method that is useful for XML data binding is JAXB, or the Java™ Architecture for XML Binding. JAXB compiles an XML schema into Java classes, which handle XML parsing and formatting. These generated classes also ensure that the constraints expressed in the schema are enforced in the resulting methods and Java language data types. Presently, however, there is no solution that allows mapping not only from XML to Java, but also from Java to XML.
Castor XML is an existing, open source data binding framework for Java to XML binding. Castor enables one to deal with the data defined in an XML document through an object model which represents that data, instead of dealing with the structure of the XML document as DOM and SAX do. Castor XML can marshal many Java objects to and from XML. Marshalling, and the inverse operation of unmarshalling, involve converting between an object and a stream of data, or sequence of bytes. Marshalling converts an object to a stream, while unmarshalling converts from a stream to an object. Castor, however, is not a complete solution for applications such as web services.
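The two directions of conversion can be illustrated with a hand-written Java sketch that uses only the standard DOM and transformer APIs. This is not Castor's API; the `OrderBinding` class, the `Order` type, and the element names are hypothetical, and a data binding framework would generate or automate the equivalent mapping rather than require it to be coded by hand.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class OrderBinding {
    // Hypothetical application object that the XML is bound to.
    static final class Order {
        final String item;
        final int quantity;

        Order(String item, int quantity) {
            this.item = item;
            this.quantity = quantity;
        }
    }

    // Marshalling: object -> XML character stream.
    static String marshal(Order order) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("order");
            doc.appendChild(root);
            Element item = doc.createElement("item");
            item.setTextContent(order.item);
            root.appendChild(item);
            Element quantity = doc.createElement("quantity");
            quantity.setTextContent(Integer.toString(order.quantity));
            root.appendChild(quantity);

            // The identity transformer serializes the tree, escaping special characters.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Unmarshalling: XML character stream -> object.
    static Order unmarshal(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            String item = doc.getElementsByTagName("item").item(0).getTextContent();
            String quantity = doc.getElementsByTagName("quantity").item(0).getTextContent();
            return new Order(item, Integer.parseInt(quantity.trim()));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = marshal(new Order("widget", 3));
        Order copy = unmarshal(xml);
        System.out.println(xml);
        System.out.println(copy.item + " x " + copy.quantity);
    }
}
```

Application code in this sketch works only with `Order` objects, and the XML structure is confined to the two conversion methods, which is the separation that data binding is intended to provide.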