1. Field of the Invention
This invention relates to Extensible Markup Language (“XML”) data conversion and more particularly relates to XML data conversion from one encoding format to another while streaming a portion of an XML data file.
2. Description of the Related Art
The default encoding for an XML document is the 8-bit Unicode Transformation Format (“UTF-8”). Nevertheless, XML data can use various kinds of encoding representations. In particular, while an XML document can be stored in UTF-8 encoding in a remote database, it can also be saved in a region-specific encoding on a local file system. Transferring an XML document encoded in UTF-8 from a remote database to a local file system that uses a region-specific encoding format requires conversion of the XML document from one encoding to the other.
Encoding conversion utilities, also known as transcoders, may be used to facilitate the conversion process. However, typical transcoders require an XML document to be completely available before conversion begins. This may not be a problem when an XML document is relatively small. In such a case, the entire XML document can be transferred from the remote database and then converted into the region-specific encoding on the local system. Problems arise when the XML document is large and the local system is constrained in memory, bandwidth, and/or storage resources. Transferring large XML documents from a remote database may cause the local system to run out of memory. Advanced databases provide streaming materialization of large XML documents, allowing data to be requested on an as-needed basis. This helps local systems avoid being overwhelmed by a large amount of data. Unfortunately, typical transcoders do not provide the necessary interoperability with streaming materialization from databases.
Conversion using Java causes other problems. One issue is that Java provides no support for direct conversion between two arbitrary encoding representations. An intermediate character representation is necessary to convert from one encoding representation to another. For example, an XML document in UTF-8 encoding must first be converted into a character representation, which can then be converted into a region-specific encoding format.
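By way of illustration only, the two-step conversion described above may be sketched with the standard Java I/O classes. The class name StreamingTranscoder, the method transcode, and the chunkChars parameter are hypothetical names chosen for this sketch, and ISO-8859-1 merely stands in for a region-specific encoding. The sketch assumes the source bytes are available through an InputStream and is not intended to represent any particular transcoder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamingTranscoder {

    /**
     * Hypothetical helper: converts bytes from one encoding to another,
     * one chunk of characters at a time, so the whole document never
     * has to be held in memory.
     */
    public static void transcode(InputStream in, Charset from,
                                 OutputStream out, Charset to,
                                 int chunkChars) throws IOException {
        // Step 1: source bytes -> intermediate Java characters.
        Reader reader = new InputStreamReader(in, from);
        // Step 2: intermediate Java characters -> target bytes.
        Writer writer = new OutputStreamWriter(out, to);

        char[] buffer = new char[chunkChars];
        int n;
        // Each read pulls at most chunkChars characters from the source.
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n);
        }
        // Push any bytes still held by the encoder to the output stream.
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        // "\u00e9" (e with acute accent) takes two bytes in UTF-8
        // and one byte in ISO-8859-1.
        byte[] utf8 = "<name>Jos\u00e9</name>".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream latin1 = new ByteArrayOutputStream();

        transcode(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8,
                  latin1, StandardCharsets.ISO_8859_1, 4);

        System.out.println(utf8.length + " bytes in, "
                + latin1.size() + " bytes out");
    }
}
```

In this sketch the character buffer plays the role of the intermediate character representation, and the small chunk size shows that the conversion can proceed without the complete document being available. Characters that the target encoding cannot represent are replaced by the encoder's default substitution, which a production converter would likely need to handle explicitly.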
From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method for converting streamed XML data, representing only a portion of an XML document, from one encoding format to another using Java. Beneficially, such an apparatus, system, and method would be request driven and would use available Java converters.