1. Field of the Invention
The invention relates to database systems. Specifically, the invention relates to apparatus, systems, and methods for passing data between an eXtensible Markup Language (XML) document and a hierarchical database.
2. Description of the Related Art
Today, business applications increasingly rely on XML documents to exchange data. Generally, modern software applications communicate with each other over the Internet using XML documents as a common data interchange language for Business to Business (B2B) and Business to Consumer (B2C) communications. Technologies such as webservers, servlets, web applications, web services, and the like generally rely in some fashion on data organized according to the eXtensible Markup Language Specification.
Typically, these same software applications then communicate the data in the XML document to database servers for storage in a database. Generally, before an XML document is stored in a database, the XML document is analyzed to ensure that the XML document is a “valid” XML document. An XML schema is used to validate an XML document. As used herein, references to “an XML document” mean that the XML document is a valid XML document according to a predefined XML schema. Because an XML document provides such flexibility in the organization and types of XML elements, XML documents are validated to ensure that they are organized as expected. An invalid XML document may lead to unpredictable or erroneous results in software modules using the invalid XML document.
An XML schema defines the structure, organization, and data types that are acceptable in all corresponding XML documents. The XML schema defines a set of XML elements, XML element attributes, and organization among the XML elements that is desired. The XML schema serves as a vocabulary for the XML elements. Consequently, the XML schema defines a superset of valid XML documents. The valid XML documents include one or more of the XML elements, XML attributes, and structure among the XML elements as defined in the XML schema.
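As a rough illustration of the schema acting as a vocabulary, the following sketch checks a document against a table of permitted child elements. It is a deliberately simplified stand-in for real XML schema validation, not an XML Schema processor: the VOCABULARY table, the order/customer/item element names, and the find_violations function are all hypothetical and are not part of the invention described herein.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: element name -> names of permitted child elements.
# A real XML schema also constrains data types, attributes, and occurrence counts.
VOCABULARY = {
    "order": {"customer", "item"},
    "customer": set(),
    "item": {"sku", "qty"},
    "sku": set(),
    "qty": set(),
}


def find_violations(xml_text, root_tag="order"):
    """Return a list of structural problems; an empty list means none were found."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != root_tag:
        problems.append(f"unexpected root <{root.tag}>")

    def check(elem):
        allowed = VOCABULARY.get(elem.tag)
        if allowed is None:
            problems.append(f"unknown element <{elem.tag}>")
            return
        for child in elem:
            if child.tag not in allowed:
                problems.append(f"<{child.tag}> not allowed inside <{elem.tag}>")
            else:
                check(child)

    check(root)
    return problems


if __name__ == "__main__":
    good = "<order><customer>Acme</customer><item><sku>A-1</sku><qty>3</qty></item></order>"
    bad = "<order><price>10</price></order>"
    print(find_violations(good))
    print(find_violations(bad))
```

A document that uses only elements from the vocabulary, nested as the table permits, yields no violations, while a document containing an element outside the vocabulary is reported.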
Typically, prior to storing the XML document, the XML document is validated. Generally, two types of databases may store the data in the XML document, hierarchical or relational. Each type of database has different benefits and limitations, which will be discussed in more detail below.
Generally, the databases store data or an XML document in two different formats. In one aspect, the raw data contained in the elements of the XML document are removed from the XML document and stored in the database. Data stored in this manner is referred to herein as “decomposed” data because the formatting of the XML document is removed to store only the raw data. In another aspect, the raw data including the formatting that comprises the XML document are stored in the database. When the XML document is stored in the database in this manner, this is referred to herein as storing the XML document “intact” because the formatting of the XML document or an XML sub-tree is preserved within the database.
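The difference between the two formats can be sketched as follows. This is a minimal illustration only, assuming a small hypothetical order document; the decompose and store_intact functions are invented for this example and do not describe how any particular database performs the storage.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for illustration.
DOC = (
    '<order id="42"><customer>Acme</customer>'
    "<item><sku>A-1</sku><qty>3</qty></item></order>"
)


def decompose(xml_text):
    """Decomposed form: drop the markup, keep raw values keyed by element path.

    Attributes are ignored in this simplified sketch.
    """
    root = ET.fromstring(xml_text)
    fields = {}

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        text = (elem.text or "").strip()
        if text:
            fields[here] = text
        for child in elem:
            walk(child, here)

    walk(root, "")
    return fields


def store_intact(xml_text, subtree_tag):
    """Intact form: return the named sub-tree with its markup preserved."""
    root = ET.fromstring(xml_text)
    node = root if root.tag == subtree_tag else root.find(f".//{subtree_tag}")
    if node is None:
        raise ValueError(f"no <{subtree_tag}> element in document")
    return ET.tostring(node, encoding="unicode")


if __name__ == "__main__":
    print(decompose(DOC))
    print(store_intact(DOC, "item"))
```

In the decomposed form only values such as "Acme" and "3" remain, so the original document must be rebuilt from separate knowledge of its structure; in the intact form the stored string is itself a well-formed XML fragment.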
To control costs, it is desirable that modern technologies such as XML documents be capable of readily interfacing with existing computer and information technology without significantly modifying the existing computer and information technology. For example, large corporations, governments, and other entities continue to use legacy applications, which are software programs designed, written, and maintained for large, mission-critical computers, such as mainframes. These entities have invested large amounts of work and money into developing and maintaining the legacy applications. In addition, these applications have been tested and refined to operate very efficiently and with minimal errors. Legacy applications continue to manage a high percentage of the everyday transactions and data for these businesses.
Similarly, many of these legacy applications continue to store and retrieve data using hierarchical databases, such as IBM's Information Management System (IMS), instead of common relational databases such as the Oracle database available from the Oracle Corporation. To facilitate storing and retrieving data in XML documents (referred to herein as “XML data”), functionality for passing XML data between XML documents and relational databases has been developed. Generally, this functionality is integrated into the database servers for relational databases. Consequently, users' versions of the database servers must be updated to enable support for passing of data between an XML document and a relational database.
Unfortunately, no tools, either standalone or integrated, exist for passing XML documents and/or XML data between an XML document and a hierarchical database, one example of which is IMS. Consequently, one of two conventional solutions has been implemented depending on the circumstances.
One solution is to store the XML document either intact or decomposed in a native XML database. A native XML database is one which is designed and originally built to store and retrieve XML documents. One example of a native XML database is the Tamino database available from the Software AG corporation of Darmstadt, Germany. However, using a native XML database may require that two databases be maintained, the XML database as well as the hierarchical database. In addition, application-specific software may need to be developed to move raw data between the XML database and the hierarchical database. Furthermore, the native XML databases may not yet include all the standard features and functions of conventional hierarchical databases such as data backup, indexing, speed optimizations, and the like.
Another solution is to write specific software modules that read through a specific XML document searching for elements of interest, retrieving the raw data and storing the raw data within the hierarchical database. Similarly, the software modules may be programmed to reproduce a specific XML document with the appropriate formatting and metadata for raw data within the hierarchical database. However, these software modules are inflexible and must be constantly revised as XML elements are removed, added, or modified for the XML document. In addition, developing such software may be difficult because the software must accommodate all valid XML documents for a specific XML schema. A software application may use a number of different XML schemas, each of which requires a customized software module. Such maintenance and development can become prohibitively expensive.
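The inflexibility of such a module can be illustrated with a short sketch. The element paths, the extract_order function, and the sample documents below are hypothetical and chosen only for illustration; the point is that the element names are fixed in the code, so a change to the document layout breaks the module until it is revised.

```python
import xml.etree.ElementTree as ET

# Elements of interest are hard-coded for one specific (hypothetical) document layout.
ELEMENTS_OF_INTEREST = ("customer", "item/sku", "item/qty")


def extract_order(xml_text):
    """Pull the raw data for the hard-coded elements out of one order document."""
    root = ET.fromstring(xml_text)
    record = {}
    for path in ELEMENTS_OF_INTEREST:
        node = root.find(path)
        if node is None:
            # The document no longer matches the layout this module was written for.
            raise KeyError(path)
        record[path] = node.text
    return record


if __name__ == "__main__":
    original = "<order><customer>Acme</customer><item><sku>A-1</sku><qty>3</qty></item></order>"
    print(extract_order(original))

    # Renaming <qty> to <quantity> is enough to break the module.
    revised = "<order><customer>Acme</customer><item><sku>A-1</sku><quantity>3</quantity></item></order>"
    try:
        extract_order(revised)
    except KeyError as missing:
        print("module must be revised; missing element:", missing)
```

Each additional schema would need its own list of paths and its own code for writing the values to the database, which is the maintenance burden described above.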
Accordingly, a need exists for an apparatus, system, and method for passing data between an XML document and a hierarchical database. The apparatus, system, and method should allow for storage and retrieval of XML data and/or the XML document in a decomposed or intact format within a hierarchical database. In addition, the apparatus, system, and method should allow for indexing of an XML document or a sub-tree of the XML document when the XML document or sub-tree is stored in the hierarchical database in an intact format. The apparatus, system, and method should also allow for storage and retrieval of an XML document or a sub-tree of the XML document in a mixed format of decomposed and intact. Additionally, the apparatus, system, and method should allow for passing of data between an XML document and a hierarchical database without any changes to the functionality or software of the hierarchical database. Further, the apparatus, system, and method should interface with the hierarchical database using standard external commands to the database.