1. Field of the Invention
This invention relates to database query techniques and more specifically to optimizing data transformation from relational databases to hierarchical structures.
2. Background of the Invention
Data sets are frequently communicated or delivered in hierarchical data structures. Such hierarchical data structures can be stored in structured documents, such as eXtensible Markup Language (XML) documents. XML documents, for example, are widely accepted by various processing programs and data exchange systems, wherein the data in the XML document is used directly or transformed into a data structure used by the receiving program or system.
In contrast to the communication and delivery of data, database systems are generally used to store and manipulate data. Relational database systems are a popular type of database system due to the many widely known benefits of storing and manipulating data in relational form. Relational databases are generally maintained by software systems referred to as Relational Database Management Systems (RDBMSs). An RDBMS can generally be distributed among two or more computer nodes that are physically, and even geographically, separated. An enterprise can also distribute data among multiple RDBMSs hosted on different computers, in which case retrieving a complete set of data for a particular request requires access to each of those RDBMSs. Such retrieval can consume significant computing and communications resources.
A common data manipulation process is the publishing of data out of a database in an XML format. Retrieving the data from the relational database and delivering that data in a hierarchical structure format, such as in an XML document, results in inefficiencies. Such operations typically begin with a definition of the hierarchical data structure to be produced. Each node of that definition is then associated with an identification of the data to be retrieved from one or more RDBMSs. Retrieval of data from the relational database often requires a first query to determine the parameters of the other queries required for the hierarchical data output. For example, retrieving a list of salaries for all employees in a particular department requires first retrieving the list of employees in that department, and then forming a query operation for the salaries of those employees.
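By way of illustration only, the repetitive query pattern described above can be sketched as follows. The sketch uses Python with an in-memory SQLite database. The table names (employee, salary), the column names, the sample rows, and the function names build_department_xml and make_demo_db are all hypothetical and are assumed solely for this example; they are not taken from any particular system.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_demo_db():
    """Create a small in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
        CREATE TABLE salary (employee_id INTEGER, amount INTEGER);
        INSERT INTO employee VALUES (1, 'Ann', 'Research');
        INSERT INTO employee VALUES (2, 'Bob', 'Research');
        INSERT INTO employee VALUES (3, 'Cy', 'Sales');
        INSERT INTO salary VALUES (1, 100);
        INSERT INTO salary VALUES (2, 90);
        INSERT INTO salary VALUES (3, 80);
    """)
    return conn


def build_department_xml(conn, dept):
    """Publish one department as XML, issuing one salary query per employee.

    Returns the XML text and the number of queries that were issued.
    """
    queries_issued = 0
    root = ET.Element("department", name=dept)

    # First query: its result determines the parameters of the later queries.
    employees = conn.execute(
        "SELECT id, name FROM employee WHERE dept = ? ORDER BY id", (dept,)
    ).fetchall()
    queries_issued += 1

    for emp_id, name in employees:
        # Dependent query: one additional query for each employee found above.
        row = conn.execute(
            "SELECT amount FROM salary WHERE employee_id = ?", (emp_id,)
        ).fetchone()
        queries_issued += 1

        emp = ET.SubElement(root, "employee", name=name)
        ET.SubElement(emp, "salary").text = str(row[0]) if row else ""

    return ET.tostring(root, encoding="unicode"), queries_issued


if __name__ == "__main__":
    xml_text, count = build_department_xml(make_demo_db(), "Research")
    print(xml_text)
    print("queries issued:", count)
```

In this sketch, a department with N employees causes N + 1 separate queries to be issued in order to populate a single hierarchical output.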
RDBMSs, particularly RDBMSs that are used to maintain complex data sets, generally consume significant resources for each separate database query. Resource consumption for database queries is especially high when one or more database components are stored remotely and a remote query operation is required to retrieve some or all of the necessary data. The repetitive queries used to completely retrieve the data required for hierarchical output data can therefore result in large resource consumption.
Therefore, a need exists for a more efficient way to retrieve, from relational databases, the data required to create hierarchical data structures, so that data can be published more efficiently from a relational database to a hierarchical data structure.