1. Field of the Invention
The present invention generally relates to data processing and more particularly to managing federated queries that target data resident on more than one distinct database.
2. Description of the Related Art
Databases are computerized information storage and retrieval systems. A relational database management system (RDBMS) is a database management system (DBMS) that uses relational techniques for storing and retrieving data. The most prevalent type of database is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. A distributed database is one that can be dispersed or replicated among different points in a network. An object-oriented programming database is one that is congruent with the data defined in object classes and subclasses.
Regardless of the particular architecture, in a DBMS, a requesting entity (e.g., an application or the operating system) demands access to a specified database by issuing a database access request. Such requests may include, for instance, simple catalog lookup requests or transactions and combinations of transactions that operate to read, change and add specified records in the database. These requests are made using high-level query languages such as the Structured Query Language (SQL). Illustratively, SQL is used to make interactive queries for getting information from and updating a database such as International Business Machines' (IBM) DB2, Microsoft's SQL Server, and database products from Oracle, Sybase, and Computer Associates. The term “query” denominates a set of commands for retrieving data from a stored database. Queries take the form of a command language that lets programmers and programs select, insert, and update data, and so forth.
Often, the data may actually reside in more than one database (i.e., located on more than one database server). For example, a patient's records (diagnosis, treatment, etc.) may be stored in one database, while clinical trial information relating to a drug used to treat the patient may be stored in another database. Therefore, to access the data, a federated query may be generated that targets each of these distinct databases. As used herein, the term federated query generally refers to any query that requires combining results of queries run against distinct databases. Because the distinct databases may be on different servers, to receive valid results, each of the different servers must be available. In a conventional federated database system, if any of the targeted database servers are unavailable, the query will fail, typically due to a timeout (e.g., failure to receive results from the unavailable server within a specified time limit).
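By way of illustration only, the combining of results from distinct databases may be sketched as follows. The sketch assumes two in-memory SQLite databases standing in for separate database servers; the table names, column names, sample rows, and the function federated_query are hypothetical and are not taken from any particular federated database product.

```python
import sqlite3


def make_patient_db():
    # Hypothetical stand-in for the server holding patient records.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE patients (name TEXT, drug TEXT)")
    db.executemany(
        "INSERT INTO patients VALUES (?, ?)",
        [("Ann", "drugA"), ("Bob", "drugB")],
    )
    return db


def make_trial_db():
    # Hypothetical stand-in for the server holding clinical trial data.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE trials (drug TEXT, phase INTEGER)")
    db.executemany(
        "INSERT INTO trials VALUES (?, ?)",
        [("drugA", 3), ("drugB", 2)],
    )
    return db


def federated_query(patient_db, trial_db):
    # Run one subquery against each database, then combine the results
    # on the shared "drug" value outside of either database.
    patients = patient_db.execute(
        "SELECT name, drug FROM patients ORDER BY name"
    ).fetchall()
    phases = dict(trial_db.execute("SELECT drug, phase FROM trials").fetchall())
    return [(name, drug, phases[drug]) for name, drug in patients if drug in phases]
```

In this sketch neither database can answer the question alone; the combined rows exist only after both subqueries have returned.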
Failure of the query due to unavailability of one of the targeted database servers may lead to unnecessary server activity. For example, a federated query may be parsed into separate subqueries to be run against each of the targeted databases (each on a separate server). If the subqueries are each run against the targeted database and one or more of the targeted database servers are unavailable, the query will eventually fail although the available database servers may return valid results. In other words, the activity on the available database servers to process their respective queries is wasted.
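The wasted activity described above may be sketched as follows, purely as an illustration. The class FakeServer, the exception ServerUnavailable, and the function run_federated are hypothetical in-process stand-ins; no network access occurs, and an unavailable server is modeled by raising an exception where a real system would typically time out.

```python
class ServerUnavailable(Exception):
    """Stands in for a timeout while waiting on an unreachable server."""


class FakeServer:
    # Hypothetical in-process stand-in for a database server.
    def __init__(self, name, rows, available=True):
        self.name = name
        self.rows = rows
        self.available = available
        self.queries_processed = 0  # counts work actually performed

    def run(self, subquery):
        if not self.available:
            raise ServerUnavailable(self.name)
        self.queries_processed += 1
        return list(self.rows)


def run_federated(servers, subquery):
    # Conventional behavior: dispatch every subquery without first
    # checking availability, then fail the whole query if any server failed.
    results = []
    failed = []
    for server in servers:
        try:
            results.append(server.run(subquery))
        except ServerUnavailable:
            failed.append(server.name)
    if failed:
        # Valid partial results from available servers are discarded.
        raise ServerUnavailable(", ".join(failed))
    return [row for rows in results for row in rows]
```

When one server in the list is marked unavailable, run_federated raises, yet the queries_processed counter on each available server has already been incremented, which corresponds to the processing that is wasted.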
For complex queries, processing on a targeted database server may take minutes or even hours, and may put a significant load on the server resources. If a user were made aware of the unavailability of one or more of the targeted servers, the user might choose not to run the query. Alternatively, the user might choose to modify the query to target only available servers. However, in conventional federated database systems, there are no explicit mechanisms for determining the availability of targeted databases prior to running the federated query, for reporting database server status back to the user, or for taking action to manage the federated query.
Accordingly, there is a need for an improved method for building and running federated queries that can monitor the availability of targeted servers.