1. Field of the Invention
The embodiments of the invention generally relate to data management systems and methods, and more particularly to data management systems and methods in a distributed or grid computing environment.
2. Description of the Related Art
Many applications integrate data from multiple data sources, and from replicas of those data sources, distributed across multiple nodes. In a distributed (grid) environment, these data sources can be fairly dynamic in two respects. First, with regard to replication, a data source can be replicated in whole or in part according to the query workloads and the system loads. Replicas are created and destroyed by independent entities, such as the placement manager, independently of the applications accessing the data. Second, with regard to source failure and addition, distributed data sources are often difficult to maintain under centralized administrative control. As such, new data sources can be independently registered with a grid, and existing data sources may leave the grid. Additionally, the data sources can fail independently.
FIG. 1 illustrates a conventional federation-through-wrappers system 30. Federated DBMSs (database management systems) 32, which receive queries (Q) from an application 31, typically access remote data sources 34 through specialized connector modules, variously referred to as wrappers or gateways 36a, 36b, 36c. Each wrapper 36a, 36b, 36c encapsulates its corresponding remote data into a tabular structure so that the federated DBMS 32 can collate data from the wrappers 36a, 36b, 36c via relational operators. Traditionally, there is one connector module for each remote data source 34.
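By way of illustration only, the wrapper arrangement of FIG. 1 may be sketched in Python as follows. The class names Wrapper and FederatedDBMS, the nicknames ORDERS and CUSTOMERS, and the sample rows are hypothetical and do not appear in the cited art; the sketch merely assumes that each wrapper presents its remote data as rows and that the federated DBMS combines those rows with a relational operator (here, an equi-join).

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class Wrapper:
    """Hypothetical connector module: presents one remote source as rows."""

    def __init__(self, source_name: str, fetch: Callable[[], List[Row]]):
        self.source_name = source_name
        self._fetch = fetch  # stands in for the remote access protocol

    def scan(self) -> List[Row]:
        # Encapsulate the remote data into a tabular (list-of-rows) structure.
        return [dict(row) for row in self._fetch()]


class FederatedDBMS:
    """Hypothetical federated DBMS with one wrapper per remote source."""

    def __init__(self) -> None:
        self.wrappers: Dict[str, Wrapper] = {}

    def register(self, nickname: str, wrapper: Wrapper) -> None:
        self.wrappers[nickname] = wrapper

    def join(self, left: str, right: str, key: str) -> List[Row]:
        # Relational equi-join over rows supplied by two wrappers.
        index: Dict[object, List[Row]] = {}
        for row in self.wrappers[right].scan():
            index.setdefault(row[key], []).append(row)
        result: List[Row] = []
        for lrow in self.wrappers[left].scan():
            for rrow in index.get(lrow[key], []):
                result.append({**lrow, **rrow})
        return result


# Illustrative data only; real wrappers would contact remote sources.
orders = Wrapper("orders_src", lambda: [
    {"cust_id": 1, "total": 30},
    {"cust_id": 2, "total": 12},
])
customers = Wrapper("customers_src", lambda: [{"cust_id": 1, "name": "Ada"}])

dbms = FederatedDBMS()
dbms.register("ORDERS", orders)
dbms.register("CUSTOMERS", customers)
joined = dbms.join("ORDERS", "CUSTOMERS", "cust_id")
```

In this sketch, each nickname maps to exactly one wrapper, mirroring the traditional one-connector-per-source arrangement described above.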
Conventionally, applications 31 are programmed to access specific data sources 34, so any dynamism at the data sources 34 can be accommodated only by reprogramming the application 31. For example, SQL (Structured Query Language) applications hardcode the data sources in the nicknames used in their “FROM” clauses. Data binding techniques exist, such as those described in Vidal et al., “A Meta-Wrapper for Scaling up to Multiple Autonomous Distributed Information Sources,” Proceedings 3rd IFCIS International Conference on Cooperative Information Systems, pp. 148-157, New York, N.Y., 1998, the complete disclosure of which, in its entirety, is herein incorporated by reference.
These techniques tend to bind data sources 34 to a query (Q) and make the data sources 34 transparent to applications. However, this binding is performed statically, when the query (Q) is compiled, based on a description of the capabilities of each data source 34 that is statically stored within the DBMS 32. Because the data sources 34 can change or fail after the query (Q) is compiled, a binding made at compile time may no longer be valid when the query (Q) is executed. Thus, while the conventional data management systems and methods were adequate for the purposes for which they were designed, there remains a need for a novel technique of data source binding that accounts for factors that are unknown at the time the query is compiled.
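The limitation of static binding described above may be illustrated by the following minimal Python sketch. The names DataSource, CompiledPlan, and compile_query, the nickname SALES, and the sample rows are hypothetical and are used for illustration only; the sketch assumes a catalog that maps each nickname to a single data source at compile time. It models the conventional behavior and is not a description of any particular product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataSource:
    """Hypothetical remote data source that may fail independently."""
    name: str
    rows: List[dict] = field(default_factory=list)
    alive: bool = True

    def scan(self) -> List[dict]:
        if not self.alive:
            raise ConnectionError(f"source {self.name} is unavailable")
        return list(self.rows)


@dataclass
class CompiledPlan:
    """Hypothetical compiled query: its source is fixed at compile time."""
    nickname: str
    source: DataSource

    def execute(self) -> List[dict]:
        return self.source.scan()


def compile_query(nickname: str, catalog: Dict[str, DataSource]) -> CompiledPlan:
    # Static binding: the nickname in the FROM clause is resolved once, here.
    return CompiledPlan(nickname, catalog[nickname])


sample_rows = [{"id": 1, "amount": 30}]
primary = DataSource("sales_primary", sample_rows)
catalog = {"SALES": primary}

plan = compile_query("SALES", catalog)  # binding happens now

# After compilation, a replica is created and the primary source fails.
replica = DataSource("sales_replica", sample_rows)
primary.alive = False

try:
    plan.execute()
    outcome = "ok"
except ConnectionError:
    outcome = "failed"
```

In this sketch, execution fails even though a replica holding the same rows exists, because the compiled plan still refers to the source that was bound at compile time.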