1. Field of the Invention
The present invention generally relates to database management systems, and, more particularly, to mechanisms within computer-based database management systems for mapping disparate data, whether residing in multiple data sources or generated dynamically, into a single, reusable software component accessible to application developers.
2. Description of Related Art
The increasing popularity of electronic commerce has prompted many firms to turn to application servers to deploy and manage their Web applications effectively. Quite commonly, these application servers are configured to interface with a database management system (DBMS) for storage and retrieval of data. This often means that new Web applications must work with “legacy” environments. As a result, Web application developers frequently find they have little or no control over which DBMS product is to be used to support their applications or how the database is to be designed. In some cases, developers may even find that data critical to their application is spread across multiple DBMSs developed by different software vendors.
The e-commerce community commonly uses entity Enterprise JavaBeans (EJBs) when persistence is required, that is, when data associated with Java objects must continue to exist (or persist) beyond the boundaries of an application session. Most frequently, entity EJBs use a relational DBMS for such storage purposes. EJB developers can create one of two kinds of entity EJBs: those with container-managed persistence or those with bean-managed persistence. Container-managed persistence is often favored, as it relieves the bean developer from writing the data access code; instead, the system running the container in which the EJB resides will automatically generate and execute the appropriate SQL as needed. By contrast, entity beans with bean-managed persistence require the developer to code and maintain his/her own data access routines directly. This allows for more flexibility, but requires additional programming skills (such as greater knowledge of DBMS technology), increases labor requirements for bean development and testing, and potentially inhibits portability of the bean itself. Unfortunately, firms intent on using container-managed entity EJBs (CMP entity beans) for their e-commerce applications may encounter some stumbling blocks. The firm's Web application server of choice may not support the firm's DBMS of choice. Furthermore, if design requirements call for a CMP entity bean whose attributes must span multiple “legacy” DBMSs, this almost certainly will not be supported.
Presently, there is no way to map data that are dynamically generated or that reside in multiple data sources into a single, reusable software component accessible to application developers. As an example, consider a Java application developer who needs to build a Web-based application that accesses critical data present in multiple data sources, each of which may reside on a different system and may store data in a different format. Moreover, the developer might wish to perceive data in these sources as a single Java object, as doing so would greatly simplify design, development, and maintenance. As a result, s/he might want to model this single Java object as an entity Enterprise JavaBean (EJB) that uses container-managed persistence (CMP). Since EJBs are standard Java components supported by a variety of leading information technology vendors, they offer many potential business benefits, such as increased portability and high degrees of code reuse. Those EJBs that are container-managed place a minimal programming burden on developers.
Unfortunately, current vendor support for CMP entity beans involves access to only a single data source per bean. Thus, the developer is forced to turn to more complex (and potentially cumbersome) alternatives to gain access to the needed data sources. Often, these alternatives are more costly and time-consuming to implement, require a more sophisticated set of skills, and may consume additional machine resources to execute.
One presently available solution to this problem, when a Java application developer needs to build a Web-based application that accesses critical data present in multiple data sources, involves manually simulating transparent access. In that case, a programmer takes on the burden of writing the software to individually connect to each of the necessary data sources, read in any necessary data, correlate (or join) the results read in from multiple data sources, perform any necessary data translations, and so on. This is a substantial amount of work and is well beyond the skill level of many programmers. Furthermore, it incurs a great deal of cost.
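The manual approach described above can be sketched in Java as follows. This is only an illustrative sketch under stated assumptions: the two data sources are simulated with in-memory maps rather than real DBMS connections, and the class name `ManualFederationSketch`, the `CustomerView` type, and the sample rows are all hypothetical. It shows the three tasks the programmer must hand-code: reading from each source, translating mismatched key formats, and joining the results.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ManualFederationSketch {

    /** Hypothetical unified view of one customer, combining both sources. */
    public static final class CustomerView {
        public final int id;
        public final String name;
        public final long balanceCents;

        CustomerView(int id, String name, long balanceCents) {
            this.id = id;
            this.name = name;
            this.balanceCents = balanceCents;
        }

        @Override
        public String toString() {
            return id + " " + name + " " + balanceCents;
        }
    }

    /** Stand-in for a query against the first source: numeric id -> name. */
    static Map<Integer, String> readNamesFromSourceA() {
        Map<Integer, String> rows = new LinkedHashMap<>();
        rows.put(1, "Ada");
        rows.put(2, "Grace");
        rows.put(3, "Edsger");
        return rows;
    }

    /** Stand-in for the second source: zero-padded text id -> balance in cents. */
    static Map<String, Long> readBalancesFromSourceB() {
        Map<String, Long> rows = new LinkedHashMap<>();
        rows.put("0001", 12550L);
        rows.put("0003", 990L);
        return rows;
    }

    /** Reads both sources, translates key formats, and joins on customer id. */
    public static List<CustomerView> loadCustomers() {
        Map<Integer, String> names = readNamesFromSourceA();

        // Data translation: source B uses zero-padded text keys, source A uses ints.
        Map<Integer, Long> balances = new HashMap<>();
        for (Map.Entry<String, Long> row : readBalancesFromSourceB().entrySet()) {
            balances.put(Integer.parseInt(row.getKey()), row.getValue());
        }

        // Correlation: an inner join, keeping only ids present in both sources.
        List<CustomerView> result = new ArrayList<>();
        for (Map.Entry<Integer, String> row : names.entrySet()) {
            Long cents = balances.get(row.getKey());
            if (cents != null) {
                result.add(new CustomerView(row.getKey(), row.getValue(), cents));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (CustomerView view : loadCustomers()) {
            System.out.println(view);
        }
    }
}
```

Even in this toy form, the connection, translation, and join logic is written and maintained by hand; with real data sources, each of these steps would also involve source-specific APIs and error handling.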
Moreover, a developer would have to forgo the use of CMP entity beans and instead employ entity beans with bean-managed persistence (BMP). These are more time-consuming to write and more difficult to debug than CMP entity beans. In addition, they require considerable knowledge of the application programming interfaces (APIs) of each data source involved and afford less opportunity for query optimization, which may inhibit performance.
Another presently available solution to the problem calls for a physical consolidation of the data, in which the data from different data sources are copied into a single data source that a programmer will then access. However, this raises issues of data latency and added cost. Because of the data latency, copies of data will be slightly to significantly “older” than the data contained in the original data sources. Working with out-of-date (and potentially inaccurate) data can be unacceptable to many applications. Increased costs include software costs, since additional software must be purchased, installed, configured, and maintained to copy data from one source to another on a scheduled or periodic basis, as well as the associated labor costs. The software must support either a data migration effort or a data replication process that achieves very low data latency.
Therefore, there is a need to provide a method and a system which can map disparate data residing in multiple data sources into a single, reusable software component, accessible to application developers. This would simplify the design, development, and maintenance of applications and, in some cases, provide applications with a function that would otherwise be inaccessible.