One of the issues in any organization is the accumulation of data over time. As an organization develops, records are kept of all data that is deemed important to that organization. Personnel records, sales reports, and client lists are all examples of the many types of data that organizations collect. The advent of computerized databases has greatly facilitated the recording and collecting of data. However, the collection of data is of little value on its own. The greater value lies in the ability to review the data and subject it to further analysis for performance evaluation and future planning.
This is especially true in the case of operational databases. Operational databases are employed by operational systems to carry out regular operations of an organization that are often transaction-based. In order to better address the needs of a particular application, such databases are built on online transaction processing (“OLTP”) models, wherein the efficiency of historical analysis of the data is sacrificed for operational agility.
Given increasing trends towards specialization, it is common for separate areas of an organization (departments, offices, etc.) to maintain their own databases. Thus, for example, a business may have a personnel database maintained by one department, a sales database maintained by another department, an accounting and/or payroll database maintained by a third department, and so forth. As each database is separately maintained and has a separate purpose, interoperability and data exchange between the separate databases become more difficult, at least in part due to differences in data formats. Where at least some of these databases are operational, the challenges faced become greater.
At the upper levels of the business, there is a need for executives and managers to track performance metrics across the entire organization in order to make both short-term and long-term decisions. In order to track these metrics collectively, however, a system and method of integrating data from the multiple operational databases in the organization is needed.
One solution is the use of an extract, transform and load (“ETL”) engine to extract the data from the individual operational databases, transform the extracted data into a unified data format, and load the transformed data into a single database for access. While the principles behind the ETL engine are relatively simple to understand, the implementation and execution of ETL engines have proven to be very complex. More often than not, such ETL engines are custom-designed, programmed and compiled to accommodate the individual operational databases and needs of a specific organization. This custom work is both time-consuming and costly, limiting adoption of present ETL engines as a solution. Moreover, when changes are made to the format(s) of any of the operational databases, the source code for the ETL engine must be revised to compensate for the changes, and the revised source code must be recompiled before it can be deployed. Such revisions can be time-consuming and costly, especially where they are made by a party other than the original author/developer, hereinafter referred to as “the developer”. As the organization size and the number of individual databases increase, this problem becomes more significant.
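By way of illustration only, the extract, transform and load steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular ETL engine: the two source databases, their table and column names (`orders`, `payments`, `facts`), their date and currency formats, and all function names are hypothetical, and in-memory SQLite databases stand in for the operational databases.

```python
import sqlite3


def make_source(schema_sql, insert_sql, rows):
    """Create a hypothetical in-memory operational database with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    return conn


def extract(conn, query):
    """Extract: read raw rows from one operational database."""
    return conn.execute(query).fetchall()


def transform_sales(row):
    """Transform: sales stores integer cents and ISO dates (assumed format)."""
    order_id, sold_on, amount_cents = row
    return ("sales", order_id, sold_on, amount_cents / 100.0)


def transform_payroll(row):
    """Transform: payroll stores DD/MM/YYYY dates and formatted text amounts
    (assumed format), so both must be converted to the unified format."""
    payment_id, pay_date, gross = row
    day, month, year = pay_date.split("/")
    return ("payroll", payment_id, f"{year}-{month}-{day}",
            float(gross.replace(",", "")))


def load(target, records):
    """Load: write unified records into the single target database."""
    target.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)", records)
    target.commit()


def run_etl():
    # Two separately maintained databases with differing data formats.
    sales = make_source(
        "CREATE TABLE orders (order_id INTEGER, sold_on TEXT, amount_cents INTEGER)",
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "2024-01-15", 12500), (2, "2024-01-16", 9900)],
    )
    payroll = make_source(
        "CREATE TABLE payments (id INTEGER, pay_date TEXT, gross TEXT)",
        "INSERT INTO payments VALUES (?, ?, ?)",
        [(7, "15/01/2024", "3,200.00")],
    )

    # Single target database holding the unified format.
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE facts (source TEXT, source_id INTEGER, "
        "event_date TEXT, amount REAL)"
    )

    sales_rows = extract(sales, "SELECT order_id, sold_on, amount_cents FROM orders")
    payroll_rows = extract(payroll, "SELECT id, pay_date, gross FROM payments")
    load(target, [transform_sales(r) for r in sales_rows])
    load(target, [transform_payroll(r) for r in payroll_rows])
    return target
```

In this sketch, each transform function hard-codes the format of its source database. A change to either source format would therefore require editing and redeploying the code, which is the maintenance burden described above.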
There are a number of scenarios where it can be desirable for an organization to obtain access to the source code of such ETL engines. For example, the relationship between the developer of the ETL engine and the organization may sour, perhaps due to the inability of the developer to deliver modifications in a timely manner to the organization in response to changes to the operational databases. Alternatively, the developer may cease operations for any of a number of reasons. Such scenarios generally call for the use of a source code escrow, as the developer may not wish to provide direct access to the source code unless the organization absolutely requires it. The use of such a source code escrow adds a layer of additional costs. Further, even if the source code is made available to the organization, the organization has to secure the services of another developer to customize the source code as required to address the changes to the operational databases that it maintains. As will be appreciated, these changes can prove difficult and costly.
Where the requirements of an organization change, the data extracted from the various operational databases generally needs to be re-merged, cleaned and transformed. This process is manually performed as needed, requiring significant knowledge of the tools and the process. The data that needs to be reloaded and the fact data that needs to be removed or updated are manually determined. The result of this manual evaluation is a one-off script to perform the required actions. This manual process is subject to human error and is costly. Further, as this process is typically carried out during off-peak hours, it requires a skilled person to perform the manual rebuilding of the business intelligence data at a less-than-convenient time.
It is therefore an object of the invention to provide a novel computer system and method for aggregating data from a plurality of operational databases.