Field of the Invention
Embodiments of the invention relate to a comprehensive framework for composing and executing analytics applications in business level languages. The comprehensive framework may include an information and services directory, an analytics integration workbench, and an analytics integration server.
Description of the Related Art
Comprehensive risk assessment or fraud detection, and analytics applications in general, require access to operational data which may be distributed across independently created and governed data repositories in different departments of an organization. For example, an analytics solution may require data from multiple distributed repositories, external relational or structured sources, internal and external sources of unstructured and semi-structured information, real time external sources such as market data feeds, and real time internal sources such as application and information technology infrastructure events. Further, a particular analytics solution may require that data from these separately managed data sources be fused or combined to create a complete and trusted view to derive better insights.
Getting data from these diverse sources to the warehouse and data marts is often a complicated task. For example, a data architect for risk information may collaborate with a risk analyst to identify which data should be provisioned in a data warehouse or OLAP (online analytical processing) cube for use by an analytics application. The needed data are often defined using industry standard models or glossaries, which provide a standardized taxonomy of risk data in business terms, and may be supplemented by additional risk data specified as business level descriptions. Once the needed information is identified, the data architect generates schemas for the data in the data warehouse and schemas for the OLAP dimensional tables. The architect may then work with database software developers to compose the data movement scripts (extract, transform, load, or ETL, programs) that move the data from their respective sources within the enterprise to the data warehouse. Once the ETL programs are developed and deployed, the ETL processes may be used to populate the data warehouse and OLAP cube. Only then may the risk analyst access data from the warehouse as needed for a given analytics solution. For example, data from the warehouse are loaded into the OLAP cube and data marts for use in various reports and dashboards.
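By way of illustration only, the conventional provisioning sequence described above (define a warehouse schema, move data from separate sources, then query the result) may be sketched as follows. The sketch assumes an in-memory SQLite database standing in for the data warehouse. The source lists LOANS and TRADES, the table exposure_fact, and the function names are hypothetical and are not taken from any particular system.

```python
import sqlite3

# Hypothetical records from two independently governed departments:
# (customer_id, exposure amount).
LOANS = [("C1", 1200.0), ("C2", 300.0), ("C1", 500.0)]
TRADES = [("C1", 250.0), ("C3", 900.0)]


def extract():
    """Yield (customer_id, source, exposure) rows from each source."""
    for customer_id, amount in LOANS:
        yield customer_id, "loans", amount
    for customer_id, amount in TRADES:
        yield customer_id, "trades", amount


def load(conn, rows):
    """Create the (assumed) warehouse schema and load the extracted rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS exposure_fact ("
        "customer_id TEXT, source TEXT, exposure REAL)"
    )
    conn.executemany("INSERT INTO exposure_fact VALUES (?, ?, ?)", rows)
    conn.commit()


def total_exposure(conn):
    """Aggregate exposure per customer, as a report or cube might."""
    cur = conn.execute(
        "SELECT customer_id, SUM(exposure) FROM exposure_fact "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    return dict(cur.fetchall())


def run_etl():
    """Run the extract and load steps, then return the aggregated view."""
    conn = sqlite3.connect(":memory:")
    load(conn, extract())
    return total_exposure(conn)


if __name__ == "__main__":
    print(run_etl())
```

Even in this reduced form, the analyst's query in total_exposure cannot run until the schema has been defined and the load step has completed, which mirrors the dependency described above.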
Given this wide distribution of data, a large percentage of the resources devoted to a typical analytics solution is spent provisioning data for the analytics application.