The present invention relates generally to computer database systems, and specifically to a system for organizing information from a variety of systems in a data warehousing environment.
Few could have foreseen, just a few years ago, the rapid development of computer technology. Computers now have a place in our homes, our offices, our schools and even our briefcases and satchels. As computer automation continues to impact an ever-increasing portion of our daily lives, governments, businesses and individuals have turned to database technology to help them manage the “information explosion” and the exponential proliferation of information that must be sorted, assimilated and managed on a continuing basis. One area of importance to the database design field is data model selection for database applications.
A data model represents the structure or organization of data stored in the database. It enables the data to be used in certain forms and may limit its use in other forms. Different applications usually require different data models. Many different data models can exist, and they usually differ markedly from one another. Typically, database applications are customized to a particular data model of a particular database. Different database vendors base their products on different data models, adding to the confusion. As a result, these applications usually must be re-implemented for different databases, even though the functioning of the application remains the same.
Presently, database developers have turned to data warehousing technology to resolve often conflicting data management requirements. Traditional data warehousing approaches focus on decision support applications, which emphasize summarized information. While perceived advantages exist, an inherent disadvantage to these systems is that the customer's identity is lost. Traditional approaches exhibit shortcomings when applied to applications such as customer data analysis. Customer data analysis is a decision support analysis that correlates data to customers' activities, events, transactions, status and the like. Summarized information usually loses the detail level of information about customer identity, limiting the usefulness of traditional data warehousing approaches in these types of applications.
What is needed is a system for providing a database that can be customized to fit individual user needs, yet is also able to support data analysis applications.
According to the invention, techniques for organizing information from a variety of sources, including legacy systems, in a data warehousing environment are provided. In an exemplary embodiment, the invention provides a system, including computer code, for analyzing data from one or more data sources of an enterprise. The system provides a meta-model based technique for modeling the enterprise data. The enterprise is typically a business activity, but can also be other loci of human activity. Embodiments according to the invention can translate data from a variety of sources to particular database schema in order to provide organization to a data warehousing environment.
The system comprises a computer readable storage device for containing program code that can perform a variety of tasks, such as providing a meta-model for an enterprise. The meta-model can describe at a high level the information used by the enterprise. Meta-models can describe relationships between groups of entities in a data model. Entities in a data model can comprise particular data types, and the like. The enterprise can be a business activity and/or the like. Code that can form a data organization from the model is also included in the system. The data organization can include data schema and the like. Data schema define aspects of the database, such as attributes, domains, parameters and the like, to a database management system (DBMS). The system can create one or more databases for containing the data. Code for translating data from one or more sources to the data organization is also part of the system. The system includes code for incorporating the translated data into the database. The system can also include code for analyzing the data in the database. Accordingly, the system can provide an environment for analyzing information about customers, business processes and the like.
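By way of illustration only, the step of forming a data organization from a meta-model might be sketched as follows. The sketch assumes a very simple meta-model (a mapping from entity names to attribute names and SQL types) and uses an in-memory SQLite database; the entity names, attribute names and function names are hypothetical and are not taken from the specification.

```python
import sqlite3

# Hypothetical meta-model: entity name -> {attribute name: SQL type}.
# A real meta-model would also carry relationships, domains and parameters.
META_MODEL = {
    "customer": {"customer_id": "INTEGER", "name": "TEXT"},
    "customer_event": {
        "event_id": "INTEGER",
        "customer_id": "INTEGER",
        "event_type": "TEXT",
    },
}


def schema_from_meta_model(meta_model):
    """Derive one CREATE TABLE statement per entity in the meta-model."""
    statements = []
    for entity, attributes in meta_model.items():
        columns = ", ".join(
            f"{name} {sql_type}" for name, sql_type in attributes.items()
        )
        statements.append(f"CREATE TABLE {entity} ({columns})")
    return statements


def create_database(meta_model):
    """Create an in-memory database whose tables follow the derived schema."""
    conn = sqlite3.connect(":memory:")
    for statement in schema_from_meta_model(meta_model):
        conn.execute(statement)
    return conn
```

Calling create_database(META_MODEL) in this sketch yields a database with one table per entity; translating and loading source data would be separate steps.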
In another aspect of the present invention, techniques for data warehousing are provided. In a particular embodiment, the invention provides a computer program product for creating one or more databases for organizing information from one or more sources. Embodiments can organize the data in the database according to a data schema, such as a reverse star schema. A reverse star schema model comprises an identity element (e.g., core components, and the like) and one or more entities that describe classifications of data (e.g., customer classification components, and the like), which can have one or more relationships with the identity element. In an exemplary embodiment, customer classification components provide different ways to categorize customers or different business views of the customers. For example, customers can be categorized by geographic region, demographics and the like. The computer program product comprises a computer readable storage device for containing a variety of code. Code for selecting a data model template from pre-defined ones based upon one or more business requirements can be included in the system. The computer program product can also include code for selecting customer entities from pre-defined ones that fit the application based on their business processes and operations. The entities can be selected from a focal group, for example. In a particular embodiment, focal groups can describe information about customer characteristics, profiles, business related classifications, customers' roles, definitions and the like in a variety of business functional areas.
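As a non-limiting sketch of the reverse star arrangement described above, an identity table can sit at the center with classification tables referring back to it. The table names, column names and sample rows below are assumptions made for illustration; the specification does not prescribe a concrete layout.

```python
import sqlite3

# Hypothetical tables: one identity element and two classification components.
SCHEMA = """
CREATE TABLE customer (              -- identity element (core component)
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_region (       -- classification by geographic region
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    region      TEXT NOT NULL
);
CREATE TABLE customer_demographic (  -- classification by demographics
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    age_band    TEXT NOT NULL
);
"""


def build_demo_db():
    """Create the sketch schema in memory and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO customer VALUES (?, ?)",
                     [(1, "Ada"), (2, "Ben")])
    conn.executemany("INSERT INTO customer_region VALUES (?, ?)",
                     [(1, "West"), (2, "East")])
    conn.executemany("INSERT INTO customer_demographic VALUES (?, ?)",
                     [(1, "25-34"), (2, "35-44")])
    return conn


def customers_by_region(conn, region):
    """Return customer names in a region; identity is kept, not summarized."""
    rows = conn.execute(
        "SELECT c.name FROM customer c "
        "JOIN customer_region r ON r.customer_id = c.customer_id "
        "WHERE r.region = ? ORDER BY c.name",
        (region,),
    )
    return [name for (name,) in rows]
```

Because every classification row carries the customer key, a query in this sketch can move from any business view back to the individual customer.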
The computer program product can include code for defining entities for transactions and/or events and their attributes to form a customized group of customer activity components that are relevant to a particular application. The events can be arranged into customer activity components. These components can be organized into one or more customized groups that correspond to various operations and/or transactions. As event transactions can be scattered over time, these components can comprise a set of business measures and attributes. These events can be independent of, or dependent on, one another. A particular sequence of events can be used to describe different stages of customer activity. For example, in a particular time period, a customer may go through a sequence of events such as: subscription > billing > payment > promotion > price plan change > service call > cancellation. Each event can involve a plurality of different business processes or operations that reflect a lifecycle of a customer, for example. Many other types of activities can be related to an identity in various embodiments according to the present invention. The computer program product can include code for defining one or more customer event types in the customer activity components and code for selecting data tables and attributes that will comprise the source of a set of data tables having a particular data schema and attributes.
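The notion of a customer passing through a sequence of events over time can be illustrated with a short sketch. It assumes each event is recorded as a (customer id, event type, date) tuple; that record shape, the sample data and the function name are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Made-up event records: (customer_id, event_type, event_date).
EVENTS = [
    (1, "payment", date(2000, 3, 1)),
    (1, "subscription", date(2000, 1, 5)),
    (2, "subscription", date(2000, 2, 10)),
    (1, "billing", date(2000, 2, 1)),
    (1, "cancellation", date(2000, 6, 30)),
]


def lifecycle(events):
    """Group events by customer and order each group by date."""
    by_customer = defaultdict(list)
    # Sort by customer, then by date, so each list comes out chronological.
    for customer_id, event_type, _when in sorted(
        events, key=lambda e: (e[0], e[2])
    ):
        by_customer[customer_id].append(event_type)
    return dict(by_customer)
```

In this sketch, lifecycle(EVENTS)[1] gives the ordered stages for customer 1, which could then be compared against an expected sequence of events.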
The computer program product can include code for determining one or more attributes based on data types in source tables and primary and foreign keys. One or more databases can be created from the schema. The database can be a customer data warehouse, for example. The computer program product can also include code for creating data movement mapping rules, and the like. Such mapping rules can provide information about translation of information in tables and attributes of data sources to the data warehouse.
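One possible shape for a data movement mapping rule is sketched below. It assumes a rule names a source table and attribute, a target table and attribute, and an optional conversion function; the class, field names and legacy table names are hypothetical. For simplicity the sketch also assumes that all rules for a given target table read from the same source table.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class MappingRule:
    """Hypothetical rule: copy one source attribute to one target attribute."""
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    convert: Callable[[Any], Any] = lambda value: value  # identity by default


def apply_rules(rules, source):
    """Translate source rows (table -> list of dict rows) into target rows."""
    target = {}
    for rule in rules:
        rows = source[rule.source_table]
        # Create one empty target row per source row on first use.
        out = target.setdefault(rule.target_table, [{} for _ in rows])
        for src_row, dst_row in zip(rows, out):
            dst_row[rule.target_column] = rule.convert(
                src_row[rule.source_column]
            )
    return target


# Made-up legacy data and rules, including a text-to-integer conversion.
SOURCE = {"legacy_cust": [{"CUST_NO": "0042", "NM": "Ada"}]}
RULES = [
    MappingRule("legacy_cust", "CUST_NO", "customer", "customer_id", int),
    MappingRule("legacy_cust", "NM", "customer", "name"),
]
```

Keeping the rules as data rather than as hard-coded translation logic is what would let the same loader be reused across different sources in a sketch of this kind.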
In an embodiment according to the present invention, the computer program product can include code for providing users the capability to define their own application-specific entities in customer activity components. In some embodiments, users can choose from among a plurality of pre-defined attributes, as well as define their own attributes. Many embodiments according to the present invention can include code for providing the capability to automatically derive data types. Embodiments can also include code for providing options to translate data from one data type to another data type. Some embodiments also include code for enabling users to change the automatically derived data types if they so choose. Embodiments can also include code for providing analysis functions of database contents, such as market basket analysis for customer buying behavior, customer valuation analysis, customer segmentation, and the like.
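As one example of such an analysis function, a very small form of market basket analysis can be sketched by counting how often pairs of items appear together. This is only a pairwise support calculation on made-up data, offered as an illustration; it is not presented as the analysis method of any particular embodiment.

```python
from collections import Counter
from itertools import combinations

# Made-up baskets: each inner list is the set of items in one purchase.
BASKETS = [
    ["phone", "case"],
    ["phone", "case", "charger"],
    ["charger"],
    ["phone", "charger"],
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    if not baskets:
        return {}
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}
```

Pairs with high support in such a sketch would be candidates for closer study of customer buying behavior.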
Numerous benefits are achieved by way of the present invention over conventional techniques. The present invention can provide techniques for providing data models that can be customized to fit different business needs, but are able to support reusable application code. Yet further, some embodiments using the techniques and data models according to the present invention can be used to solve customer data analysis problems. Many embodiments can give users the ability to customize their data models, while providing a set of generic and reusable customer data analysis functions. Many embodiments enable business applications to be built more easily and quickly than with heretofore known methods. These and other benefits are described throughout the present specification. A further understanding of the nature and advantages of the invention herein may be realized by reference to the remaining portions of the specification and the attached drawings.