Digital Ecosystems and Data Creation. A growing number of interactions between individuals and organizations and among organizations (e.g. transactions, exchanges, communications, business processes, and other actions) are conducted through computing systems and across computing networks. The environment within which each organization conducts these interactions with others (e.g. prospects, customers, intermediate product and service suppliers and other intermediaries across the value chain between the ultimate producer of a good or service and its ultimate consumer) can be thought of as its “digital ecosystem”.
These interactions are person-to-person (i.e. human-aided), person-to-system (i.e. partially automated), and system-to-system (i.e. fully automated). Each interaction generates resultant data. As digital ecosystems become more pervasive and interconnected, as the number of intermediaries connected to them grows, and as the numbers and types of interactions being conducted across them expand, many organizations have access to growing volumes of data derived from a variety of different sources.
Need for Improved Analysis and Decisioning Systems. This data is potentially useful to organizations that wish to better manage their business operations. However, much of this data goes unused while being stored in systems that are costly to maintain.
Prior Art Data Mining Systems. In this context, the concept of “data mining” has emerged commercially over the past decade as one means for organizations to analyze data in order to extract useful knowledge therefrom. In general, data mining, or knowledge discovery, involves the analysis of data structures for novel or unknown relationships that provide information that can be acted upon.
Current data mining systems are based on the use of knowledge discovery algorithms to automatically analyze data to discover relationships, patterns, knowledge or other useful information. These data mining systems (“DM Systems”) currently consist of data mining software tools incorporating data mining algorithms (“DM Tools”), and, more recently, relational database management systems (“RDBMS”) incorporating data mining algorithms, or enterprise software applications (“DM Applications”) containing embedded data mining algorithms.
Referring to FIG. 1, there is shown a block diagram illustrating a typical data mining system 100 in accordance with the prior art. Data generated by person-to-person, person-to-system, and system-to-system interactions is exchanged and captured through the Internet, by means of autonomous and interconnected wired and wireless networks, which support a wide range of appliances, applications, and interfaces (e.g. browsers, PCs, PDAs, telephones, etc.) that connect individuals and organizations. This data, which is typically stored in a relational database system (“RDBMS System”) 110 or in a data mining application (“DM Application”) 120, is extracted by a proprietary data mining system 130 operated by a skilled analyst. The output of the data mining process engaged in by such analyst, in the form of a predictive model or set of scores 140 relevant to the business objective and data structure being analyzed, may be returned to the RDBMS System 110 or DM Application 120 through a batch process as static scores. The RDBMS System 110 typically includes proprietary data mining algorithms and supports the application of these algorithms to data stored in the RDBMS System. Data mining models created by the RDBMS System are stored in the RDBMS as tables 112, and applications 111 can access these results if desired via a query or through an application programming interface (“API”) with additional query formulation or coding. The DM Application 120 typically includes a data mining engine developed or licensed by the developer of the DM Application. Data mining typically occurs on data stored in the application, and resultant data mining models or scores are stored in the application for use in accordance with the internal, proprietary logic of the specific application (e.g. to support automated segmentation in a marketing application).
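By way of illustration only, the batch return of static scores to the RDBMS System 110 and their later retrieval by an application 111 through a query might be sketched as follows. This is a minimal sketch under stated assumptions and does not form part of any prior art system described herein: the table names (`customers`, `scores`), the helper functions, and the linear scoring formula are all hypothetical, and an in-memory SQLite database merely stands in for the RDBMS System.

```python
import sqlite3


def score_customer(balance: float, purchases: int) -> float:
    """Hypothetical stand-in for a predictive model built by an analyst."""
    return round(0.001 * balance + 0.05 * purchases, 3)


def run_batch_scoring(conn: sqlite3.Connection) -> int:
    """Score every customer once and store the results as static scores."""
    rows = conn.execute(
        "SELECT id, balance, purchases FROM customers"
    ).fetchall()
    scores = [(cid, score_customer(bal, n)) for cid, bal, n in rows]
    # Replace the previous batch; scores do not change between batch runs.
    conn.execute("DELETE FROM scores")
    conn.executemany(
        "INSERT INTO scores (customer_id, score) VALUES (?, ?)", scores
    )
    conn.commit()
    return len(scores)


def lookup_score(conn: sqlite3.Connection, customer_id: int):
    """What a consuming application does: query the stored score table."""
    row = conn.execute(
        "SELECT score FROM scores WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers "
        "(id INTEGER PRIMARY KEY, balance REAL, purchases INTEGER)"
    )
    conn.execute(
        "CREATE TABLE scores (customer_id INTEGER PRIMARY KEY, score REAL)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, 1000.0, 4), (2, 250.0, 10)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    run_batch_scoring(conn)
    print(lookup_score(conn, 1))
```

In this sketch, a change to the underlying customer data is not reflected in the stored score until the batch process is run again, which is the sense in which such scores are static.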
Current innovations in data mining systems have focused on the nature of data mining algorithms, and the incorporation thereof into the DM System. The focus has been on the process of sourcing data for and creating data mining models, and the development of aids for the user to assess the results of or process of validating a data mining model. For example, U.S. Pat. No. 6,112,194 (Bigus) discloses a feedback mechanism for monitoring performance of data mining tasks. The feedback mechanism consists of a user-selected mining technique type for a data mining operation and an associated quality measure type. A quality indicator, calculated from the mining technique type and the quality measure type, is displayed to the user during the mining operation to help the user decide whether to stop the data mining operation and reconfigure the operation. Other patents related to data mining include U.S. Pat. No. 6,226,648 (Appleman, et al.); U.S. Pat. No. 6,219,775 (Wade, et al.); U.S. Pat. No. 6,216,134 (Heckerman, et al.); U.S. Pat. No. 6,208,989 (Dockter, et al.); U.S. Pat. No. 6,205,472 (Gilmour); U.S. Pat. No. 6,192,356 (Eyles); U.S. Pat. No. 6,108,004 (Medl); U.S. Pat. No. 6,081,788 (Appleman, et al.); U.S. Pat. No. 6,055,510 (Henrick, et al.); U.S. Pat. No. 5,875,285 (Chang); and, U.S. Pat. No. 5,787,425 (Bigus). All of these patents describe various data mining techniques and algorithms and improvements thereto.
DM Tools (e.g. SAS Enterprise Miner™ and ANGOSS KnowledgeSTUDIO™) are software applications installed on a dedicated computer system for use by skilled analysts to perform data mining tasks (i.e. data exploration or knowledge discovery and predictive modeling). These DM Tools enable users to import, analyze and model data sourced from a variety of files, database tables, or other data structures. RDBMS Systems (e.g. Microsoft SQL Server 2000™ and Oracle 9i™) incorporate data mining algorithms into the relational database environment, enabling the mining of, or discovery of relationships between, data stored in tables contained in the relational database.
A limitation of these RDBMS Systems is that, rather than providing analyst interfaces, they typically provide an application programming interface that enables third parties to design and develop applications and interfaces integrated with these RDBMS Systems. DM Applications provide user interfaces and functionality enabling the application of data mining algorithms to data captured by or accessible to the DM Application in a specific business domain. The data mining capabilities of DM Applications may be developed by the application provider (e.g. E.Piphany™, Blue Martini™, etc.), or through integration of the enterprise software application with a data mining engine incorporated into an RDBMS System, or using an integrated data mining engine that provides an application programming interface for this purpose (e.g. ANGOSS KnowledgeSERVER™ and KnowledgeSTUDIO™ SDK).
In summary, the focus of existing DM Systems is to support the application of data mining algorithms to data structures to promote the development and validation of a data mining model. The data mining models produced by these DM Systems are written in a proprietary format in a project-centric fashion, for use within a proprietary application environment of the DM vendor, for deployment into the organization's production environment as a “score” through a batch process to a mainframe or RDBMS System.
As a result, the traditional approach to data mining has been project-oriented and is designed for use by a relatively small group of skilled analysts. Their activities include defining a business objective; sourcing relevant data from different operational data sources for analysis; creating a data mining model based on this analysis; displaying the data mining model in some limited fashion in the form of a report (i.e. a visualization) or creating scores (i.e. a numeric representation) placed in a relational database or mainframe computer; and, measuring the validity and usefulness of the data mining model by reference to the defined business objective.
Prior art systems are deficient because of inefficiencies associated with the data mining process and the limited accessibility and usefulness of data mining models once produced. Data mining models are costly to create, requiring skilled personnel to operate and expensive systems to maintain, monitor and optimize, and are difficult to integrate with the systems and applications comprising the organization's digital ecosystem where such data mining models might be usefully deployed. Even when a data mining model (e.g. a customer profitability model, a transaction fraud model, or a next product suggestion model) has been created, it has proven to be extremely difficult to integrate the data mining model with other systems in the enterprise where this knowledge might be more usefully applied, such as other applications, or applications operating in geographically remote locations or on different computing platforms.
Accordingly, it is difficult to create and move these data mining models (i.e. scores, rules, and rules systems) to heterogeneous operational environments or to a variety of different applications and business processes at the enterprise level. Similarly, even where a data mining score, rule, or rules system is moved to a production environment, it is difficult to fully implement data mining models which rely on a variety of different data inputs from disparate data sources or to assess the performance of the data mining score, rule, or rules system with existing systems in the operational environment. Furthermore, it is difficult to integrate the results of a multitude of data mining models with other operational systems used to guide decision making. For example, it is difficult to integrate the results of a profitability model with credit limit approvals, new product offers or other similar business strategies and business processes relating to customers based on their profitability.
A need therefore exists for the effective deployment, management and optimization of a multitude of data mining models in a plurality of computing environments, and for the association of these data mining models with business strategies and business processes. Consequently, it is an object of the present invention to obviate or mitigate at least some of the above-mentioned disadvantages.