Computers and computer related technology have evolved significantly over the past several decades to the point where vast amounts of computer readable data is being created and stored daily. Digital computers were initially simply very large calculators designed to aid performance of scientific calculations. Only many years later had computers evolved to a point where they were able to execute stored programs. Subsequent rapid emergence of computing power produced personal computers that were able to facilitate document production and printing, bookkeeping as well as business forecasting, among other things. Constant improvement of processing power coupled with significant advances in computer memory and/or storage devices (as well as expediential reduction in cost) have led to persistence and processing of an enormous volume of data, which continues today. For example, data warehouses are now widespread technologies employed to support business decisions over terabytes of data.
Unfortunately, today, data warehouses are maintained separately within relational databases and are most often directed to application specific environments controlled by a variety of application service providers. A relational database refers to a data storage mechanism that employs a relational model in order to interrelate data. These relationships are defined by a set of tuples that all have a common attribute. The tuples are most often represented in a two-dimensional table, or group of tables, organized in rows and columns.
The sheer volume of collected data in databases (e.g., relational databases) made it nearly impossible for a human being alone to perform any meaningful analysis, as was done in the past. This predicament led to the development of data mining and associated tools. Data mining relates to a process of exploring large quantities of data in order to discover meaningful information about the data that is generally in the form of relationships, patterns and rules. In this process, various forms of analysis can be employed to discern such patterns and rules in historical data for a given application or business scenario. Such information can then be stored as an abstract mathematical model of the historical data, referred to as a data-mining model (DMM). After the DMM is created, new data can be examined with respect to the model to determine if the data fits a desired pattern or rule.
Conventionally, data mining is employed upon data in a closed environment, frequently by large corporations, for example, to understand complex business processes. This can be achieved through discovery of relationships or patterns in data relating to past behavior of a business process. Such patterns can be utilized to improve the performance of a process by exploiting favorable and avoiding problematic patterns.