Data mining is a technique for finding hidden patterns in a collection of data. True data mining does not merely change the presentation of data; it discovers previously unknown relationships among the data. Data mining is typically implemented as software within, or in association with, database management systems. Data mining includes several major steps. First, data mining models are generated based on one or more data analysis algorithms. Initially, the models are “untrained”; they are then “trained” by processing training data and generating information that defines the model. The generated information is then deployed for use in data mining, for example, by providing predictions of future behavior based on specific past behavior.
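The generate, train, and deploy steps described above can be illustrated with a minimal sketch. It is not drawn from any particular data mining system: the class `FrequencyModel`, its methods, and the sample behaviors are all hypothetical, and the "algorithm" is simply a count of which outcome most often followed each past behavior.

```python
from collections import Counter, defaultdict


class FrequencyModel:
    """Hypothetical toy model: predicts the outcome most often seen for a behavior."""

    def __init__(self):
        # An "untrained" model holds no information that defines it yet.
        self.table = {}

    def train(self, training_data):
        # Process (past_behavior, outcome) pairs and record, for each behavior,
        # the outcome that occurred most frequently in the training data.
        counts = defaultdict(Counter)
        for behavior, outcome in training_data:
            counts[behavior][outcome] += 1
        self.table = {b: c.most_common(1)[0][0] for b, c in counts.items()}
        return self

    def predict(self, behavior, default=None):
        # Deployment: use the generated information to predict future behavior.
        return self.table.get(behavior, default)


# Illustrative training data (assumed, not taken from any real database).
training_data = [
    ("viewed_product", "purchased"),
    ("viewed_product", "purchased"),
    ("viewed_product", "left"),
    ("abandoned_cart", "left"),
]

model = FrequencyModel().train(training_data)
prediction = model.predict("viewed_product")
```

Here the dictionary `table` plays the role of the generated information that defines the trained model. A real system would use a far richer algorithm, but the separation between training and deployment is the same.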
One application of data mining is the analysis of data collected by companies and other organizations. These entities are amassing huge databases for a multitude of purposes, including accounting, billing, profiling of customer activities and relations, manufacturing operations, website activity, and marketing efforts. To enhance corporate competitiveness, interest has focused on the creation of data warehouses and the extraction of information from these warehouses. Uses for this information include targeting marketing promotions in a cost-effective manner, improving the relevance of a web page to a visiting customer, displaying web advertisements appropriate to the profile of a visiting customer, detecting fraudulent behavior, enhancing customer service, and streamlining operations.
One important aspect of data mining is the building of models of adequate quality, that is, models that accurately represent the data to be mined and that provide predictions of adequate quality. In conventional data mining systems, the building of quality models is a hit-or-miss process. A user who wishes to build a data mining model must specify a number of parameters that control the model building process and hope that those parameters will lead to a model of acceptable quality. Although conventional data mining systems may provide some tools to assess the quality of a model, the user must still make any necessary adjustments manually in an attempt to improve the model quality. A need arises for a data mining system that can automatically generate data mining models of adequate or even optimum quality in a way that reduces the need for user interaction, lowers the cost of model building, and improves its quality.
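To make the contrast with manual parameter selection concrete, the following sketch automates the trial-and-error loop with an exhaustive search over candidate parameter values. It is only one possible approach and is not presented as the method of any particular system. The functions `build_model`, `accuracy`, and `auto_build`, the parameters `threshold` and `min_visits`, and the sample records are all hypothetical.

```python
from itertools import product


def build_model(threshold, min_visits):
    """Hypothetical model: flags a customer as a likely buyer."""
    def predict(record):
        spend, visits = record
        return spend >= threshold and visits >= min_visits
    return predict


def accuracy(model, labeled):
    # Quality measure: fraction of labeled records predicted correctly.
    correct = sum(1 for record, label in labeled if model(record) == label)
    return correct / len(labeled)


def auto_build(labeled, grid):
    # Try every parameter combination and keep the best-scoring one,
    # replacing the user's manual adjust-and-retry cycle.
    best_params, best_score = None, -1.0
    for values in product(*grid.values()):
        candidate = dict(zip(grid.keys(), values))
        score = accuracy(build_model(**candidate), labeled)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score


# Illustrative labeled records: ((spend, visits), bought). Assumed data.
validation = [
    ((120, 5), True),
    ((30, 1), False),
    ((80, 4), True),
    ((90, 1), False),
    ((20, 6), False),
]
grid = {"threshold": [10, 50, 100], "min_visits": [1, 3]}

best_params, best_score = auto_build(validation, grid)
```

In practice the quality measure would be computed on data held out from training, and an exhaustive search may be too costly when there are many parameters. The sketch shows only that the select, build, assess, and adjust cycle can be driven by the system rather than by the user.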