Data mining is a technique by which hidden patterns may be found in a group of data. True data mining does not merely change the presentation of data, but actually discovers previously unknown relationships among the data. Data mining is typically implemented as software in, or in association with, database management systems. Data mining includes several major steps. First, data mining models are generated based on one or more data analysis algorithms. Initially, the models are “untrained”, but are “trained” by processing training data and generating information that defines the model. The generated information is then deployed for use in data mining, for example, by providing predictions of future behavior based on specific past behavior.
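The progression from an untrained model to a trained and deployed one can be illustrated with a minimal sketch. The class name `FrequencyModel`, its methods, and the sample records are hypothetical and chosen only for illustration. The "algorithm" here is a simple frequency count and stands in for whatever data analysis algorithm an actual system would use.

```python
from collections import Counter, defaultdict


class FrequencyModel:
    """Hypothetical model: predicts the most frequent past outcome per behavior."""

    def __init__(self):
        # No generated information yet, so the model is "untrained".
        self.table = None

    @property
    def trained(self):
        return self.table is not None

    def train(self, records):
        """Process (behavior, outcome) training pairs and generate the
        information that defines the model."""
        counts = defaultdict(Counter)
        for behavior, outcome in records:
            counts[behavior][outcome] += 1
        # Keep only the most common outcome observed for each behavior.
        self.table = {b: c.most_common(1)[0][0] for b, c in counts.items()}
        return self

    def predict(self, behavior, default=None):
        """Deployment step: predict future behavior from past behavior."""
        if not self.trained:
            raise RuntimeError("model has not been trained")
        return self.table.get(behavior, default)


# Illustrative training data (assumed, not taken from any real system).
training_data = [
    ("viewed_cart", "purchased"),
    ("viewed_cart", "purchased"),
    ("viewed_cart", "left"),
    ("read_faq", "left"),
]

model = FrequencyModel()
model.train(training_data)
prediction = model.predict("viewed_cart")
```

In this sketch the dictionary stored in `table` plays the role of the generated information that is deployed. Calling `predict` before `train` raises an error, which mirrors the distinction between untrained and trained models.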
One application for data mining is in the analysis of data collected by companies and other organizations. These entities are amassing huge databases for a multitude of purposes including accounting, billing, profiling of customer activities and relations, manufacturing operations, web-site activity, and marketing efforts. To enhance corporate competitiveness, interest has focused on the creation of data-warehouses and the extraction of information from these warehouses. Purposes for this information include targeting marketing promotions in a cost-effective manner, improving the relevance of a web page to a visiting customer, displaying web-advertisements appropriate to the profile of a visiting customer, detecting fraudulent behavior, enhancing customer service, and streamlining operations.
Data mining typically involves long-running computational processes to perform mining operations such as building a mining model, applying the model to compute scores and probabilities, testing and evaluating the built model, cross-validating a model, computing lift, etc. Conventional data mining systems provide synchronous execution of data mining tasks. In synchronous execution, a data mining task cannot be gracefully interrupted during execution, for example, to determine the status of the task. Results are available only upon completion or termination of the task, which for a data mining task can take quite a long time. This limitation causes problems. For example, if a client process that initiated a data mining task crashes or terminates, a user cannot obtain the status of the task under synchronous execution, and so cannot determine in a timely manner what error occurred. A need arises for a data mining system that provides improved functionality over synchronous data mining systems, and which provides features such as interruptible tasks and status output.
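The two features named above, interruptible tasks and status output, can be illustrated with a minimal sketch in which a long-running task executes in a background thread. This is an assumption-laden illustration, not the design of any particular data mining system: the class `MiningTask`, its state names, and the per-step callback `step_fn` are all hypothetical.

```python
import threading


class MiningTask:
    """Hypothetical wrapper that runs a long mining job in a background thread."""

    def __init__(self, total_steps, step_fn):
        self.total_steps = total_steps
        self.step_fn = step_fn            # performs one unit of mining work
        self.completed_steps = 0
        self.state = "pending"
        self.error = None
        self._lock = threading.Lock()
        self._cancel = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        """Begin execution and return immediately to the caller."""
        with self._lock:
            self.state = "running"
        self._thread.start()

    def _run(self):
        try:
            for step in range(self.total_steps):
                # Check for an interrupt request between units of work.
                if self._cancel.is_set():
                    self._set_state("interrupted")
                    return
                self.step_fn(step)
                with self._lock:
                    self.completed_steps += 1
            self._set_state("completed")
        except Exception as exc:
            # Record the error so it can be reported through status().
            with self._lock:
                self.error = repr(exc)
                self.state = "failed"

    def _set_state(self, state):
        with self._lock:
            self.state = state

    def status(self):
        """Return a snapshot of progress; callable while the task is running."""
        with self._lock:
            return {
                "state": self.state,
                "completed_steps": self.completed_steps,
                "total_steps": self.total_steps,
                "error": self.error,
            }

    def interrupt(self):
        """Ask the task to stop at the next step boundary."""
        self._cancel.set()

    def wait(self, timeout=None):
        """Block until the task finishes or the timeout (seconds) expires."""
        self._thread.join(timeout)
```

A caller might start a task with `task.start()`, poll `task.status()` at any time, and call `task.interrupt()` to stop it. Because the status is held by the task object rather than returned only at completion, an error raised during a step remains visible in `status()["error"]`. In this single-process sketch the status does not survive a crash of the process itself; a real system would presumably need to persist it outside the client.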