More data is being received, processed, analyzed, and stored than ever before. This is because businesses recognize the importance of this data for use in analyzing consumer spending behaviors, trends, and other information patterns which allow for increased sales, customer profiling, better service, risk analysis, and so on. However, due to the enormity of the information, mechanisms such as data mining have been devised that extract and analyze subsets of data from different perspectives in attempt to summarize the data into useful information.
One function of data mining is the creation of a model. Models can be descriptive, in that they help in understanding underlying processes or behavior, and predictive, for predicting an unforeseen value from other known values. Using a combination of machine learning, statistical analysis, modeling techniques and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results.
The process of data mining generally consists of the initial exploration, model building or pattern identification and deployment (the application of the model to new data in order to generate predictions). Exploration can start with data preparation which may involve cleaning data, data transformations, selecting subsets of records. Model building and validation can involve considering various models and choosing the best one based on their predictive performance, for example. This can involve an elaborate process of competitive evaluation of the models to find the best performer. Deployment involves applying the selected model to new data in order to generate predictions or estimates of the expected outcome.
When data mining is employed to analyze a dataset, the result is usually a set of patterns that describe the dataset. A set of such patterns can be stored in a mining model. Traditional data mining algorithms can detect such data patterns in datasets. However, datasets are not rigid, and change over time, thereby causing the data patterns to shift with these underlying changes.