1. Technical Field
The invention relates generally to transaction data mining. More particularly, the invention relates to a method for leveraging existing transactional data with genetic algorithm methodology to create highly predictive sets of variables spanning all of the available data.
2. Background
Many business decisions require an estimate of some future event. The approach in use today amounts to a process of brainstorming and programming.
Thus, a group of analysts gathers to brainstorm variables and operators. A list of potential variables is made and programming specifications are written. Variable generation code follows with the usual development cycle. The code is audited and the full analytic dataset is created. Analysis may reveal new variables to add, which triggers re-creation of the analytic dataset, and the cycle repeats until the project schedule expires, the analysts and programmers expire, or resources are exceeded.
The downside of brainstorming and programming in this fashion is that the process is very labor intensive. The analytic datasets are generally limited to about 600 variables. Analysts and programmers become fatigued, project schedules slip, and any more variables are too many to handle, let alone analyze. This limits the investigation that is possible. Other problems result. Knowledge transfer across projects depends on brainstorming success, which is not attained; results do not appear data driven; and more questions and issues ensue.
The business problem to be solved can have to do with cost issues such as: 1) the chance of bankcard transaction fraud; 2) the chance that an Internet purchase is fraudulent; 3) the chance that a Web site visitor is going to buy something; 4) the chance that a consumer defaults on a new loan for which they are applying; and 5) the chance that a consumer is going to accept an offer. The ability to hedge on these kinds of issues informs the decision maker.
A business decision estimate can be improved by mining transaction data for predictor trends in the target area. What is needed are better and faster ways of using transaction data to inform decision makers by leveraging the available transaction data with less labor-intensive methods, i.e., methods that can span the full datastore, not just the data humanly foreseeable to be the most valuable from the outset.