1. Technical Field
The present invention relates to an improved data processing system. In particular, the present invention relates to a method and system for predicting customer behavior based on data network geography.
2. Description of Related Art
Currently, when artificial intelligence algorithms are used to discover patterns in customer behavior, it is necessary to create training data sets in which the outcome to be predicted is known, as well as testing data sets in which that outcome is also known, so that the accuracy of a predictive algorithm can be validated. The predictive algorithm, for example, may be designed to predict a customer's propensity to respond to an offer or the customer's propensity to buy a product.
The data used to train and test the algorithm are selected using a random selection procedure, such as selecting data based upon a random number generator, or by some other means to ensure that both the training data set and the testing data set are representative of the entire data population being evaluated. Tests of randomness can then be performed on each of the attributes in the data sets, e.g., the demographic information of the individuals, to determine whether the data sets represent a randomly selected population.
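By way of illustration only, the random selection and representativeness check described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of any particular prior system: the customer records, the `age` and `responded` attributes, the 30% testing fraction, and the helper names `split_train_test` and `mean_shift_z` are all hypothetical, and the z-score is only a rough check that ignores the finite population correction.

```python
import random
import statistics


def split_train_test(records, test_fraction=0.3, seed=0):
    """Randomly partition records into (training, testing) lists."""
    rng = random.Random(seed)      # seeded generator so the split is repeatable
    shuffled = list(records)       # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mean_shift_z(sample, population, attribute):
    """Rough z-score of the sample mean of an attribute against the population mean.

    Values far from zero suggest the sample is not representative of the
    population for that attribute.
    """
    pop_values = [r[attribute] for r in population]
    sample_values = [r[attribute] for r in sample]
    pop_mean = statistics.fmean(pop_values)
    pop_sd = statistics.pstdev(pop_values)
    standard_error = pop_sd / len(sample_values) ** 0.5
    return (statistics.fmean(sample_values) - pop_mean) / standard_error


# Hypothetical customer population with one demographic attribute (age)
# and one known outcome (whether the customer responded to an offer).
_rng = random.Random(42)
customers = [
    {"id": i, "age": _rng.randint(18, 80), "responded": _rng.random() < 0.1}
    for i in range(1000)
]

train, test = split_train_test(customers, test_fraction=0.3, seed=1)

for name, subset in (("training", train), ("testing", test)):
    z = mean_shift_z(subset, customers, "age")
    print(f"{name}: n={len(subset)}, z-score of mean age = {z:.2f}")
```

A check of this kind covers only the attributes present in the records. An influence that is not recorded as an attribute cannot be detected by it.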
While the above approach to selecting testing and training data sets may be suited for some applications, the purchasing behavior of customers is not based only on demographic and psychographic information. Ease of access to various goods and services may also influence the customer's ultimate purchase patterns. That is, if a customer is able to obtain access to the goods and services more easily, the customer is typically more likely to purchase such goods and services.
Today, customers are purchasing more and more goods and services over data networks, such as the Internet. In doing so, customers must often navigate a morass of web sites and web pages to ultimately arrive at the goods and services that they wish to purchase. The web sites and web pages that make up the data network are collectively referred to as the data network geography.
Many times, a customer may become frustrated while navigating the data network geography and may abandon the endeavor. Other times, the customer may simply purchase goods and services from the first web site or web page the customer locates that provides the goods and services, without bothering to look at other web sites that may offer the same goods and services under different terms, such as pricing, incentives, and the like.
Such influences of the data network geography on customer behavior are not taken into consideration when training and using predictive algorithms to predict customer behavior. Thus, bias may be introduced into the testing data set, the training data set, or both, making either or both nonrepresentative of the overall customer database.
Therefore, it would be beneficial to have a method and system for correlating a customer's effort in navigating a data network with the customer's purchase behavior. It would further be beneficial to have a method and system for predicting a customer's behavior based on the geography of the data network. Furthermore, it would be beneficial to have a method and system for evaluating the training of a predictive algorithm to determine whether the training and testing data sets adequately take into consideration the influences of the data network geography on customer behavior.