The present invention relates to a process and system for integrating information from disparate databases for purposes of predicting consumer purchasing behavior. In particular, the process and system utilize distinct purchasing patterns to form unique shopping clusters that are common across the databases to be integrated. These shopping clusters are then used to more accurately predict consumer behavior.
Generally, despite the advancement of the Internet, which allows for the transfer and processing of great amounts of information, it is still difficult for companies to accumulate, process and analyze the information necessary to accurately predict a consumer's purchasing behavior. Typically, two types of information are used for this purpose, namely personal information and demographic information. Personal information includes the name, address and telephone number of a particular customer, and preferably his or her social security number. Demographic information may contain a customer's county of residence, income range (e.g., $30,000 to $35,000), highest level of education achieved (e.g., a college degree), and similar non-personally identifiable consumer information.
The collection of this type of consumer information and its use to predict consumer purchasing behavior is important to merchants because it enables them to improve the stocking of their inventory, plan better locations for their stores, and more effectively advertise and market their goods and services. The company which is best able to collect and synthesize the greatest amount of consumer information will likely be the company which is best able to predict consumer behavior and thus generate the most sales.
Predictably, although merchants today are able to determine much useful information about their own customers, what they cannot readily obtain is information about customers who shop at their competitors' stores and/or other merchants within their business category.
Thus, merchants generally turn to marketing and/or consulting agencies to collect and analyze, on the merchant's behalf, consumer personal and demographic information from a variety of sources. How well such information can be gathered, collated and analyzed becomes extremely important if it is to be an accurate predictor of consumer behavior. Presently, companies request and receive demographic information from many vendors and/or even credit issuing agencies, all of which have such information stored in their respective databases. Various methods exist which attempt to effectively integrate the information received from the disparate databases.
For instance, one such method, referred to as the “fusion” method, simply assigns all those individuals falling within the same “demographic characteristics” the same “consumer and media behavior” (e.g., likely to purchase Coca-Cola or some other designated product). Using this fusion method, for example, an individual listed in one merchant's database A who is Hispanic, aged between 25 and 34, with a high school education, and earning between $30,000 and $35,000, is “matched” with another individual in another merchant's database B who has some or all of these same demographic characteristics. These matched individuals are then assigned the same “consumer and media behavior.”
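By way of illustration only, the fusion method described above may be sketched as follows. The record layout, the field names (`ethnicity`, `age_band`, `education`, `income_band`, `behavior`) and the function names are hypothetical and are not taken from any particular prior art system. The sketch also assumes that a match requires all four characteristics to agree, whereas the method as described permits a match on some or all of them.

```python
from collections import defaultdict

# Hypothetical field names; the description does not specify a record layout.
DEMOGRAPHIC_FIELDS = ("ethnicity", "age_band", "education", "income_band")


def demographic_key(record):
    """Return the tuple of demographic characteristics used for matching."""
    return tuple(record[field] for field in DEMOGRAPHIC_FIELDS)


def fuse(database_a, database_b):
    """Assign each record in database_a the behaviors recorded in database_b
    for individuals with identical demographic characteristics.

    Assumption: a match requires every field in DEMOGRAPHIC_FIELDS to agree.
    """
    behaviors_by_key = defaultdict(set)
    for record in database_b:
        behaviors_by_key[demographic_key(record)].add(record["behavior"])

    fused = []
    for record in database_a:
        matched = behaviors_by_key.get(demographic_key(record), set())
        fused.append({**record, "assigned_behaviors": sorted(matched)})
    return fused
```

In this sketch, an individual in database A with no demographic counterpart in database B simply receives an empty list of behaviors. The assignment rests entirely on shared demographics and not on any observed purchase by the individual concerned.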
Another conventional technique, called a “geo-matching” method, groups all individuals having the same or adjacent geographical location (e.g., a zip code, a census block, etc.) and assigns these individuals the identical “consumer and media behavior.”
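A geo-matching assignment of this kind may likewise be sketched as follows, again by way of illustration only. The field names (`zip_code`, `behavior`) and the function name are hypothetical. The sketch groups individuals by identical zip code only and does not treat adjacent locations. It further assumes that the behavior assigned to a group is the most frequently observed behavior within that group, a selection rule which the technique as described does not specify.

```python
from collections import Counter, defaultdict


def geo_match(records, location_field="zip_code"):
    """Assign every individual in a location the same behavior.

    Assumption: the shared behavior is the one most often observed among
    the individuals in that location whose behavior is known.
    """
    observed = defaultdict(Counter)
    for record in records:
        if record.get("behavior") is not None:
            observed[record[location_field]][record["behavior"]] += 1

    matched = []
    for record in records:
        counts = observed.get(record[location_field])
        # Locations with no observed behavior receive no assignment.
        assigned = counts.most_common(1)[0][0] if counts else None
        matched.append({**record, "assigned_behavior": assigned})
    return matched
```

Here every individual in a zip code receives the same assigned behavior regardless of his or her own purchasing history.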
Although these techniques are still widely used in other parts of the world, they have become disfavored in the United States because of the weak correlation discovered between the general variables (i.e., the demographic characteristics information) and the actual behavior of the consumer. Thus, the above-described prior art techniques of integrating and utilizing demographic information from two or more disparate databases have provided very limited success in predicting consumer and media behavior.
Accordingly, there is a need for a way to better utilize consumer purchasing information existing in disparate databases to more accurately predict the purchasing behavior of consumers.