1. Field of the Invention
This invention relates to a computer-implemented data mining system, and in particular, to a system for analyzing customer transaction data using Factor Analysis/Retail Data Mining Segmentation in a distributed relational data mining system.
2. Description of Related Art
Many computer-implemented systems are used to analyze commercial and financial transaction data. In many instances, such data is analyzed to gain a better understanding of customer behavior by analysis of customer transactions.
Generally, customer transaction data is organized into “baskets” and stored in two-dimensional data tables composed of rows and columns, wherein each row comprises one or more transactions and each column is an attribute of the transactions. These attributes, called observed variables, include the dollar value of each transaction, quantities bought in different departments, transaction time, mode of payment, etc. Companies often use one or more data analysis tools to mine such customer transaction data, in order to identify patterns in the customers' behavior.
Prior art tools for analyzing customer transaction data often involve one or more of the following techniques:
1. Ad hoc querying: This methodology involves the iterative analysis of transaction data by human effort, using querying languages such as SQL.
2. On-line Analytical Processing (OLAP): This methodology involves the application of software front-ends that automate the querying of relational databases storing transaction data and the production of reports therefrom.
3. Statistical packages: This methodology requires the sampling of transaction data, the extraction of the data into flat file or other proprietary formats, and the application of general purpose statistical or data mining software packages to the data.
Factor Analysis (FA) provides a technique that can uncover factors underlying customer purchasing behavior through a logically justifiable partitioning of the observed variables. Each factor represents an affinity group, i.e., a group of observed variables (e.g., products, departments, etc.) that accounts for a significant percentage (e.g., 80%) of a basket's dollar value.
The affinity groups provide data reduction or compression, as the dimensionality of the original customer transaction data is reduced through the substitution of the original numerous observed variables with a smaller set of factors that preserves most of the behavioral patterns present in the original customer transaction data. Despite this reduction, the factors are able to explain most of the customers' purchasing patterns and the interrelationships between the original variables.
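By way of illustration only, the partitioning of observed variables into affinity groups may be sketched as follows. The sketch assumes principal-component factoring of the correlation matrix, which is one common way to extract factors and is not stated here to be the method of the invention. The function name `extract_affinity_groups`, the department names, and the synthetic data are hypothetical.

```python
import numpy as np

def extract_affinity_groups(X, names, n_factors):
    """Partition observed variables (columns of X) by dominant factor loading.

    Assumption: factors are extracted by principal-component factoring of
    the correlation matrix; other extraction methods are equally possible.
    """
    # Correlation matrix of the observed variables.
    R = np.corrcoef(X, rowvar=False)
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:n_factors]
    # Loadings are eigenvectors scaled by the square root of the eigenvalue.
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    # Each variable joins the factor on which it loads most strongly.
    owner = np.argmax(np.abs(loadings), axis=1)
    groups = [[] for _ in range(n_factors)]
    for name, k in zip(names, owner):
        groups[k].append(name)
    # Share of total variance retained by the kept factors.
    explained = eigvals[order].sum() / eigvals.sum()
    return groups, explained

# Hypothetical synthetic baskets: two latent purchasing behaviors drive
# spending in five departments (three food-related, two hardware-related).
rng = np.random.default_rng(0)
n = 2000
food, hardware = rng.normal(size=(2, n))
noise = 0.3 * rng.normal(size=(n, 5))
X = np.column_stack([food, food, food, hardware, hardware]) + noise
names = ["grocery", "produce", "dairy", "tools", "paint"]

groups, explained = extract_affinity_groups(X, names, n_factors=2)
print(groups, round(float(explained), 2))
```

In this sketch, five observed variables are replaced by two factors, and the `explained` value indicates how much of the original variance the reduced representation retains.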
Each affinity group is used to define a customer destination segment, since most of a basket's dollar value has the affinity group as its destination. An analysis of a customer destination segment may reveal its strategic importance to the retailer. The analysis of the metrics of destination segments (traffic, quantities, dollar value, margins, etc.) may reveal that some of these destination segments generate a significant level of “traffic” that is substantially profitable.
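As a minimal sketch of the preceding paragraph, a basket may be assigned to a destination segment when one affinity group accounts for at least a threshold share of the basket's dollar value, after which per-segment metrics can be accumulated. The function names, the 80% default threshold, the group names, and the sample baskets are assumptions made for illustration. Only traffic and dollar value are computed; quantities and margins would require additional data.

```python
from collections import defaultdict

def assign_destination_segment(basket, affinity_groups, threshold=0.8):
    """Return the segment whose affinity group holds >= threshold of the
    basket's dollar value, or None if no group does.

    Assumes the affinity groups are disjoint, so with a threshold above
    0.5 at most one group can qualify.
    """
    total = sum(basket.values())
    if total <= 0:
        return None
    for segment, variables in affinity_groups.items():
        share = sum(basket.get(v, 0.0) for v in variables) / total
        if share >= threshold:
            return segment
    return None

def segment_metrics(baskets, affinity_groups, threshold=0.8):
    """Accumulate traffic (basket count) and dollar value per segment."""
    metrics = defaultdict(lambda: {"traffic": 0, "dollar_value": 0.0})
    for basket in baskets:
        segment = assign_destination_segment(basket, affinity_groups, threshold)
        if segment is not None:
            metrics[segment]["traffic"] += 1
            metrics[segment]["dollar_value"] += sum(basket.values())
    return dict(metrics)

# Hypothetical affinity groups and baskets (dollar value per department).
affinity_groups = {
    "food": ["grocery", "produce", "dairy"],
    "hardware": ["tools", "paint"],
}
baskets = [
    {"grocery": 40.0, "produce": 30.0, "dairy": 20.0, "paint": 10.0},
    {"tools": 50.0, "paint": 45.0, "dairy": 5.0},
    {"grocery": 50.0, "tools": 50.0},  # no group reaches the threshold
]

metrics = segment_metrics(baskets, affinity_groups)
print(metrics)
```

The third basket is split evenly between the two groups and is therefore left unassigned, which reflects the requirement that most of a basket's dollar value have a single affinity group as its destination.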
Nonetheless, there remains a need for a computer-automated system that enables the analysis of customer transaction data.