The prior art addresses a number of data mining problems such as classification, association discovery, sequential patterns, outlier detection, time series forecasting, and clustering. Data mining techniques have been applied to both market basket and web data. Application of data mining techniques to the web, i.e., web data mining, has followed three main directions: web content mining, web structure mining, and web usage mining. Generally, web usage mining is the process of applying data mining techniques to the discovery of usage patterns from web data. However, present web usage mining research does not address mining the retention behavior among a sequence of pages or sites.
For example, conventional data mining techniques fail to answer various questions regarding web usage mining or funneling: What percentage of hits on a web network home page is followed by hits on a specific web service site? What percentage of these hits is followed by hits on a specific web service site and then by hits on another specific web service site? What are the most interesting clickpath funnels starting with these hits? Where does the greatest drop-off rate occur after a user has hit the web network home page?
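By way of illustration only, the retention questions posed above may be sketched as follows. The sketch assumes that usage data has already been grouped into per-user sessions, each an ordered list of page or site names, and that a hit "followed by" another hit need not be immediately adjacent; neither assumption is taken from the prior art. The names `funnel_counts`, `retention`, and the sample page names are hypothetical.

```python
from typing import List, Sequence


def funnel_counts(sessions: Sequence[Sequence[str]],
                  funnel: Sequence[str]) -> List[int]:
    """Count the sessions that reach each funnel point, in order.

    Assumption: a session reaches step k if it contains the first k+1
    funnel points as an ordered subsequence (other hits may intervene).
    Each session is counted at most once per step.
    """
    counts = [0] * len(funnel)
    for session in sessions:
        step = 0
        for page in session:
            if step < len(funnel) and page == funnel[step]:
                counts[step] += 1
                step += 1
    return counts


def retention(counts: Sequence[int]) -> List[float]:
    """Fraction of the sessions at the first funnel point retained at each step."""
    first = counts[0] if counts else 0
    return [c / first if first else 0.0 for c in counts]


def step_drop_off(counts: Sequence[int]) -> List[float]:
    """Drop-off rate between each pair of consecutive funnel points."""
    return [1 - b / a if a else 0.0 for a, b in zip(counts, counts[1:])]


if __name__ == "__main__":
    # Hypothetical sessions; the page names are placeholders.
    sessions = [
        ["home", "mail", "news"],
        ["home", "mail"],
        ["home", "search"],
        ["home"],
        ["news", "home", "mail", "news"],
    ]
    counts = funnel_counts(sessions, ["home", "mail", "news"])
    print(counts)                 # sessions reaching each funnel point
    print(retention(counts))      # share of home-page sessions retained
    print(step_drop_off(counts))  # largest value marks the greatest drop-off
```

In this sketch, the position of the largest value returned by `step_drop_off` indicates the step at which the greatest drop-off occurs for the given funnel.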
In addition, presently available data mining techniques are unable to define any measure of “interestingness” with regard to funnels. Instead of specifying the funnel points (pages) of interest at each step, a business manager may want to know all “interesting” funnels starting with given funnel points. For example, for users who access a portal home page, what is the most common behavior after reading the page? Where do users begin to leave the site? When do users abandon the network? These all translate into the questions: What are the widest funnels? What are the narrowest funnels? What are the funnel points? In these cases, when attempting to determine funnel drop-off rates, the funnel points are not provided in advance. Conventional data mining cannot provide an analysis that yields the most interesting funnels.
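As a rough illustration of the case in which only the starting funnel point is given, the following sketch ranks what users do immediately after hitting a starting page. It assumes, purely for illustration, that the "width" of a two-step funnel is the share of sessions hitting the starting page that proceed to a given next page; this is a stand-in and not a definition of "interestingness" drawn from the prior art. The function `rank_next_steps` and the `EXIT` label are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical label for sessions that end at the starting page.
EXIT = "<exit>"


def rank_next_steps(sessions: Sequence[Sequence[str]],
                    start: str) -> List[Tuple[str, float]]:
    """Rank the hit that immediately follows the first hit on `start`.

    Returns (next_page, share) pairs, widest first, where share is the
    fraction of sessions containing `start` that continue to next_page.
    """
    next_hits: Counter = Counter()
    total = 0
    for session in sessions:
        pages = list(session)
        if start not in pages:
            continue
        total += 1
        i = pages.index(start)  # first hit on the starting page
        next_hits[pages[i + 1] if i + 1 < len(pages) else EXIT] += 1
    return [(page, n / total) for page, n in next_hits.most_common()]


if __name__ == "__main__":
    # Hypothetical sessions; the page names are placeholders.
    sessions = [
        ["home", "mail", "news"],
        ["home", "mail"],
        ["home", "search"],
        ["home"],
        ["news", "home", "mail", "news"],
    ]
    for page, share in rank_next_steps(sessions, "home"):
        print(f"home -> {page}: {share:.0%}")
```

Under this assumption, the first entry of the result corresponds to the widest two-step funnel from the starting page and the last entry to the narrowest, while the share attributed to `EXIT` indicates how often users leave at that point.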
For these reasons, a system for clickpath funnel analysis is desired to address one or more of these and other disadvantages.