The global Internet has become a mass medium on par with radio and television. And just like radio and television content, the content on the Internet is largely supported by advertising dollars. The main advertising-supported portion of the Internet is the “World Wide Web,” which displays HyperText Markup Language (HTML) documents distributed using the HyperText Transfer Protocol (HTTP).
As with any advertising-supported business model, there need to be reliable systems for collecting information on what the web viewers are interested in viewing and how much each item is viewed. Radio and television advertising uses ratings services that assess how many people are listening to a particular radio program or watching a particular television program in order to analyze viewer interest in the various programs. With the World Wide Web portion of the Internet, web site publishers have the luxury of being able to collect detailed web viewer information, since each and every access to a web page requires the web site server to receive a request and provide a response. Thus, when any web page request is received by a web site server, a web site usage accounting system on that server can count the web page view and store information about the web page request.
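The request-and-response accounting described above can be illustrated with a minimal sketch. It is not drawn from any particular server implementation: the names `PageRequest`, `UsageAccounting`, and `handle_request`, and the choice of fields stored for each request, are hypothetical and assumed only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PageRequest:
    """One web page request as seen by the server (fields are assumed)."""
    path: str
    client_address: str
    received_at: datetime


@dataclass
class UsageAccounting:
    """Counts page views and stores information about each request."""
    view_counts: Counter = field(default_factory=Counter)
    request_log: list = field(default_factory=list)

    def record(self, request: PageRequest) -> None:
        # Count the page view, then keep the request details for later analysis.
        self.view_counts[request.path] += 1
        self.request_log.append(request)


def handle_request(accounting: UsageAccounting, path: str, client_address: str) -> str:
    """Hypothetical server hook: account for the request, then respond."""
    request = PageRequest(path, client_address, datetime.now(timezone.utc))
    accounting.record(request)
    # Placeholder response body; a real server would return the stored page.
    return f"<html><body>Content for {path}</body></html>"
```

In this sketch, calling `handle_request` once per incoming request leaves `view_counts` holding a per-page view total and `request_log` holding one stored record per request, which is the raw material a later analysis step would work from.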
Since every single web page view can be counted, the web site usage accounting system creates an enormous volume of valuable web viewer information. In order to effectively analyze this enormous volume of web viewer information, data analysis tools are required. Thus, it would be very desirable to create tools that efficiently segment the enormous volume of web viewer information into smaller groups and process the information.