Commerce over networks, particularly e-commerce over the Internet, has increased significantly over the past few years. One part of e-commerce enables users (customers) to access product information and to purchase products from various commercial Web sites (i.e., online stores). Numerous online stores currently operate on the Internet, including Amazon.com, eToys.com, Buy.com, Wal-Mart.com, LLBean.com, and Macys.com. These online stores provide various customer services that make commerce possible over Web sites. Basic services include merchandise catalogs that are both browsable and searchable by various product attributes (e.g., keyword, name, manufacturer, and model number), shopping carts, and checkout processes. Some online stores also provide advanced customer services such as wish lists, gift registries, calendars, custom configuration of products, buyer's groups, chat, e-mail notification, product evaluations, product recommendations, and in-context sales.
As the shopping experience on the Internet grows deeper and broader, it becomes important for merchants of online stores to understand and analyze the shopping behavior of their customers and to use this analysis to improve the shopping experience in their stores. A basic unit for such analysis is clickstream data from online stores. Clickstream is a generic term describing visitors' paths through one or more Web sites. A series of Web pages requested by a visitor in a single visit is referred to as a session. Clickstream data for an online store is the collection of sessions on the site. Clickstream data can be derived from raw page requests (referred to as hits) and their associated information (such as timestamp, IP address, URL, status, number of transferred bytes, referrer, user agent, and, sometimes, cookie data) recorded in Web server log files. Analysis of clickstreams shows how a Web site is navigated and used by its visitors.
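As a rough illustration of how sessions might be derived from raw hits, the following Python sketch groups log entries into sessions. It rests on several assumptions that are not stated above: the log lines follow the common Apache-style Combined Log Format, a visitor is approximated by the pair of IP address and user agent (cookie data, where available, would be a better key), and a gap of more than 30 minutes between hits starts a new session. The names parse_hit, build_sessions, and SAMPLE_LOG are hypothetical and chosen only for this example.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: 30 minutes of inactivity ends a visit.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_hit(line):
    """Parse one log line into a dict describing a hit, or None if malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    hit["time"] = datetime.strptime(hit["time"], "%d/%b/%Y:%H:%M:%S %z")
    return hit


def build_sessions(lines, timeout=SESSION_TIMEOUT):
    """Group hits by visitor, then split each visitor's hits into sessions."""
    hits_by_visitor = defaultdict(list)
    for line in lines:
        hit = parse_hit(line)
        if hit is not None:
            # Assumption: IP address plus user agent approximates one visitor.
            hits_by_visitor[(hit["ip"], hit["agent"])].append(hit)

    sessions = []
    for hits in hits_by_visitor.values():
        hits.sort(key=lambda h: h["time"])
        current = [hits[0]]
        for hit in hits[1:]:
            if hit["time"] - current[-1]["time"] > timeout:
                sessions.append(current)  # gap too long: close the session
                current = [hit]
            else:
                current.append(hit)
        sessions.append(current)
    return sessions


# Made-up sample data for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 +0000] "GET /catalog HTTP/1.0" 200 2326 "http://www.example.com/" "Mozilla/4.08"',
    '10.0.0.1 - - [10/Oct/2000:13:57:02 +0000] "GET /product/42 HTTP/1.0" 200 1800 "-" "Mozilla/4.08"',
    '10.0.0.1 - - [10/Oct/2000:15:10:00 +0000] "GET /cart HTTP/1.0" 200 900 "-" "Mozilla/4.08"',
    '10.0.0.2 - - [10/Oct/2000:13:56:00 +0000] "GET /catalog HTTP/1.0" 200 2326 "-" "Mozilla/4.7"',
]

if __name__ == "__main__":
    for session in build_sessions(SAMPLE_LOG):
        print(session[0]["ip"], [hit["url"] for hit in session])
```

With the sample data, the first visitor's three hits split into two sessions because the third hit arrives more than 30 minutes after the second, and the second visitor contributes one session. The timeout and the visitor key are design choices rather than fixed rules, so a real analysis would tune both to the site in question.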
In an e-commerce environment, clickstreams in online stores provide information essential to understanding the effectiveness of marketing and merchandising efforts, such as how customers find the store, what products they see, and what products they buy. (While not all of this information may be available from Web server log files, it can be extracted from associated data sources, such as commerce server databases, and tied together with HTTP request data.) Analyzing the information embedded in clickstream data is critical to improving the effectiveness of Web marketing and merchandising in online stores. Interest in interpreting Web usage data in Web server log files has spawned an active market for Web log analysis tools that analyze, summarize, and visualize Web usage patterns.