The Internet is widely used for purposes such as communicating information, conducting business, and personal entertainment. Information regarding Internet usage may thus be valuable, as it may indicate how often and for how long a particular website is used, or a user's preference among various websites. In some cases, such as in a marketing study, Internet usage analysis may be performed to understand customers' online preferences. For example, an online user's activities may be monitored and used as a basis for targeting advertisements to the user. In other cases, the human resources department of a corporation may use Internet usage analysis to monitor employees' Internet activities.
Among the several techniques used to analyze user activities on the Internet, proxy log analysis has recently been developed and is widely used. For example, proxy log analysis may be used to determine how often an employee “surfs” the Internet during work hours. Proxy logs may be created by various tools on a proxy server, such as Microsoft Proxy Server 2.0 or Microsoft ISA Server. Proxy logs record information about Internet events (known as “hits”), such as the URL of a hit and the time at which the URL was downloaded. Based on such proxy logs, Internet usage time and usage frequency can be determined.
One system and method for monitoring individual Internet usage is described in U.S. Pat. No. 6,606,657 to Zilberstein (“the '657 patent”). The '657 patent is directed to gathering and disseminating detailed information regarding web site visitation. The system and method described in the '657 patent may obtain information regarding the sites that have been visited, the duration and times of such visits, the most popular sites, the most popular jump sites from a particular web page, etc. In particular, the '657 patent discloses a method for monitoring Internet usage by a user at a terminal, where the method includes, among other steps, detecting access by the user to a new website, determining a first time interval between the detected access to the new website and a detected access to a previous website, and determining a second time interval which indicates a period of time during which the user actively accessed the previous website. Information related to the access to both the new website and the previous website may be obtained from a proxy server.
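The first time interval can be illustrated with a short sketch. This is not the implementation of the '657 patent. It assumes that a "new website" is detected whenever the host name of a hit differs from that of the preceding hit, and that the hits belong to a single user and are already sorted by time. The function name `site_change_intervals` is hypothetical, and the second (active access) interval is not computed here.

```python
from datetime import datetime
from urllib.parse import urlsplit

def site_change_intervals(hits):
    """hits: list of (datetime, url) for one user, sorted by time.

    Returns (previous_host, new_host, seconds) tuples, where seconds is the
    time between the first detected access to the previous host and the
    first detected access to the new host.
    """
    intervals = []
    prev_host = None
    prev_start = None
    for when, url in hits:
        host = urlsplit(url).hostname
        if host != prev_host:
            # A change of host is treated as access to a new website (assumption).
            if prev_host is not None:
                elapsed = (when - prev_start).total_seconds()
                intervals.append((prev_host, host, elapsed))
            prev_host, prev_start = host, when
    return intervals

example_hits = [
    (datetime(2004, 3, 1, 9, 0, 0), "http://news.example.com/"),
    (datetime(2004, 3, 1, 9, 2, 30), "http://news.example.com/sports"),
    (datetime(2004, 3, 1, 9, 10, 0), "http://shop.example.com/cart"),
]
intervals = site_change_intervals(example_hits)
```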
Although the method described in the '657 patent may be effective for monitoring Internet usage via proxy log analysis, it may nevertheless be problematic. For example, many websites have automated activities, such as automated webpage refreshes, updates, and pop-ups, and these automated activities may occur even when a user is away from the computer. When such automated activities occur, they are also recorded in the proxy logs. As a result, conventional server log analysis, such as that described in the '657 patent, may incorrectly count the automated web activities as active user interactions with the Internet. Furthermore, since an automated activity may be misidentified as access to a new website, or as active access to the previous website, the first and second time intervals determined on that basis may not be accurate. In addition, since the method described in the '657 patent monitors Internet usage on the server side, it may be incapable of analyzing the Internet usage of a particular terminal user.
Therefore, there is a need to determine and indicate the existence of automated activities during a proxy log analysis. The disclosed system and method for analyzing Internet usage are directed towards overcoming one or more of the shortcomings set forth above.