1. Field
This invention relates to methods and systems for collecting, processing, and displaying information related to a web site.
2. Description of the Related Art
With an abundance of web sites on the Internet, it is becoming increasingly difficult to navigate the Internet safely and efficiently. In a practice known as ‘spoofing’ or ‘phishing’, operators of malicious web sites often lure users into visiting those web sites under the pretense of offering genuine information or legitimate business. These web sites may appear, for example, in search results or as links in an e-mail. Typically, a user does not know that a web site is malicious until sometime after visiting it. Often, personal information has already been shared on the malicious web site before the user becomes aware that the web site is malicious. Knowing whether or not a web site can be trusted prior to visiting the web site is therefore valuable in combating these malicious web sites.
Identifying trusted web sites is facilitated by collecting and analyzing user web behavior, or clickstreams, to determine a variety of metrics associated with a web site. By knowing a web site's popularity, historical and present-day, as derived from a clickstream analysis, an indication of trust can be generated for the web site. Other derived metrics are also valuable to the user. For instance, the metrics may include a list of the top ten web sites visited by users after having visited the current web site. The metrics may also include the ranking of the web site with respect to the most visited sites on the Internet.
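By way of a non-limiting illustration, the two metrics named above could be derived from clickstream data roughly as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: each clickstream is assumed to be an ordered list of site names for one user session, "visited after" is assumed to mean the immediately following site, and the function names and the `.example` sites are hypothetical.

```python
from collections import Counter, defaultdict


def next_site_counts(clickstreams):
    """For each site, count the sites visited immediately afterwards."""
    following = defaultdict(Counter)
    for stream in clickstreams:
        for current, nxt in zip(stream, stream[1:]):
            if current != nxt:  # ignore consecutive page views on the same site
                following[current][nxt] += 1
    return following


def top_next_sites(following, site, n=10):
    """Return up to n sites most often visited right after `site`."""
    counts = following.get(site, Counter())
    # Sort by descending count, then by name so ties are deterministic.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    return ordered[:n]


def popularity_rank(clickstreams):
    """Rank sites by total visits across all clickstreams (1 = most visited)."""
    visits = Counter(site for stream in clickstreams for site in stream)
    ordered = sorted(visits, key=lambda s: (-visits[s], s))
    return {site: rank for rank, site in enumerate(ordered, start=1)}


# Hypothetical sample data: three short user sessions.
streams = [
    ["a.example", "b.example", "c.example"],
    ["a.example", "b.example"],
    ["a.example", "c.example"],
]
following = next_site_counts(streams)
top_after_a = top_next_sites(following, "a.example")
ranks = popularity_rank(streams)
```

A site's rank, or its change over time, is the kind of signal from which an indication of trust could then be generated; how that indication is computed is not specified here.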
The derived metrics may also facilitate identifying relevant search results. When a user executes a search, results are generally displayed in a rank order determined by an algorithm. However, these algorithms do not account for post-search activity. For a given keyword search, for example, search results with a high volume of clickstream activity may be deemed more relevant than search results where user dwell time was minimal. By integrating metrics derived from clickstream analysis with a search function, search results can be ordered so that the most relevant results are displayed first.
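As a non-limiting sketch of how post-search activity might be integrated with a search function, the example below blends each result's original position with a normalized dwell time. The blending formula, the `weight` parameter, and the function name `rerank` are assumptions made for illustration only; the text does not prescribe a particular scoring method.

```python
def rerank(results, dwell_seconds, weight=0.5):
    """Reorder search results using mean dwell time as an engagement signal.

    results:       URLs in the order returned by the search algorithm.
    dwell_seconds: mapping of URL to mean user dwell time (assumed input).
    weight:        share of the score given to engagement (0.0 to 1.0).
    """
    n = len(results)
    max_dwell = max((dwell_seconds.get(u, 0.0) for u in results), default=0.0)

    def score(item):
        index, url = item
        base = (n - index) / n  # 1.0 for the first original result
        if max_dwell > 0:
            engagement = dwell_seconds.get(url, 0.0) / max_dwell
        else:
            engagement = 0.0
        return (1 - weight) * base + weight * engagement

    # sorted() is stable, so equal scores keep their original order.
    ranked = sorted(enumerate(results), key=score, reverse=True)
    return [url for _, url in ranked]


# Hypothetical sample data.
original = ["x.example", "y.example", "z.example"]
dwell = {"x.example": 2.0, "y.example": 90.0, "z.example": 30.0}
reordered = rerank(original, dwell)
```

With `weight=0.0` the original order is preserved, so the engagement signal can be introduced gradually.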
The abundance of web sites on the Internet also makes efficiently identifying deals and promotions an arduous task. Some promotions may be obscure, some deals may be outdated, and others may simply not be well-advertised. By querying a data store of deals that can be supplemented by retailers, users, and data store maintainers, a typical set of search results can be annotated with an indication of whether or not a deal is present on a given web site.
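The annotation step described above could, for example, be sketched as a lookup keyed by host name. The structure of the deal data store (a mapping from host name to a list of deal descriptions), the function name, and the sample URLs are hypothetical; an actual data store supplemented by retailers, users, and maintainers may be organized differently.

```python
from urllib.parse import urlparse


def annotate_with_deals(result_urls, deal_store):
    """Attach a deal indicator to each search result.

    deal_store is assumed to map a host name to a list of deal descriptions.
    """
    annotated = []
    for url in result_urls:
        host = urlparse(url).hostname or ""
        deals = deal_store.get(host, [])
        annotated.append(
            {"url": url, "has_deal": bool(deals), "deals": list(deals)}
        )
    return annotated


# Hypothetical sample data.
deal_store = {"shop.example": ["10% off"]}
results = ["https://shop.example/item/1", "https://news.example/"]
annotated = annotate_with_deals(results, deal_store)
```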
Thus, a need exists for a method of alerting users to malicious web sites before the users visit those web sites, and of increasing search efficiency by displaying the most relevant search results first together with any applicable deals associated with a given web site.
Analysis of the internet activity of a web site may be based on web site log files, cookies, and the like, which collect data that may or may not uniquely identify an individual visitor. The information collected may include visits by search engines, bots, spiders, repeat visitors, and the like. Such information, while providing a measure of accesses to the pages of a web site, may not provide useful information about people visiting and engaging with various portions of a web site over a period of time, such as a month. Web logs may not be able to collect enough information about an access to the web site to determine whether the access was from a unique person, a repeat visitor, a new visitor, a bot, a spider, and the like.
To be usefully applied from various perspectives, the raw counts from such logs and the like must be put in context, such as an estimate of overall internet traffic. Also, absent similar information from other web sites, it is impossible for a web site owner to determine how the owner's web site fares compared to those of competitors, and the like. When this information is privately held by each web site, gaining unrestricted access to a competitor's web site statistics is very unlikely, if not impossible. Therefore, making a wealth of internet activity data available in an accurate and timely fashion may be very desirable to web site owners, operators, advertisers, and the like. Methods and systems for collecting, structuring, aligning, analyzing, and presenting accurate estimates of internet activity, such as in the form of site metrics, are needed.