This invention relates to detecting trends in computer network traffic.
Caching is a technique that is used for various purposes in many areas of computing. Typically, cached data is a copy of data that has already been computed or retrieved and that is likely to be reused at some later point in time, and where re-computing the original data or fetching it from its location is more expensive (usually in terms of access time and computer resources) than reading the copy of the data from the cache.
One field in which caching is commonly used to save computer resources and reduce latency is that of search engines, such as the Google™ search engine by Google Inc., Mountain View, Calif. By caching search results, it is possible to conduct an original search for a new query and, whenever the same query is later made by another user, serve a cached copy of the original search results to that user. This can lead to significant savings in computer resource usage, especially when large numbers of queries are processed on a regular basis.
When large numbers of distinct queries are possible, it can be difficult or impractical to cache all the search results. Thus, it is necessary to make a decision about which search results to cache. In one commonly used method, known as the LRU (least recently used) method, the N last distinct events are cached and any older events are discarded, where N is a number that depends on the type of application and on the available cache space. The term “event” is used herein to refer to any type of activity that can be measured and logged on a computer network. Some examples of events include queries, user selections of hyperlinks, and user selections of advertisements. In another commonly used method, known as a top-value scheme, events are cached based on some kind of prior knowledge. For example, the top N queries seen over a set of days can be cached, hoping that these events will remain the top events in the live traffic.
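By way of illustration only, the two conventional schemes described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any particular system: the names LRUCache and top_n_events are hypothetical, events are assumed to be represented as strings, and the top-value scheme is assumed to rank events by their raw count in a previously collected log.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Hypothetical LRU cache: keeps results for the N most recently used distinct events."""

    def __init__(self, capacity):
        self.capacity = capacity          # N, chosen per application and cache space
        self._entries = OrderedDict()     # ordered from oldest to most recent

    def get(self, event):
        """Return the cached result for an event, or None on a cache miss."""
        if event not in self._entries:
            return None
        self._entries.move_to_end(event)  # a hit makes this event the most recent
        return self._entries[event]

    def put(self, event, result):
        """Cache a result and discard the least recently used event if over capacity."""
        self._entries[event] = result
        self._entries.move_to_end(event)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the oldest entry

    def __contains__(self, event):
        return event in self._entries


def top_n_events(logged_events, n):
    """Hypothetical top-value selection: the n most frequent events in a prior log."""
    counts = Counter(logged_events)
    return [event for event, _ in counts.most_common(n)]
```

In this sketch, the LRU cache retains whatever was requested most recently regardless of how often it is requested, while the top-value selection is fixed by the prior log and does not change as live traffic changes.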
As the skilled reader realizes, neither of these methods is sufficiently responsive to short-term trends under certain conditions. Short-term trends are, however, important to consider, as they are often driven by external factors such as the time of day, the date, and current events. For example, during the days preceding and following a space shuttle launch there may be many searches relating to “space shuttles,” “NASA,” “space,” and similar terms. Right around the Martin Luther King Holiday, there may be many searches about “Martin Luther King.” If a celebrity has just been arrested for drunk driving and assaulting a police officer, it is reasonable to expect a significant increase in queries involving the name of that celebrity. Thus, it would be useful to have better methods of detecting short-term trends for the purpose of caching search results to make them more readily available to users.