The proliferation of malicious and/or inappropriate information and content on the Internet has led individuals and organizations to adopt techniques and systems to protect their users. For example, many individuals and organizations (collectively “users” herein) have long employed software and/or hardware to intercept and inhibit unauthorized attempts to access or download malicious and/or inappropriate information and content (“unwanted contents” herein).
A common approach to detecting attempts to access or download unwanted contents is URL filtering. In URL filtering, a rating server receives a URL from a user's client device (such as a browser or some other Internet or network accessing device) and provides rating information regarding that URL. The rating information reflects attributes of the website and/or webpage represented by the URL. Any attribute is possible; attributes include, by way of example and without limitation, news, entertainment, adult, comics, sports, highly rated, poorly rated, newly updated, English, Spanish, etc. The received attributes may subsequently be employed to decide, based on some access policy, whether the user is permitted to access the webpage or website associated with the URL.
To facilitate discussion, FIG. 1 shows a simplified block diagram of a typical prior art URL rating scheme. As shown in FIG. 1, a client 102 sends a URL string “abcwidget.com” (104) to a gateway device 106 to request access to a website 108 via the Internet 110.
Gateway device 106 is shown implementing URL filtering 112. To decide whether the user associated with client 102 is allowed to access website 108 of abcwidget.com, URL filtering 112 may send a rating request (118) for the user-input URL (i.e., “abcwidget.com”) to a URL rating server 114 through the Internet 110. In response, URL rating server 114 sends a rating (120) back to URL filtering 112 to enable URL filtering 112 to perform its filtering function based on some predefined access policy.
If the access policy permits the user of client 102 to access websites and/or web pages having the rating 120, the user can then access website 108 (as shown by reference arrow 124). On the other hand, if the access policy does not permit the user of client 102 to access websites and/or web pages having the rating 120, the request to access website 108 is denied.
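The request, rating, and policy decision described above can be illustrated with a minimal sketch. All names here (`RATINGS`, `request_rating`, `BLOCKED_ATTRIBUTES`, `is_access_permitted`) and the sample ratings are hypothetical and are not taken from FIG. 1; the in-memory dictionary merely stands in for the remote URL rating server, and the policy shown (deny if any blocked attribute is present) is only one possible access policy.

```python
# Hypothetical stand-in for the remote URL rating server (114 in FIG. 1).
# A real deployment would send the rating request over the network.
RATINGS = {
    "abcwidget.com": {"news", "English"},
    "badsite.example": {"adult"},
}

# Hypothetical access policy: deny any URL whose rating has a blocked attribute.
BLOCKED_ATTRIBUTES = {"adult"}


def request_rating(url):
    """Return the set of attributes for a URL (assumed 'unrated' if unknown)."""
    return RATINGS.get(url, {"unrated"})


def is_access_permitted(url, blocked=BLOCKED_ATTRIBUTES):
    """Apply the access policy to the rating returned for the URL."""
    rating = request_rating(url)
    # Access is permitted only if the rating shares no attribute with the block list.
    return rating.isdisjoint(blocked)
```

Under these assumptions, `is_access_permitted("abcwidget.com")` permits the request, while `is_access_permitted("badsite.example")` denies it.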
Given the popularity of Internet usage, if a rating request needs to be serviced by URL rating server 114 each time client 102 wishes to access a webpage, the number of rating requests serviced on a given day may be quite large, necessitating large communication and processing bandwidth on the part of URL rating server 114 (or multiple servers, as may be the case) and/or imposing a large bandwidth load on the network due to the volume of rating requests and rating responses.
In order to reduce the number of rating requests sent through the network and processed by URL rating server 114, gateway device 106 may implement a local URL rating cache 116 for temporarily storing URLs (or their hash values) of previously accessed web pages, along with their corresponding ratings as received from URL rating server 114. Thus, if a web page has been rated once by URL rating server 114, a subsequent access request by client 102 to that same web page would result in a local cache hit, eliminating the need to send the URL to URL rating server 114 again to obtain a rating.
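A local rating cache of this kind might look like the following sketch. The class name `LocalRatingCache`, the `fetch_rating` callback (a placeholder for the round trip to the remote rating server), the choice of SHA-256 for hashing URLs, and the `remote_requests` counter are all assumptions made for illustration; eviction and expiry of "temporarily" stored entries are deliberately omitted.

```python
import hashlib


class LocalRatingCache:
    """Illustrative local URL rating cache keyed by a hash of the URL."""

    def __init__(self, fetch_rating):
        # fetch_rating: hypothetical callable that queries the remote rating server.
        self._fetch_rating = fetch_rating
        self._entries = {}
        self.remote_requests = 0  # counts cache misses that reached the server

    @staticmethod
    def _key(url):
        # Store a hash value rather than the URL itself (one option the text mentions).
        return hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get_rating(self, url):
        key = self._key(url)
        if key in self._entries:
            # Cache hit: no request is sent to the remote server.
            return self._entries[key]
        # Cache miss: ask the remote server once, then remember the answer.
        self.remote_requests += 1
        rating = self._fetch_rating(url)
        self._entries[key] = rating
        return rating
```

With a cache created as `LocalRatingCache(lambda url: "news")`, two consecutive calls to `get_rating("abcwidget.com")` return the same rating while `remote_requests` stays at 1, since the second call is served locally.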
In the following discussion, the local URL rating cache (herein “the local cache”) is assumed to be implemented in a server that is local to the client device. It should be apparent to those skilled in the art, however, that some embodiments of the invention discussed herein also apply to situations wherein the local URL filtering functionality (including obtaining a rating for a URL), and/or the local URL rating cache is implemented in any device that is communicably coupled with the user's client device (such as a router, a gateway device, etc.) or in the user's client device itself.
To further the discussion, as time goes by, the local URL rating cache 116 tends to be filled with URLs (and associated ratings) of websites and web pages that a user has visited. Since most users tend to revisit a limited set of websites over and over, the cache hit rate tends to be quite high for URL rating requests. Thus, after some time, the URL rating response time is greatly improved since many, if not most, of the URLs entered by the user will be serviced by the local URL rating cache 116 instead of requiring a rating response from the remote URL rating server 114.
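The relationship between hit rate and response time can be made explicit with a simple model. The symbols are assumptions introduced here for illustration only: $h$ is the fraction of rating requests served by the local cache, $T_{\text{cache}}$ is the time to look up the local cache, and $T_{\text{remote}}$ is the additional time for a round trip to the remote rating server. Assuming every request checks the local cache first:

```latex
\begin{aligned}
T_{\text{avg}} &= h \, T_{\text{cache}} + (1 - h)\,\bigl(T_{\text{cache}} + T_{\text{remote}}\bigr) \\
               &= T_{\text{cache}} + (1 - h)\, T_{\text{remote}}
\end{aligned}
```

Under this model the average rating response time approaches $T_{\text{cache}}$ as $h$ approaches 1, and the remote round trip is paid in full on every miss.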
However, the improvement in the URL rating response time comes only after the user has visited that webpage at least once and the URL rating for that webpage has been performed at least once, which results in the URL rating data for that webpage being stored in the local URL rating cache. To elaborate, after the URL rating has been performed at least once using the remote URL rating server (such as server 114 in FIG. 1), the URL and the rating response are cached locally to service any subsequent rating request for the same URL. To put it differently, until the user has entered the URL at least once to trigger the URL rating request, the local URL rating cache 116 does not have URL rating data for that URL. As a consequence, the user typically experiences some performance degradation due to a local URL rating cache miss whenever a webpage is visited for the first time and its URL is entered for the first time.
Improving the average system response time to a request to access a web page, even for web pages that the user has not previously visited, is a subject of embodiments of the present invention.