Whilst the Internet represents a huge source of valuable information, much of the content that is available online is inappropriate, malicious, or indeed illegal. The very nature of the Internet means that it is not always easy to track and eliminate such content, and law enforcement agencies find it particularly challenging to identify and remove illegal content. Such content may be malicious, for example malware capable of stealing individuals' bank details, or pornographic. The best way for organisations and individuals to protect themselves against inappropriate content is to install on their computers and/or servers security software which filters outgoing content requests and/or downloaded data to remove unsuitable content.
One such product is the Internet Security product provided by F-Secure® Corporation of Helsinki, Finland. As well as scanning downloaded data for the presence of malware, this product is able to identify potentially unsafe content following a request, initiated by a user or by a client computer, to download content from the Internet. This identification may occur either prior to a web page being downloaded or prior to it being displayed or otherwise processed on the client computer. The approach relies on a website rating database maintained at a central rating server. For each web page, as represented by a Uniform Resource Locator (URL), the database holds a rating indicating the nature of, and the threat posed by, the web page. A rating indicates, for example, whether the content within a web page is suitable for children, is suitable for children but only under adult supervision, or is completely inappropriate for children. The rating may also indicate whether the web page is known or likely to contain malware.
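By way of illustration only, the rating database described above can be pictured as a mapping from URL to rating. The following Python sketch is a minimal model under stated assumptions: the names `Suitability`, `Rating` and `RatingDatabase`, and the in-memory dictionary, are hypothetical and are not taken from the F-Secure product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Suitability(Enum):
    """The three example suitability levels mentioned in the text."""
    CHILD_SAFE = "suitable for children"
    SUPERVISED = "suitable for children under adult supervision"
    ADULT_ONLY = "inappropriate for children"


@dataclass(frozen=True)
class Rating:
    """Hypothetical rating record held for one web page (URL)."""
    suitability: Suitability
    may_contain_malware: bool = False  # known or likely to contain malware


class RatingDatabase:
    """Illustrative in-memory stand-in for the central rating database."""

    def __init__(self) -> None:
        self._ratings: Dict[str, Rating] = {}

    def store(self, url: str, rating: Rating) -> None:
        # Record (or overwrite) the rating held for this URL.
        self._ratings[url] = rating

    def lookup(self, url: str) -> Optional[Rating]:
        # Return the stored rating, or None if the URL has never been rated.
        return self._ratings.get(url)
```

A real rating server would presumably use persistent storage and some form of URL normalisation; both are omitted here for brevity.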
Whenever a web browser (or other application capable of accessing content at a website, for example an email client) sends a request to obtain content from a website, or perhaps sends a DNS look-up request to obtain an IP address for a URL, the request is intercepted by a security application (e.g. a browser plugin), and the URL associated with the request is simultaneously sent to the rating server where the rating database is maintained. The rating server obtains the rating information for the web page (URL) in question, and returns this to the security application at the user's computer. The security application buffers any content received from the website associated with the request until such time as the rating has been received from the rating server. Typically this rating is received before any significant amount of data has been downloaded from the website, such that the downloading and displaying of content is not substantially delayed. Once the rating information is obtained from the rating server, and depending upon the nature of that information, the security application may block (further) downloading of the content or processing (e.g. display) of already downloaded content, and provide a warning to the user. In some cases, the security application may compare the rating information received against parental control settings maintained on the user's computer for the current user. Downloading and processing of content are blocked only if these settings restrict access to content of the type identified by the received rating information. Of course, if a rating indicates that content may contain malware such as a computer virus, downloading and processing are blocked regardless of the parental control settings.
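The blocking decision described above can be sketched as follows. This is a simplified illustration only, assuming that suitability levels are ordered and that the parental control setting is expressed as the highest level the current user may access; the names `ContentLevel`, `SiteRating`, `should_block` and `release_buffered` are hypothetical.

```python
from enum import IntEnum
from typing import List, NamedTuple, Optional


class ContentLevel(IntEnum):
    """Assumed ordering of suitability levels, least to most restricted."""
    CHILD_SAFE = 0
    SUPERVISED = 1
    ADULT_ONLY = 2


class SiteRating(NamedTuple):
    """Hypothetical rating returned by the rating server for one URL."""
    level: ContentLevel
    may_contain_malware: bool


def should_block(rating: SiteRating, max_allowed: ContentLevel) -> bool:
    # Content that may contain malware is blocked regardless of the
    # parental control settings.
    if rating.may_contain_malware:
        return True
    # Otherwise block only if the parental control setting for the
    # current user restricts content of this type.
    return rating.level > max_allowed


def release_buffered(buffered: List[bytes], rating: SiteRating,
                     max_allowed: ContentLevel) -> Optional[bytes]:
    # Content is held in a buffer until the rating arrives; it is then
    # either passed on for processing or discarded (None), in which case
    # the caller would warn the user.
    if should_block(rating, max_allowed):
        return None
    return b"".join(buffered)
```

For example, with a setting of `ContentLevel.SUPERVISED`, a page rated `ADULT_ONLY` would be withheld, whereas a page rated `CHILD_SAFE` and not flagged for malware would be released from the buffer.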
The rating-based approach described above works well for websites having relatively static content, or at least content which does not change greatly in terms of its nature over time. However, the dynamic nature of many websites represents a potential problem when attempting to identify and categorise web content. Due to limited resources, the providers of Internet security services are unable to access (e.g. using web spidering techniques) and re-rate websites on a regular basis. Changes in the content available at a particular website can remain undetected for several months or even longer. Consider for example a registered domain name which, when accessed, presents to a user a “parking” web page, that is, a web page merely indicating that the domain name is registered. When the website is checked and rated by a security service provider, the web page will be rated as benign and suitable for accessing by children. However, the owner of the domain name may subsequently introduce inappropriate or malicious content onto the web page, and the change in the nature of the content will not be reflected in the benign rating given to it by the security service provider. When a user then seeks to download the modified web page, the pre-downloading check performed by the security application installed on the client computer will merely indicate that the content is benign, such that the content will be downloaded and the user and client computer exposed to the changed and inappropriate content. This is not only dangerous from the point of view of the user, but will also reduce the user's trust in the security service. Furthermore, it will result in an increased level of enquiries being directed to the security service provider, increasing the service provider's maintenance costs.