The need for security against malicious material accessible over the Internet is well appreciated. Such material can be placed into two broad categories: adware and malware. Adware exposes an Internet user to the unsolicited advertisement of goods or services. Malware is more sinister in nature and perpetrates fraud, or causes other types of damage, against the Internet user. It is also desirable to protect Internet users against undesirable material such as adult websites.
A malicious website will typically attempt to deceive the visitor into believing that the website is safe. For example, the website might be designed to imitate a legitimate banking website, tricking a visitor into entering sensitive information directly into the website. A malicious website might alternatively or additionally contain links that, when “clicked” on, download spyware or other types of malicious software onto the user's computer. Spyware can be used to deliver a user's personal information to an attacker.
In order to be successful, an operator of a malicious website must create traffic to the website. This may be done, for example, by placing hyperlinks to the malicious website in a seemingly legitimate context so as to trick the user into believing the link is safe. Such a context might for example be a “phishing” email that looks as if it has originated from a legitimate organisation, such as a bank, with the email requesting the recipient to click on a link leading to the malicious site. However, the general public is now more alert to the dangers of phishing attacks and less likely to succumb to them. Anti-virus applications have also developed sophisticated techniques for dealing with phishing attacks, and are more widely implemented. As such, the owners of malicious websites are seeking new ways of driving traffic to their websites.
The growth of websites using the so-called second generation website platform (Web 2.0) provides an opportunity for malicious website operators. A common feature of Web 2.0 websites is that the content of the website, in contrast to conventional websites, is not created by the website operator but by the website users. Technologies that fit into the Web 2.0 category include web blogs, social bookmarking, wikis, podcasts, RSS feeds which automatically feed content from an external website to the target website, and social networking sites. Web 2.0 websites are some of the most popular websites on the World Wide Web. For example, Web 2.0 websites like Wikipedia™, Facebook™, YouTube™ and del.icio.us™ have millions of visitors each day. There is also an enormous amount of content on these websites that changes on a daily basis. Due to the interconnected nature of the content on these websites, complex networks of links to different external websites can be embedded within them.
The content of Web 2.0 based websites can be created in various ways. For example, a website user might have an account with the Web 2.0 website through which he can upload content onto a section of the website, with the section having its own unique web address (URL). The access rights to the content on the website can vary from website to website. Certain websites allow a user to place access restrictions on the user's section, whereas other websites have no access restrictions so that the content is automatically available for anyone to view. Once the content has been made available on the website, it is said to have been “published”. Web 2.0 websites may allow visitors to particular pages to add comments to the pages. Hence further content can be added to a user's section by people who do not necessarily have an “account” with the website. A website may also host content generated by RSS technology, where the content is generated externally to the website and automatically displayed within a webpage. RSS generated content may change in real time.
The reason that malicious parties can use these Web 2.0 websites (and similar websites that host similarly unregulated content) to their advantage is that the public typically views these websites as trustworthy. The public may not be alert to the fact that the content is not created by the website operator itself, or may assume that the operator has somehow regulated third party content. Hence it is possible for malicious parties to upload content that contains links to malicious websites or to adware and to hide behind the goodwill of the website operator.
A means of protecting users from the dangers of malicious links embedded within seemingly innocent Web 2.0 content is for the website operator to implement bespoke security measures at the server(s) on which the website is hosted. For example, a server could scan uploaded content for blacklisted URLs. If a URL is identified as malicious, the link to the URL could be removed from the content before it is published, or the entire submission could be rejected. This approach, of course, relies on the blacklist of malicious URLs being kept up to date.
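Such a server-side scan might be sketched as follows. This is a minimal illustration only, not a definitive implementation: the blacklist entries, the function name and the simple regular expression for locating links are all hypothetical, and a real deployment would use a vendor-maintained, regularly updated blacklist as described above.

```python
import re

# Hypothetical blacklist of known-malicious hosts; in practice this
# would be supplied and kept up to date by an anti-virus vendor.
BLACKLIST = {"evil.example.com", "phish.example.net"}

# Simple pattern capturing the host portion of an http(s) URL.
URL_RE = re.compile(r"https?://([^/\s\"'>]+)[^\s\"'<]*")

def scan_submission(content):
    """Return (clean_content, found_malicious) for a user submission.

    Links whose host appears on the blacklist are stripped from the
    content before publication; alternatively, the caller could use
    the flag to reject the entire submission.
    """
    found_malicious = False

    def strip_if_blacklisted(match):
        nonlocal found_malicious
        host = match.group(1).lower()
        if host in BLACKLIST:
            found_malicious = True
            return "[link removed]"   # remove the malicious link
        return match.group(0)         # leave legitimate links intact

    clean = URL_RE.sub(strip_if_blacklisted, content)
    return clean, found_malicious
```

A submission containing both a blacklisted and a legitimate link would have only the former replaced; the returned flag lets the operator choose between publishing the cleaned content and rejecting the submission outright.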
An anti-virus application vendor may be an appropriate channel for providing the detection software and for providing blacklist updates to website operators. However, as different servers and website operators often use different server programming tools [such as PHP, Common Gateway Interface (CGI), Active Server Pages (ASP), and Server Side Includes (SSI)], and the tools rely upon different programming languages [such as PERL, Python, Ruby, and C++], the detection software would have to be customised for each server. In any case, Web 2.0 websites can experience enormous amounts of Internet traffic, and so it would be very difficult to provide sufficient resources at the servers to scan content in a dynamic manner.