The present invention relates generally to data network content delivery, and more particularly to content filtering using categorized filtering parameters.
Content delivery via data networks is becoming increasingly popular. One such data network, the Internet, has become a popular means for users to access information on various topics of interest. There are numerous content providers disseminating information via the Internet on various subjects. For example, there are content providers providing information on sports, news, finance, science, entertainment, etc. The content providers make this information available to users via websites, and end users access the information using web browsers. So-called “web surfing” of websites using an Internet browser is well known in the art.
In addition to the basic content, many websites also contain extraneous content, often in the form of advertising. From the content provider's perspective, advertising is desirable because it generates revenue for the content provider. Advertisers are willing to pay significant advertising fees to the more popular websites. However, from the end user's perspective, advertising is unnecessary (and often unwanted) information being displayed on the user's computer. In addition to merely being unnecessary or unwanted, advertising content may have deleterious effects on the user's web browsing experience. For example, advertising content often consists of graphics and animation, which consume the user's bandwidth and may slow down the delivery of the desired content. Also, the complexity of some advertising content requires additional processing by the user's web browser, which delays the display of the webpage at the user's computer.
A webpage may contain various types of advertising. One type is an inline advertisement, in which the content provider inserts advertising content into the webpage. Another type is interstitial advertising, in which an advertisement page is shown before the actual requested content page. A user generally must view the interstitial page for a period of time before the requested content is delivered or displayed. Another type is outsourced advertising, in which a webpage has a reference to a third party web server and the user's web browser requests and retrieves the advertisement from the third party web server. Regardless of the type of advertising, users generally prefer to view content without such extraneous content.
There have been various attempts by users to block advertising from websites. One such attempt is the use of a browser plug-in to filter out advertising. A browser plug-in is additional software that may be installed on a computer that adds functionality to the basic browser. For example, the Firefox web browser has an available plug-in called Adblock. The Adblock plug-in allows a user to specify a set of pattern rules, each of which can either be a literal match along with the wildcard “*”, or can be a full regular expression. The uniform resource locators (URLs) of all objects to be retrieved by the browser are compared against these rules and if a match occurs then the object is either not retrieved or not rendered by the browser. While a user of the Adblock plug-in may define his/her own rules for filtering, there has also been developed a set of rules (called Filterset.G) that may be shared among users. This large rule set has been developed by incorporating input from multiple users. The rule set is one large generic set of rules, which may be downloaded and used by users of the Adblock browser plug-in.
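The rule matching described above can be illustrated with a minimal sketch. It is not the Adblock plug-in's actual implementation. The function names `compile_rule` and `is_blocked`, the convention that a rule wrapped in slashes is treated as a regular expression, and the example URLs and patterns are all assumptions made for illustration.

```python
import re

def compile_rule(rule: str) -> re.Pattern:
    """Compile one filter rule into a regular expression.

    Assumed convention (illustrative only): a rule wrapped in slashes,
    such as /ad[0-9]+/, is a full regular expression; any other rule is
    a literal string in which "*" matches any run of characters.
    """
    if len(rule) > 2 and rule.startswith("/") and rule.endswith("/"):
        return re.compile(rule[1:-1])
    # Escape the literal pieces and join them with ".*" for each wildcard.
    literal_parts = [re.escape(part) for part in rule.split("*")]
    return re.compile(".*".join(literal_parts))

def is_blocked(url: str, rules: list[re.Pattern]) -> bool:
    """Return True if the URL matches any rule anywhere in the string."""
    return any(rule.search(url) for rule in rules)

# Hypothetical rule set: one wildcard rule and one regular-expression rule.
rules = [
    compile_rule("ads.example.com/*banner"),
    compile_rule(r"/ad[0-9]+\.gif$/"),
]

for url in (
    "http://ads.example.com/img/banner.png",
    "http://news.example.com/ad12.gif",
    "http://news.example.com/story.html",
):
    print(url, "blocked" if is_blocked(url, rules) else "allowed")
```

In this sketch an object whose URL matches any rule would simply not be requested; a real plug-in might instead retrieve the object but suppress its rendering.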
There are several problems with the use of a filter set as described above. One such problem is the risk of over coverage and resulting false positives. Since it is not possible to create rules that will perfectly filter out unwanted advertising while allowing all desired content to be rendered, there is the danger of certain rules filtering out desired content (i.e., false positives). The more rules in the filter set, the greater the danger of such over coverage and false positives. For example, a rule A, which was developed by a user that often browses news sites, may work fine for filtering content from news sites. However, when a user that browses sports sites uses that same rule, it may incorrectly filter out wanted content, thus resulting in a false positive.
Another problem with the use of a filter set as described above is the time it takes for a user's browser to apply a large set of filter rules. Each rule in the rule set must be applied against the webpage being requested. The application of a large number of rules may significantly delay the time it takes for the browser to retrieve and render the requested webpage. The use of a single large rule set may result in the application of irrelevant rules for a particular user. For example, the rule set may contain numerous rules which are useful in filtering advertisements from sports websites. However, the use of this rule set by a user that never browses sports sites will merely result in additional processing delay, without providing any additional benefit.
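The processing cost described above can be made concrete with a small sketch that counts rule evaluations. The rule patterns, URLs, and the helper `count_rule_checks` are hypothetical and chosen only for illustration; the sketch assumes rules are tried in order and that matching stops at the first rule that matches a given URL.

```python
import re

def count_rule_checks(urls, rules):
    """Apply rules to each URL in order; return (checks performed, blocked URLs)."""
    checks = 0
    blocked = []
    for url in urls:
        for rule in rules:
            checks += 1
            if rule.search(url):
                blocked.append(url)
                break  # first matching rule decides; stop checking this URL
    return checks, blocked

# Hypothetical category-specific rules.
sports_rules = [re.compile(p) for p in
                (r"sportsads/", r"scoreboard-promo", r"ticket-banner")]
news_rules = [re.compile(r"newsads/")]

# Objects requested by a hypothetical news page.
news_page_urls = [
    "http://news.example.com/story.html",
    "http://news.example.com/newsads/top.gif",
]

# One large generic rule set versus only the rules relevant to this user.
generic_checks, generic_blocked = count_rule_checks(
    news_page_urls, sports_rules + news_rules)
news_checks, news_blocked = count_rule_checks(news_page_urls, news_rules)

print(generic_checks, news_checks)
print(generic_blocked == news_blocked)
```

Under these assumptions both rule sets block the same object, but the generic set performs more rule evaluations, all of the extra ones against sports rules that cannot match a news page.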
Thus, while the use of a browser plug-in to filter out advertising is beneficial in many respects, the use of a large generic rule filter set often results in over coverage and processing delays for the user.