Large amounts of web content data and large numbers of websites are added to the Internet every day. As users become more selective about the content that they access, Internet security, the rapid proliferation of malicious Internet sites, and the relatively short lifetime of many web pages continue to present challenges to a pleasant user experience. Website classification has been used to filter undesired websites, or at least to present the user with a notification of the classification of a particular website that the user is about to access.
Website classification continues to be challenging. Limited or non-existent data within a website hinders programmatic classification methods that rely on web content data to classify the website. Furthermore, websites with large amounts of web content data require processing time beyond acceptable levels.