The Internet is a vast collection of resources from around the world with no "central" or main database. Instead, it is a collection of thousands of computers, each with its own individual properties and content, linked to a network which is in turn linked to other networks. Many of these computers have documents written in the Hypertext Markup Language ("HTML") that are publicly viewable. These HTML documents that are available for public use on the Internet are commonly referred to as "Web Pages". All of the computers that host web pages comprise what is known today as the World Wide Web ("WWW").
The WWW comprises an extremely large number of web pages, and that number is growing exponentially. A naming convention known as a Uniform Resource Locator ("URL") is used to designate every web page on the Internet. Web pages are typically assigned to the subclass known as the Hypertext Transfer Protocol ("http"), while other subclasses exist for file servers, information servers, and other machines present on the Internet. URLs are an important part of the Internet in that they are responsible for locating a web page and, hence, for locating desired information. "Linking" is another method of providing URLs to an Internet user. When the user accesses any given URL, other "links" to further URLs may be present on the web page. This expanding directory structure is seemingly infinite and can result in a single user, seeking one web page, compiling a list of hundreds of new web pages that were previously unknown.
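By way of illustration only, the following Python sketch shows how the links on a web page expand into further URLs, and how the subclass (scheme) of each URL can be read from the URL itself. The names `LinkCollector` and `extract_links`, the sample page, and the example.com and example.org addresses are hypothetical and are used here only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the target URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the URL of the page itself.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the absolute URLs linked from the given HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page containing one relative link and one link to a file server.
page = (
    '<p><a href="/news.html">News</a> '
    '<a href="ftp://files.example.org/pub/">Files</a></p>'
)
links = extract_links("http://www.example.com/index.html", page)
schemes = [urlparse(link).scheme for link in links]
```

Applying `extract_links` again to each page returned would yield still more URLs, which is how a search for a single page can grow into a list of hundreds.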
Large amounts of information are available on the WWW and are easily accessible by anyone who has Internet access. In many situations it is desirable to limit the amount and type of information that certain individuals are permitted to retrieve. For example, in an educational setting it may be undesirable for the students to view pornographic or violent content while using the WWW.
Until now, schools have either ignored inappropriate material available on the Internet or attempted to filter it with software originally designed for home use on a single computer, while others have tried to convert their filtering products to proxy servers so that they may filter entire networks. "Yes Lists" and "Content Filtering" are other industry methods that have found use in this area, albeit with less success. Conventional "filtering" has several inherent flaws, despite the fact that it is considered the best alternative for managing inappropriate sites. If a filter list is broad enough to ensure complete safety for its users, unthreatening material is inevitably filtered along with material considered to be inappropriate. This leads to a reduction in the versatility of the Internet and the possibility of censorship accusations. On the other hand, if the filter list is too narrow, inappropriate material is more likely to pass through to the users. In addition, the filter vendor is in control of defining the filter list. This results in the moral and ethical standards of the vendor being imposed upon the user. These problems are compounded by the speed at which inappropriate sites appear on the Internet and by the tendency of Internet search engines to present newer web sites first: the sites least likely to be in a filter list tend to be the most likely to appear at the top of search results.
A "Yes List" is the safest method of protecting students on the Internet. However, it is the most expensive to administer, and, being the most restrictive, it dramatically reduces the benefits of the Internet in an educational setting. "Yes Lists" require the teachers to research the Internet for materials they wish students to have access to, then submit the list of suitable materials to an administrator. The administrator then unblocks these sites for student access, leaving all non-approved sites fully blocked and inaccessible.
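The difference between a conventional filter list and a "Yes List" may be illustrated by the following Python sketch, which is a simplified assumption and not a description of any particular product. The function names, host names, and list contents are hypothetical.

```python
from urllib.parse import urlparse


def host_of(url):
    """Return the lower-cased host name of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def allowed_by_filter_list(url, blocked_hosts):
    # Filter list: everything is permitted unless it has been listed as blocked.
    return host_of(url) not in blocked_hosts


def allowed_by_yes_list(url, approved_hosts):
    # "Yes List": everything is blocked unless it has been explicitly approved.
    return host_of(url) in approved_hosts


# Hypothetical lists maintained by a vendor and by an administrator, respectively.
blocked = {"bad.example.com"}
approved = {"encyclopedia.example.org"}

# A site that appeared after both lists were last updated.
new_site = "http://brand-new.example.net/page.html"

passes_filter_list = allowed_by_filter_list(new_site, blocked)
passes_yes_list = allowed_by_yes_list(new_site, approved)
```

In this sketch the new site passes the filter list because it has not yet been listed, whatever its content, while the "Yes List" blocks it until a teacher and an administrator have reviewed and approved it.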
The final method of managing inappropriate material is "Content Filtering". This involves scanning the actual materials (not the URL) inbound to a network from the Internet. Word lists and phrase pattern matching techniques are used to determine whether the material is inappropriate. This process requires a great deal of computer processor time and power, which slows down Internet access and makes this a very expensive alternative. Furthermore, it is easily defeated by pictures, Java, or any other method of presenting words or content without using actual text.
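A minimal Python sketch of such word-list and phrase-pattern scanning follows. The word list, the phrase pattern, and the name `is_inappropriate` are hypothetical examples chosen for illustration; an actual content filter would be considerably more elaborate.

```python
import re

# Hypothetical word list and phrase patterns, for illustration only.
BLOCKED_WORDS = {"violence", "gore"}
BLOCKED_PHRASES = [re.compile(r"\badults?\s+only\b", re.IGNORECASE)]


def is_inappropriate(text):
    """Return True if the text contains a blocked word or matches a blocked phrase."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if words & BLOCKED_WORDS:
        return True
    return any(pattern.search(text) for pattern in BLOCKED_PHRASES)


flagged_text = is_inappropriate("This page is for ADULTS only.")
clean_text = is_inappropriate("A history of the printing press.")
# The same words rendered inside an image never reach the scanner as text.
image_only = is_inappropriate('<img src="banner.gif">')
```

Every inbound page must be scanned in this manner, which accounts for the processing cost noted above, and a page that presents its words as an image is not flagged because the scanner never sees them as text.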
These and other drawbacks exist.