This invention relates to Internet content filtering and more particularly to the filtering of web content stored on a local intranet proxy cache server.
Access by home and business computers to large, unrestricted sources of information, such as those available through the World Wide Web (the "Web") domain on the well-known Internet, has increased exponentially in recent years. In many computing environments, it is desirable to restrict access by selected users to certain types of information on the Internet and other networks. For example, educational institutions and parents may wish to allow children access only to educational content. Businesses may, likewise, wish to restrict content accessible over their networks, based upon legal, moral and productivity concerns. Computers joined in Local Area Networks (LANs) frequently employ a network architecture that includes a proxy cache server to store and dispense Internet content. In a common proxy cache arrangement, a network user (a client), typically represented by a stand-alone microcomputer with an appropriate network interface, requests download of Internet web site content by entering the site's Uniform Resource Locator (URL) address into a web browser application resident on the client computer. The request is then transferred to a proxy cache server within the local network that may or may not already contain a current copy of the desired web content. If the content is present in the cache, the proxy cache server, rather than placing a call over the Internet to the remote site, instead transmits the requested web content to the client from the local network storage.
FIG. 1 illustrates a generalized architecture for a local network that includes a proxy cache server. The illustrated network is described more particularly in related U.S. patent application Ser. No. 08/905,150, entitled User Name Authentication for Gateway Clients Accessing A Proxy Cache Server. By way of background, further teachings related to a proxy cache server environment are also disclosed in U.S. patent application Ser. No. 09/023,895, entitled Client Inherited Functionally Derived From a Proxy Topology Where Each Proxy is Independently Configured; U.S. patent application Ser. No. 09/195,982, entitled Proxy Cache Cluster; and U.S. Provisional Patent Application Serial No. 60/128,829, entitled Object Cache Store, all of which are assigned to Novell, Inc. of Provo, Utah, and the teaching of each of the aforesaid patent applications being expressly incorporated herein by reference.
Particularly, FIG. 1 illustrates an architecture-level block diagram of a local area network having a proxy cache server and associated applications. The network 20 includes a plurality of clients shown generally by the exemplary client block 22. Each client can comprise a stand-alone microcomputer having a central processing unit (CPU) 24, a memory 26 and a network adapter 28 for communication, all linked by a bus 30. Each client is linked with its own user interface 32 that allows data to be viewed and instructions to be transmitted. The user interface typically includes a keyboard, monitor and a screen-cursor manipulator, such as a mouse. The client is linked to a local network or intranet 34. Packets of data can be transferred over the intranet using the well-known Internet Protocol (IP), Novell's proprietary protocol, IPX, or other common protocols.
The intranet 34 is, likewise, linked with a Novell Directory Services (NDS) server 36, which operates in the commercially available Novell NetWare network operating system environment and other commercially available network operating systems. This server includes its own CPU 38, memory 40 and network adapter 42, linked by a bus 44 to the intranet 34. An associated NDS data storage device, disk 46, is also linked to the server 36. The NDS server 36 and storage device 46 store and distribute data related to client user names. Using proprietary or open standard-based data calls, the clients each poll the NDS server for the unique NDS user name. The NDS user name is used for further communication by the client once it is received over the intranet. A proxy cache server 50 is also provided, linked to the intranet by an appropriate bus. The proxy cache server also contains a CPU 52, memory 54 and network adapter 56. The proxy cache server, in this example, is linked by network link 60 to the well-known Internet communication network 62. A large number of nodes and routers enable transfer of TCP/IP formatted data packets to and from various remote sites. One such remote site consisting of a web server 64 is illustrated. The web server 64 includes its own associated data storage device such as the disk 66. In essence, the proxy cache server 50 acts as a "firewall" between the external Internet 62 and the intranet 34. Requests for web site information are first routed from clients through the intranet 34 to the proxy cache server 50. If the client is authorized to request information from a particular web site, then the information is retrieved from the memory 54 (if such information is already cached in the memory) or it is, at that time, retrieved from the remote web site for transfer to the client.
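By way of illustration only, the request path just described (authorization check, then the cache, then the remote web site) can be sketched as follows. This is a minimal sketch and not the implementation of the referenced applications: the class name `ProxyCache`, the `fetch_remote` callable and the per-user set of permitted URLs are all hypothetical, and a real proxy cache server would also handle content freshness and expiry.

```python
from typing import Callable, Dict, Set


class ProxyCache:
    """Toy proxy cache: serves cached content, else fetches from the remote site."""

    def __init__(self,
                 fetch_remote: Callable[[str], bytes],
                 authorized: Dict[str, Set[str]]) -> None:
        # fetch_remote is a hypothetical stand-in for a request over the Internet.
        self._fetch_remote = fetch_remote
        # Hypothetical policy store: user name -> set of URLs that user may request.
        self._authorized = authorized
        # In-memory content cache keyed by URL (stands in for memory 54).
        self._cache: Dict[str, bytes] = {}

    def request(self, user: str, url: str) -> bytes:
        # Step 1: refuse the request if the user is not authorized for this URL.
        if url not in self._authorized.get(user, set()):
            raise PermissionError(f"{user} may not request {url}")
        # Step 2: serve from the local cache when the content is already present.
        if url in self._cache:
            return self._cache[url]
        # Step 3: otherwise retrieve from the remote site and cache the result.
        body = self._fetch_remote(url)
        self._cache[url] = body
        return body
```

In this sketch a second request for the same URL is answered from the local cache without contacting the remote site.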
As suggested above, it is desirable that advanced filtering techniques be employed to further ensure that the particular client can access only authorized information from the web. In the past this has generally entailed the physical scanning and blocking of selected web content, often on a URL-by-URL basis, by the system administrator. The recent rise of independent ratings services that rate the content of a very large number of Internet sites affords an opportunity to automate the filtering function further, and to place it into the province of specialists in the field. Often, however, these services are not readily adaptable to a given network environment and employ a variety of different rating criteria and content categories. It is, therefore, an object of this invention to provide filtering that is readily adaptable to a proxy cache server environment and that enables a variety of different filtering services and databases to be employed with relative ease.
This invention overcomes the disadvantages of the prior art by providing a filter that selectively enables or blocks access by a client in a local network to requested web information, based upon content rating information stored in connection with a large number of known web sites. Such ratings can be stored based upon the site's URL address. When ratings are obtained, they can be applied based upon predefined user policies stored in association with the storage bank and authentication mechanism (such as NDS). Content can be stored in the local network in connection with a proxy cache server application.
The ratings can comprise a set of categories and sub-categories into which certain content falls based upon a ratings service's subjective criteria. Each ratings list is cached in whole or part in the local network in a ratings cache. The list is updated by action of either the ratings service or the filter, and new lists can be transmitted over the Internet, or another network, from a remote vendor site. Each vendor may provide a software module (a NetWare Loadable Module in this embodiment) to implement the manipulation of the provided ratings list by the filter. The module may include update procedures, interpretations and translations of proprietary ratings structures and types of content rated. The filter can be configured to vend requested content, block requested content, or monitor requested content (e.g. vend content, but make a log-file entry noting the type of content vended and to whom). The vend/block/monitor decision is based upon a variety of criteria including override lists, always-acceptable allow lists and always-blocked block page lists, typically dependent on the specific categories associated generally across all URLs, but also upon specific underlying URLs that may or may not be allowable.
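As a hedged illustration of the vend/block/monitor decision, the sketch below combines per-URL allow and block lists with a per-category policy. The precedence used here (allow list first, then block list, then the most restrictive category action) is an assumption made for the example; the text above does not fix an order. The names `Action`, `decide` and `category_policy` are likewise hypothetical.

```python
from enum import Enum
from typing import Dict, Iterable, Set


class Action(Enum):
    VEND = "vend"        # deliver the content
    BLOCK = "block"      # refuse the content
    MONITOR = "monitor"  # deliver the content, but log what was vended and to whom


def decide(url: str,
           categories: Iterable[str],
           allow_list: Set[str],
           block_list: Set[str],
           category_policy: Dict[str, Action],
           default: Action = Action.VEND) -> Action:
    """Return the action for one request (precedence is assumed, see lead-in)."""
    # URL-specific lists override any category-based rating.
    if url in allow_list:
        return Action.VEND
    if url in block_list:
        return Action.BLOCK
    # Collect the policy action for each rated category that has a policy entry.
    actions = {category_policy[c] for c in categories if c in category_policy}
    # The most restrictive category action wins: block, then monitor, then vend.
    if Action.BLOCK in actions:
        return Action.BLOCK
    if Action.MONITOR in actions:
        return Action.MONITOR
    # No restrictive action applies: vend if a policy matched, else use the default.
    return Action.VEND if actions else default
```

For example, a URL rated in a category whose policy is `Action.MONITOR` is vended with a log entry unless the URL itself appears on the allow list or the block list.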
A significant advantage of the system and method according to a preferred embodiment of this invention is increased speed resulting from efficient look-up of content ratings, upon which allow, block and vend-but-monitor/warn decisions are based. In the proxy server environment, content ratings are looked up in the content host cache first. If ratings are not found in the host cache, or are inconclusive, then the object cache is checked. Then the rating cache is checked, and the rating, if found, is placed in the host or object cache for speedier look-up the next time. If a rating is not found in one of the caches, then it is sought from one or more rating service providers over the Internet or at another remote location. If found, it is returned and stored for future use. Late rating service providers that scan content for key words and phrases are also used at this time to provide ratings.
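The look-up order described above (host cache, object cache, rating cache, then remote rating service providers) can be sketched as a simple cascade. This is an illustrative sketch under stated assumptions, not the claimed implementation: the host cache is assumed to be keyed by host name and the other caches by full URL, a found rating is promoted into the object cache only, inconclusive ratings are not modeled, and each provider is a hypothetical callable that returns a rating or `None`.

```python
from typing import Callable, Dict, Iterable, Optional
from urllib.parse import urlsplit

Rating = str  # assumed representation of a rating, e.g. a category name
Provider = Callable[[str], Optional[Rating]]  # hypothetical remote rating service


class RatingLookup:
    """Cascading rating look-up: fastest cache first, remote providers last."""

    def __init__(self, providers: Iterable[Provider]) -> None:
        self.host_cache: Dict[str, Rating] = {}    # keyed by host name (assumption)
        self.object_cache: Dict[str, Rating] = {}  # keyed by full URL (assumption)
        self.rating_cache: Dict[str, Rating] = {}  # locally cached ratings lists
        self.providers = list(providers)

    def lookup(self, url: str) -> Optional[Rating]:
        host = urlsplit(url).hostname or ""
        # 1. Host cache: a rating that applies to the whole site.
        rating = self.host_cache.get(host)
        if rating is not None:
            return rating
        # 2. Object cache: a rating stored for this specific URL.
        rating = self.object_cache.get(url)
        if rating is not None:
            return rating
        # 3. Rating cache: on a hit, promote the rating for a faster next look-up.
        rating = self.rating_cache.get(url)
        if rating is not None:
            self.object_cache[url] = rating
            return rating
        # 4. Remote providers: ask each in turn and store the first rating found.
        for provider in self.providers:
            rating = provider(url)
            if rating is not None:
                self.rating_cache[url] = rating
                self.object_cache[url] = rating
                return rating
        return None  # unrated; the caller applies its default policy
```

Once a remote provider has supplied a rating, a repeated look-up for the same URL is answered from the object cache without a further remote request.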