In a networked data communications system, users have access to terminals which are capable of requesting and receiving information from local or remote information sources. In such a system the terminal may be a personal computer (PC), a cellular phone, a mobile data terminal, a radio modem, a portable computer, a personal digital assistant (PDA), a pager, or any other similar device. The capability of the terminal to request and receive information may be provided by an application program or other such mechanism. A terminal provided with these capabilities is referred to as a browser.
In such a system the information source may be a server (e.g., a host computer) coupled to a mass information storage device (e.g., a hard drive disk pack). The exchange of information (i.e., the request and receipt of information) between the terminal and information source is facilitated by a connection referred to as a communication channel. The communication channel may be physically realized via a wire (e.g., a telephone line), a radio signal (e.g., a radio frequency (RF) channel), a fiber optic cable, a microwave link, a satellite link or any other such medium or combination thereof connected to a network infrastructure. The infrastructure may be a telephone switch, a base station, a bridge, a router, or any other such specialized component, and facilitates the connection between the browser and the network. Collectively, the interconnected group of terminals, physical connections, infrastructure and information sources is referred to as a network.
The network itself may take a variety of forms. It may be located within a small, local geographic area, such as an office building, and consist of only a limited number of terminals and information sources. This type of network is commonly referred to as a Local Area Network (LAN). On a broader scale, it may be larger and support more users over a wider geographic area, such as across a city or state. This type of network is commonly referred to as a Wide Area Network (WAN). On an even broader scale, LANs and WANs may be interconnected across a country or globally. An example of a globally connected public data communications network is the Internet.
To a user the Internet appears to be a single unified network, although in reality it consists of hundreds of different types of computer platforms utilizing many diverse data communications technologies. The technologies are connected together in such a manner that they appear transparent to the user. This transparency is made possible through the use of a standard communications protocol suite known as Transmission Control Protocol/Internet Protocol (TCP/IP).
Recently, Hypertext Markup Language (HTML) and Hypertext Transfer Protocol (HTTP) in particular have been developed to make the World Wide Web (the Web) very accessible. The exchange of information on the Web is further facilitated through hypertext documents. Hypertext documents are unique in that they use tags to define links (i.e., highlighted or underscored words or phrases) which, when selected, fetch the related information from within the same document or from a new document altogether. The links are defined using HTML, which provides a document formatting method that adapts in a consistent manner to any computer on which it is displayed. HTML tags are used to define the various components of an ASCII text file which make up a hypertext document, including such things as formatting and linking to other documents. Tags which link documents on one Web information source to those on another do so by associating a Uniform Resource Locator (URL) with the referenced information. The ability to link Web files of similar and/or differing formats to each other, and to link documents on other Internet sites, is a very powerful feature of the Web.
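By way of illustration only, the association between a link tag and a URL may be sketched as follows. The sketch uses the Python standard library to collect the URLs referenced by the anchor tags of a hypertext document. The names `LinkExtractor`, `extract_links` and `SAMPLE`, as well as the example host names, are assumptions introduced for this illustration and do not appear in any referenced system.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the URL associated with each <a href="..."> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL of the document being parsed
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags define hypertext links in this sketch.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve links within the same document (e.g. "#intro")
                # and links to other documents against the base URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs referenced by the links in `html`."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical document with one link into the same document and one
# link to a document on another information source.
SAMPLE = (
    '<p>See the <a href="#intro">introduction</a> or '
    '<a href="http://other.example/doc.html">another document</a>.</p>'
)

if __name__ == "__main__":
    for url in extract_links(SAMPLE, "http://host.example/index.html"):
        print(url)
```

In this sketch the first link resolves to a location within the same document, while the second refers to a new document on a different information source.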
The development of sophisticated browsers specifically for the Web (i.e., browsers which utilize HTTP to request and receive HTML documents) has also helped to further increase its use and popularity. Standard web browsers, such as Mosaic™ or Netscape™, adhere to standard HTML and HTTP protocols and conventions.
The appeal of the Internet is the large-scale interconnection of public and private networks. A concern exists, however, about unauthorized access from public networks to the attached private networks. This concern has resulted in the development of proxies. A proxy is a host computer or mechanism (usually an application program) on a network node which performs specialized functions on a network. One such function is to provide network security. Security is provided between a private and public network by requiring communications (i.e., information exchanges) to pass through the proxy. Another function of a proxy is to store or cache recently accessed information (i.e., copies of documents and images). If a browser desires information which is located outside the local network, that is to say, on an information source attached to an external network, communications pass from the browser through the proxy before going on to the external network.
Thus a proxy may operate to deny access to a private network from a public network by not replying to HTTP commands received from the public network.
A proxy may also operate to deny access to specific Web sites, for example sites potentially offering undesirable information. This is achieved by maintaining a list of URLs at the proxy to which access is to be denied. HTTP commands which contain these URLs are not executed by the proxy and are responded to with a predefined message. Denial may also be achieved by identifying a particular string in an HTTP command and sending the predefined message if such a string is identified.
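The proxy functions described above (caching recently accessed information, denying access by a list of URLs, and denying access by string matching) may be sketched as follows. This is a minimal illustration only and not an implementation of any particular proxy. The class `FilteringProxy`, the constant `DENIED_MESSAGE`, and the `fetch` callable standing in for retrieval from the external network are all assumptions made for the sketch. For simplicity, the string check is applied to the requested URL rather than to the full HTTP command.

```python
from dataclasses import dataclass, field

# Hypothetical predefined message returned when access is denied.
DENIED_MESSAGE = "Access to the requested resource is denied."


@dataclass
class FilteringProxy:
    denied_urls: set = field(default_factory=set)      # list of URLs to deny
    denied_strings: list = field(default_factory=list)  # strings to deny
    cache: dict = field(default_factory=dict)           # url -> stored copy

    def is_denied(self, url):
        """True if the URL is listed or contains a denied string."""
        if url in self.denied_urls:
            return True
        return any(s in url for s in self.denied_strings)

    def handle(self, url, fetch):
        """Process a browser request for `url`.

        `fetch` is an assumed callable that retrieves the information
        from the external network; it is invoked only on a cache miss.
        """
        if self.is_denied(url):
            # The command is not executed; the predefined message is sent.
            return DENIED_MESSAGE
        if url not in self.cache:
            self.cache[url] = fetch(url)
        return self.cache[url]


if __name__ == "__main__":
    proxy = FilteringProxy(
        denied_urls={"http://blocked.example/"},
        denied_strings=["casino"],
    )
    print(proxy.handle("http://blocked.example/", lambda u: "unreachable"))
    print(proxy.handle("http://ok.example/", lambda u: "body of " + u))
```

A repeated request for a permitted URL is answered from the stored copy, so the external network is contacted only once for that URL in this sketch.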
While proxies address security problems, there are other problems which need to be addressed, such as those exacerbated by a low-bandwidth connection to the browser, or access to undesirable information.
There is a need for an improved method of accessing and retrieving information in a networked data communications system. There is also a need for an improved proxy for accessing and retrieving information between a networked data communications system and a browser.