1. Field of the Invention
The present invention is generally related to improving access to information resources on networked information server systems and, more particularly, to server systems and processes for efficiently addressing external servers, and hyperlink references, for purposes of improving access to information resources.
2. Description of Related Art
The recent substantial growth and use of the internationally connected computer network, generally known as the Internet, has largely been due to widespread support of the hypertext transfer protocol (HTTP). This protocol permits client systems connected through Internet Service Providers (ISPs) to access independent and geographically scattered server systems also connected to the Internet. Client-side browsers, such as Netscape Mozilla and Navigator (Netscape Communications Corp.) and NCSA Mosaic, provide efficient client applications based on a graphical user interface, which implement the client-side portion of HTTP.
The distributed system of communication and information transfer made possible by HTTP is commonly known as the World Wide Web (WWW, Web). Internet Service Providers (“ISPs”) provide Internet access to customers, and their network access equipment is networked between the client computer running the browser and the Web server providing the Point of Presence (“POP”) for the user. A subscriber to an online service typically accesses the service using special client-based communication software. This software establishes and manages a connection from the subscriber's computer to the service provider's host computers (usually a group of servers linked to a LAN or WAN) and facilitates the subscriber's interactions with the service.
The Web utilizes the HTTP client/server protocol, which is a request-response protocol. HTTP transactions include four stages: connection, request, response, and disconnection. In the connection stage, the client attempts to open a network connection to the server. Unless otherwise specified, HTTP attempts to use port 80 on the server for this connection. Establishing a connection involves one round-trip time from the client to the server, as the client requests to open a network connection and the server responds that a network connection has been opened.
After a network connection is open, the client may send an HTTP request to the server in the request stage. The request stage involves one half of a round-trip time as the request goes from the client to the server. Once the server receives the request, the server responds by sending a response to the client in the response stage. As with the request, the response stage involves one half of a round-trip time as the response goes from the server to the client. The disconnection stage closes the network connection to the server. This stage involves one half of a round-trip time and may occur in many different ways.
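The per-stage costs described above can be tallied in a short sketch. The stage costs come from the preceding description; the 100 ms round-trip time used in the usage line is an assumed value for illustration only, and server processing and data transfer time are not modeled.

```python
# Cost of each HTTP transaction stage, in units of one round-trip time (RTT),
# as stated in the description above.
STAGE_COST_RTT = {
    "connection": 1.0,      # client asks to open, server confirms
    "request": 0.5,         # request travels client -> server
    "response": 0.5,        # response travels server -> client
    "disconnection": 0.5,   # connection is closed
}


def transaction_latency(rtt_seconds: float) -> float:
    """Return the network latency of one full HTTP transaction.

    Only the round-trip accounting is modeled; server processing time
    and payload transfer time are deliberately left out.
    """
    return sum(STAGE_COST_RTT.values()) * rtt_seconds


total_rtts = sum(STAGE_COST_RTT.values())   # round trips per transaction
example_latency = transaction_latency(0.1)  # assumed RTT of 100 ms
print(total_rtts, example_latency)
```

Under this accounting, a single transaction costs two and a half round trips before any processing or transfer time is counted.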
The World Wide Web makes hypertext documents available to users over the Internet. A hypertext document does not present information linearly, as a book does, but instead provides the reader with links or pointers to other locations so that the user may jump from one location to another. The hypertext documents on the Web are accessed through the client/server protocol of the Hypertext Transfer Protocol (HTTP).
These hyper-links often appear in the browser as a graphical icon or as colored, underlined text. A hyper-link contains a link to another Web page. Using a mouse to click on the hyper-link initiates a process which locates and retrieves the linked Web page, regardless of the physical location of that page. Hovering a mouse over a hyper-link or clicking on the link often displays, in a corner of the browser, a locator for the linked Web page. This locator is known as a Uniform Resource Locator, or URL.
The URL is used for accessing resources on the Internet, such as hypertext mark up language (HTML) documents, images, sound files, database search engines, etc. The URL identifies a domain, a host within that domain, and sometimes a resource or file within a directory structure on the host computer. Domains can be thought of as a group of computers, such as all computers on a company's network. For example, the domain “ibm.com” identifies a domain for the commercial company IBM, which may include thousands of individual computers. Typically the URL identifies only those computers which are servers on the Web by prefixing the domain with a host name. Thus the URL “http://www.ibm.com” identifies an individual host computer(s) within the ibm.com domain which operates as a Web server for IBM, and can process the request embodied in the URL. The word HTTP tells the host to use the hyper-text transfer protocol while delivering files over the Internet. The files to be delivered can be provided from resources such as database queries or execution of scripts by the host, as well as traditional data files. There are other protocols that can be used on the Web, such as File Transfer Protocol (FTP).
From a client side user interface perspective, a system of uniform resource locators (URLs) is used to direct the operation of a Web browser in establishing transactional communication sessions with designated Web server computer systems. In general, each URL is of the basic form
http://<server_name>.<sub_domain.top-level_domain>/<path>
The server name is typically “www” and the sub_domain.top-level_domain is a standard Internet domain reference. The path is an optional additional URL qualifier, which defines the location of the resources and usually contains a directory structure that leads to a particular file. The URL scheme of the Internet is flexible and adaptable, based on well-known conventions, so that application servers receive sufficient information to process a client request.
A user's selection of a URL on the client side results in a transaction being established in which the client sends the server an HTTP message referencing a default or explicitly named data file constructed in accordance with the hypertext mark up language. This data file or Web page is returned in one or more response phase HTTP messages by the server, generally for display by the client browser. Additional embedded image references may be identified in the returned Web page, resulting in the client browser initiating subsequent HTTP transactions to retrieve the embedded files, typically graphics files. A fully reconstructed Web page image is then presented by the browser through the browser's graphical user interface.
A Web server site may contain thousands of individual Web pages. The location of the file or resource containing a desired page is identified by appending a directory-path file name to the host and domain names in the basic URL to form a complete URL. Thus the URL “http://www.ibm.com/dira/dirb/dirc/intro.html” identifies a hyper-text markup-language file called “intro.html” which resides on a host named “www” within the ibm.com domain. The file resides in the dira directory, in its dirb/dirc subdirectory. Often this HTML file contains references to other files which are loaded automatically by the client's browser.
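The decomposition of the complete URL described above can be sketched with the Python standard library. The helper name split_url is hypothetical, and treating the first dot-separated label as the host name is a simplification that matches the "www" convention described here rather than a general rule.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Break a URL into the parts named in the description above."""
    parts = urlsplit(url)
    # Simplification: the first label is taken as the host name and the
    # remainder as the domain, following the "www" convention in the text.
    host, _, domain = parts.hostname.partition(".")
    path = PurePosixPath(parts.path)
    return {
        "scheme": parts.scheme,
        "host": host,
        "domain": domain,
        "directory": str(path.parent),
        "file": path.name,
    }


info = split_url("http://www.ibm.com/dira/dirb/dirc/intro.html")
print(info)
```

For the example URL, the sketch reports the host "www", the domain "ibm.com", the directory "/dira/dirb/dirc", and the file "intro.html".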
While the URL is used to locate a file on a host within a domain, it conventionally does not contain a physical address for the host computer. Addresses of computer machines on the Internet are specified using a 32-bit numeric identifier known as the Internet-Protocol (IP) address, assigned to each computer so that no two machines have the same IP address. The IP address is often written as four decimal numbers separated by periods. Each decimal number represents an 8-bit binary number, from zero to 255 in decimal notation. Thus a computer in IBM's domain might have the IP address 209.180.55.2 while another computer in that domain might have the address 209.180.55.103.
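The relationship between the dotted-decimal notation and the underlying 32-bit identifier can be illustrated as follows. The function name dotted_to_int is hypothetical; the standard ipaddress module is used only as a cross-check.

```python
import ipaddress


def dotted_to_int(dotted: str) -> int:
    """Pack four decimal octets (each 0..255) into one 32-bit integer."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a dotted-decimal IPv4 address: {dotted!r}")
    value = 0
    for octet in octets:
        # Shift the accumulated value left by 8 bits and append the octet.
        value = (value << 8) | octet
    return value


first = dotted_to_int("209.180.55.2")
second = dotted_to_int("209.180.55.103")

# Cross-check against the standard library's own conversion.
reference = int(ipaddress.IPv4Address("209.180.55.2"))
print(first, second, first == reference)
```

The two example addresses differ only in their last octet, so their 32-bit values differ by 101.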
FIG. 1 illustrates a block diagram of a typical server system, having multiple computers networked over the Internet 12, with a series of high-speed communications links, which may be located between educational, research and commercial computer sites. The Internet computers utilize the Transmission Control Protocol/Internet Protocol (TCP/IP) as the communications protocol, which can network very diverse and dissimilar systems.
The server system contains a client computer 10, running a browser, and servers, such as server 14. Server 14 contains its own data storage device 20 with copies of data files, including the requested file. Typically, the browser on client computer 10 initiates a communication session with the remote server 14 when the user selects a URL, perhaps by mouse-clicking on a hyper-link to a new Web page. The browser embeds requests, commands, and a small amount of data in URLs, which are transmitted to the server 14. Each URL contains about 50 to 150 bytes of information. A URL often contains information other than a requested file description. For example, when the user of the browser mouse-clicks on a bitmap image displayed on a Web page, the relative coordinates of the mouse's location when the mouse click occurred are also included in the URL: http://www.round.com/cgi-bin/coo.cgi?102,315. Server 14 decodes the coordinates in the URL and determines where on the Web page the user mouse-clicked.
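The server-side decoding of click coordinates from such a URL might look like the following minimal sketch. The function name click_coordinates is hypothetical, and the "x,y" query layout is taken from the example URL above.

```python
from urllib.parse import urlsplit


def click_coordinates(url: str) -> tuple:
    """Extract the (x, y) click position carried in the URL query string."""
    query = urlsplit(url).query       # the text after '?'
    x_text, y_text = query.split(",")  # the two comma-separated coordinates
    return int(x_text), int(y_text)


coords = click_coordinates("http://www.round.com/cgi-bin/coo.cgi?102,315")
url_size = len("http://www.round.com/cgi-bin/coo.cgi?102,315")
print(coords, url_size)
```

Applied to the example URL, the sketch yields the coordinate pair 102 and 315.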
In conventional systems, a host server name such as “www.round.com”, taken from the URL “http://www.round.com/file.html”, is typically sent to a domain-name-system (DNS) server, which is a special Internet server with a look-up table. The DNS server is often a special server at an Internet Service Provider which contains most or all domain names on the entire Internet, or in a local region of the Internet. One DNS server may have to refer the request to another DNS server for unknown host names.
The DNS server looks through the look-up table and finds an entry for the host www.round.com. This entry contains a physical IP address for the Web-server host in the domain round.com. This IP address, such as 230.101.17.101, is returned to the client browser. The browser may then store this IP address in the client computer cache for future use, a process known as browser caching of the IP address.
The browser then uses the IP address to initiate a communication session with the remote computer which physically has the desired Web page, such as the www.round.com server having the file.html file, in order to retrieve a file from the remote server. Once the session with server 14 is established, the URL is sent to the server 14. Server 14 then accesses the data storage device 20, which includes the requested file, the file.html Web page. A copy of the requested file is sent back to the client browser, which reconstructs the Web page from the file copy and displays the Web page on the client computer 10. Subsequently, other files may also be transferred, such as graphic image files which were not directly requested by the URL, but are referenced by the file.html file.
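The look-up and browser-caching behavior just described can be modeled with an in-memory table. This is a simplified sketch: the class name CachingResolver and the dictionary standing in for the DNS server's look-up table are assumptions for illustration, and no real DNS query is issued.

```python
class CachingResolver:
    """Model of a browser that caches IP addresses returned by a DNS server."""

    def __init__(self, dns_table: dict):
        self._dns_table = dns_table  # stands in for the DNS look-up table
        self._cache = {}             # browser-side cache of resolved hosts
        self.dns_queries = 0         # how many look-ups reached the "server"

    def resolve(self, host: str) -> str:
        if host in self._cache:
            # Cache hit: no DNS round trip is needed.
            return self._cache[host]
        self.dns_queries += 1
        address = self._dns_table[host]  # raises KeyError for unknown hosts
        self._cache[host] = address
        return address


resolver = CachingResolver({"www.round.com": "230.101.17.101"})
first_answer = resolver.resolve("www.round.com")   # goes to the table
second_answer = resolver.resolve("www.round.com")  # served from the cache
print(first_answer, resolver.dns_queries)
```

Resolving the same host twice reaches the table only once; the second answer comes from the cache.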
There is a significant amount of latency in conventional computer communication networks, occurring while the client waits for a response from the Web server. Accordingly, there is a need for systems and methods of increasing the performance of the computer networks, preferably without requiring modification of existing browsers. Performance may be increased by the network access equipment sending a resource file physical I/O address in a URL request to either a Web server or directly to a data storage device, thus by-passing some data storage device access layers, including the file I/O layer.
The foregoing and other objects, features, and advantages of the present invention will be apparent from the following detailed description of the preferred embodiments which makes reference to several drawing figures.
One preferred embodiment of the present invention includes a method of utilizing a uniform resource locator (URL) addressing scheme, by a client network access equipment, for efficiently accessing resource files on a data storage device attached to a networked server system. The method uses a client computer browser for sending a URL request for a resource file. The URL request includes the pre-resolved resource file physical I/O address, thereby eliminating some data storage device access procedures, such as logical address mapping. The requested resource file is accessed on the data storage device directly, through the physical I/O address, and then transferred between the data storage device and the client network access equipment.
The URL request resource file physical I/O address is preferably embedded in the client computer browser page URL link, pre-establishing a correspondence between the browser page element and the resource file. In this invention the network may be the Internet, the client computer browser a World Wide Web browser, the server system a Web server, and the resource file a Web page. The Web page has selectable items, including hyper-text objects, and the hypertext objects include the predetermined embedded URL link.
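By way of illustration only, a pre-resolved physical I/O address could be carried in a URL link roughly as sketched below. This summary does not specify an encoding, so the query parameter names lba and blocks, the helper names, and the numeric values are all assumptions made for the sketch and do not represent the format used by the invention.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_link(base_url: str, lba: int, block_count: int) -> str:
    """Embed a hypothetical physical address (start block, length) in a URL."""
    # 'lba' and 'blocks' are illustrative parameter names, not from the source.
    return f"{base_url}?{urlencode({'lba': lba, 'blocks': block_count})}"


def extract_physical_address(url: str) -> tuple:
    """Recover the embedded (start block, block count) pair from the URL."""
    params = parse_qs(urlsplit(url).query)
    return int(params["lba"][0]), int(params["blocks"][0])


# Assumed values: the file is taken to start at block 40960 and span 8 blocks.
link = build_link("http://www.round.com/file.html", 40960, 8)
address = extract_physical_address(link)
print(link, address)
```

With the address recovered from the request itself, a receiver could in principle issue the device read without first mapping a file name to a storage location, which is the saving the embodiment describes.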
Another embodiment of the present invention is a system used with this method embodiment.
Yet another embodiment of the present invention is a method wherein the URL request is directly sent to the data storage device controller, without first sending a Hyper-Text Transfer Protocol (HTTP) request to a server. In the system embodiment of the present invention corresponding to this method embodiment, the data storage device controller is directly connected to the network and has a destination IP address, to allow accessing the requested resource file on the data storage device directly, and to allow the transfer of the requested resource file, between the data storage device and the client network access equipment, to be directly performed by the data storage device controller. The data storage device is preferably used with a SCSI or IDE controller protocol.