As known in the art, computers connected to the Internet use the well-known transmission control protocol/internet protocol (TCP/IP) to conduct network communications with other computers on the network. TCP/IP network packets are transmitted to other computers using IP addresses to identify the source and destination computers. An IP address is currently defined as a 32-bit number which is generally expressed as four octets (converted to their decimal values) separated by periods, for example, 12.34.56.78. Due to the very large number of computers connected to the Internet, it would not be convenient for users to memorize the IP address assigned to each of the computers being accessed. Accordingly, a Domain Name System (DNS) was implemented whereby a computer may be identified by a mnemonic host name, such as www.whitehouse.gov.
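By way of illustration only, the relationship between the 32-bit number and its four-octet notation may be sketched as follows. The function names are hypothetical and are not part of any standard library; the sketch merely assumes the conventional most-significant-octet-first ordering.

```python
def dotted_quad_to_int(address: str) -> int:
    """Pack a four-octet address such as '12.34.56.78' into a 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a four-octet address: {address!r}")
    value = 0
    for octet in octets:
        # Shift the accumulated value left by one octet and append the next one.
        value = (value << 8) | octet
    return value


def int_to_dotted_quad(value: int) -> str:
    """Unpack a 32-bit integer into its four-octet decimal notation."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

Under these assumptions, converting 12.34.56.78 to an integer and back yields the original notation unchanged.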
DNS is a name resolution method that allows users and applications to initiate network communications with other computers on the network using a hostname rather than an IP address. A DNS server maintains a database of hostnames and their corresponding IP addresses. A user can open a web page in his or her web browser by directing the application to connect to a particular uniform resource locator (URL) which identifies the web server and the particular document to be downloaded to the browser. When the sending computer or application needs to open a network connection to another computer, it first contacts a DNS server to resolve the other computer's hostname to its IP address. DNS servers are distributed throughout the Internet. DNS servers communicate with other DNS servers to resolve a network address.
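The lookup described above may be sketched, purely for illustration, as an in-memory table that falls back to another server when it has no record of its own. The class name ToyNameServer, the hostname www.example.com and the address 192.0.2.10 are assumptions chosen for the example; an actual DNS server uses a wire protocol and delegation scheme that this sketch does not attempt to model.

```python
from typing import Dict, Optional


class ToyNameServer:
    """Hypothetical stand-in for a DNS server: a hostname-to-address table."""

    def __init__(self, records: Dict[str, str],
                 upstream: Optional["ToyNameServer"] = None) -> None:
        # Hostnames are compared case-insensitively, so store them lowercased.
        self.records = {name.lower(): address for name, address in records.items()}
        self.upstream = upstream  # another server to ask when no local record exists
        self.queries = 0          # number of lookup requests this server has received

    def resolve(self, hostname: str) -> str:
        """Return the address for hostname, asking the upstream server if needed."""
        self.queries += 1
        key = hostname.lower().rstrip(".")
        if key in self.records:
            return self.records[key]
        if self.upstream is not None:
            return self.upstream.resolve(hostname)
        raise LookupError(f"no address known for {hostname!r}")
```

In this sketch, a local server created with an empty table and an upstream server holding the record for www.example.com returns the upstream server's answer, and each server counts the request it handled.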
The standard convention for a URL is ‘protocol://host's name/name of file.’ The protocol may be, for example, FTP (file transfer protocol), telnet or HTTP (hypertext transfer protocol). Typically, HTTP is used to transfer information (also referred to as “content”) from a web server application for display by a web browser (a web client computer application). HTTP is the set of rules for exchanging files, for example, text, graphic images, sound, and video, on the Internet. Content is generally organized into groups of data, referred to as a “web page,” defined in documents downloaded from the web server to the browser. The web page is a text file that contains text and a set of HTML (hypertext markup language) tags that describe how the text should be formatted when a browser displays the web page for the user.
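As a non-limiting example, the three parts of the URL convention may be separated using the urlsplit function of the Python standard library. The helper name describe_url and the example URL are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the protocol, host name and file name it identifies."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,  # e.g. 'http' or 'ftp'
        "host": parts.hostname,    # the name that must be resolved to an IP address
        "file": parts.path,        # the document requested from that host
    }
```

For instance, describe_url("http://www.example.com/docs/index.html") reports the protocol as http, the host as www.example.com and the file as /docs/index.html.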
An HTML tag is a code element that tells the web browser what to do with the text. Each tag appears as letters or words between a ‘<’ and a ‘>.’ For example, <HTML> tells the browser that this is the beginning of an HTML document and <TITLE> tells the browser that this is the title of the page. HTML defines a document format, for example, the page layout, fonts and image elements (graphic elements). Each of the tags defining an image element includes the location of the image element, for example, <img src=“URL”> or <img src=“name of the file”>. The HTML document also has the ability to link text and/or an image to another document or section of a document. Each link contains the URL of a web page residing on the same server or on any server on the Internet, for example, <a href=“URL”>. The web browser interprets the set of HTML tags within the HTML document and displays the resulting web page for the user.
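One way a program might locate the image and link tags described above is sketched below using the HTMLParser class of the Python standard library. The class name ReferenceCollector and the helper collect_references are hypothetical, and the sketch makes no claim about how any particular web browser parses HTML.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class ReferenceCollector(HTMLParser):
    """Collect the src of each <img> tag and the href of each <a> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.image_sources: List[str] = []
        self.link_targets: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this hook.
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.image_sources.append(attributes["src"])
        elif tag == "a" and attributes.get("href"):
            self.link_targets.append(attributes["href"])


def collect_references(html_text: str) -> Tuple[List[str], List[str]]:
    """Return (image sources, link targets) in document order."""
    collector = ReferenceCollector()
    collector.feed(html_text)
    collector.close()
    return collector.image_sources, collector.link_targets
```

Each image source collected in this manner corresponds to one image element that would have to be downloaded separately from the HTML document itself.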
FIG. 1 is a schematic diagram of web page 100 as it may be displayed on a client computer. Web page 100 may include a plurality of textual information, represented by text displays 102 and 104 in FIG. 1. Web page 100 may also include image elements 106, 108, 110 and 112. These image elements are displayed on the web page via instructions to download image elements from a URL in an HTML document. For example, URL 116 is associated with image element 106 as shown in FIG. 1. Similarly, URLs 118, 120 and 122 are associated with image elements 108, 110 and 112, respectively. Each image element is independently downloaded. The URLs are shown in FIG. 1 with a dashed outline to indicate that the actual URL is not typically displayed on the web browser, while the image elements specified by the URLs are displayed.
FIG. 2 is a schematic diagram showing a basic architecture used to provide web-based services. This architecture includes client computer 200 and server 202. Client computer 200 can include a processor 204 coupled via bus 206 to network port 208, and memory 210. Client computer 200 can communicate with server 202 via network 212. Server 202 can include a processor 214 coupled via bus 216 to network port 218, and memory 220. One or more routers may be used within network 212 to direct network packets to their destination. Router 222 is one such router. The function and operation of conventional IP routers are well-known in the art. For example, router 222 receives network packets from client computer 200. For each of the network packets, the router determines the best available route, using one or more routing tables, and sends the packets to their destination via the best available route.
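The route selection performed by a router such as router 222 may be illustrated with the following sketch. It assumes, for the purpose of the example only, that the “best available route” is the routing table entry with the longest matching address prefix, which is a common selection rule; the table contents and the names choose_next_hop, gateway-a, gateway-b and gateway-c are hypothetical.

```python
import ipaddress
from typing import Iterable, Tuple


def choose_next_hop(routing_table: Iterable[Tuple[str, str]], destination: str) -> str:
    """Pick the next hop whose prefix matches destination most specifically."""
    address = ipaddress.ip_address(destination)
    best_network = None
    best_hop = None
    for prefix, next_hop in routing_table:
        network = ipaddress.ip_network(prefix)
        if address not in network:
            continue  # this entry does not cover the destination
        # Prefer the entry with the longest prefix (the most specific match).
        if best_network is None or network.prefixlen > best_network.prefixlen:
            best_network, best_hop = network, next_hop
    if best_hop is None:
        raise LookupError(f"no route to {destination}")
    return best_hop


# Illustrative table: a default route plus two progressively narrower prefixes.
EXAMPLE_TABLE = [
    ("0.0.0.0/0", "gateway-a"),
    ("192.0.2.0/24", "gateway-b"),
    ("192.0.2.128/25", "gateway-c"),
]
```

With this table, a packet addressed to 192.0.2.200 would be forwarded to gateway-c, whereas a packet addressed to an address outside 192.0.2.0/24 would fall through to the default route.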
FIG. 3 is a flow diagram showing steps used in conventional web-based systems to download a web page. In step 300, the web browser receives a user's request to open a particular URL. In step 302, the web browser sends a request for name resolution to router 222 (shown in FIG. 2) and the router forwards the request to a DNS server. In step 304, the DNS server responds to the request and the web browser receives the IP address assigned to the web server. In step 306, the web browser opens a network connection with the IP address supplied by the DNS server and sends an HTTP request to the web server, asking for the file. In step 308, the web server responds to the request and the web browser receives an HTML document for the web page from the web server. Once the HTML document has been downloaded, the web browser closes the network connection in step 310.
Next, in step 312, the web browser examines the HTML document and determines whether or not there are additional image elements to be downloaded for display within the web page. If there are no additional image elements to be downloaded for display, the process ends. Otherwise, in step 314, the web browser requests name resolution for the web server's hostname indicated in the URL associated with the image element. This URL is indicated within the HTML document downloaded in step 308. The DNS server responds to the DNS lookup request by providing the IP address corresponding to the web server's hostname. In step 316, the web browser receives the IP address assigned to the web server. In step 318, the web browser opens a network connection using the IP address supplied by the DNS server. In step 320, the web browser downloads the image element specified by the URL in the HTML document. Once the image element has been downloaded, the web browser closes the network connection in step 322. The process repeats steps 312-322 until all image elements identified in the HTML document have been downloaded.
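The conventional sequence of FIG. 3 may be summarized, for illustration only, by the following sketch. The resolve and fetch callables are hypothetical stand-ins for the DNS lookup and the HTTP transfer, respectively, so that the sketch performs no actual network activity; the step numbers in the comments refer to FIG. 3.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _ImageFinder(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src")
        if tag == "img" and src:
            self.sources.append(src)


def download_page(page_url, resolve, fetch):
    """Download a page and its images, resolving the hostname for every request.

    resolve(hostname) -> address and fetch(address, url) -> content are supplied
    by the caller. Returns (html, images, log), where log records each event.
    """
    log = []

    def get(url):
        host = urlsplit(url).hostname
        address = resolve(host)         # steps 302-304 and 314-316: name resolution
        log.append(("dns", host))
        log.append(("open", address))   # steps 306 and 318: open a connection
        content = fetch(address, url)   # steps 306-308 and 320: transfer the file
        log.append(("close", address))  # steps 310 and 322: close the connection
        return content

    html = get(page_url)
    finder = _ImageFinder()
    finder.feed(html)
    finder.close()

    images = {}
    for src in finder.sources:          # step 312: any image elements remaining?
        image_url = urljoin(page_url, src)
        images[image_url] = get(image_url)
    return html, images, log
```

Because every call to the inner get function begins with a name resolution, the log produced by this sketch contains one "dns" entry for the HTML document and one further "dns" entry for each image element.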
As can be seen from the steps shown in FIG. 3, the web browser may make numerous DNS lookup requests each time a single web page is downloaded, even though the web page typically references image elements that are stored on the same web server host as the web page. The repeated DNS lookup operations generally request name resolution for the same host many times in succession just to render a single web page. If an HTML document (web document) for a web page includes, for example, ten different image elements, the web browser will perform a total of eleven DNS lookup operations (one to download the HTML document and one operation for each image element), even if the DNS lookup operations are requesting name resolution for the same host. As web content developers continue to increase the complexity of web pages, the number of image elements within a particular web document may become very large. Accordingly, the load on DNS servers has increased. The load on DNS server 224 is further increased by the multiple requests for the same information. Not only can the DNS server itself be impaired by the increased load, but the network traffic across network 212 is also increased with each DNS lookup request, resulting in poorer performance across the network.
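The arithmetic of the preceding example may be checked with the short sketch below. The function name lookup_counts and the term “repeated” (meaning a lookup for a hostname that has already been resolved while rendering the same page) are labels adopted here for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def lookup_counts(page_url, image_urls):
    """Count the DNS lookups a conventional browser would issue for one page.

    Returns (total, repeated, per_host): the total number of lookups, the number
    that repeat an earlier lookup for the same host, and the lookups per host.
    """
    hosts = [urlsplit(url).hostname for url in [page_url, *image_urls]]
    per_host = Counter(hosts)
    total = len(hosts)                 # one lookup per downloaded file
    repeated = total - len(per_host)   # lookups beyond the first for each host
    return total, repeated, per_host
```

For a web document and ten image elements that all reside on one host, this sketch reports eleven lookups in total, ten of which repeat the first.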
A need therefore exists for systems and methods of reducing requests for name resolution for web-based services.