The Internet allows for vast amounts of information to be communicated over any number of interconnected networks, computers, and network devices. Typically, information or content is located at websites on one or more servers, and a user can retrieve the content using a user agent, such as a web browser, running on a client device. For example, the user can input a webpage address into the web browser or access a web link, which sends requests to a server to access and provide the content on the respective website. This type of communication is commonly referred to as “web browsing.”
Web browsing is enjoyed by millions of users on the Internet. However, accessing content on a network that is constrained by bandwidth and latency can make web browsing less enjoyable. Bandwidth is the measurement of the speed of a network link. Lower bandwidth network links take more time to transfer content than higher bandwidth links. Latency is a measurement of the responsiveness of a network link. Higher latency networks take more time than lower latency networks to send a single byte of data over a network link.
Many networks can suffer from low bandwidth and/or high latency problems that degrade the enjoyment of web browsing for users. Wireless wide area networks (WANs), such as GPRS or CDMA 1xRTT wireless networks, and traditional plain old telephone service (POTS) dialup networks are just a few examples of networks that can exhibit bandwidth and latency problems. These networks may take 50 to 100 seconds to download content from a web page due to bandwidth and latency constraints, whereas a high-speed local area network (LAN) may be less prone to such constraints and can download the same content in 5 to 10 seconds. Waiting a long time to view content for a web page is annoying to users and utilizes the network inefficiently.
Utilizing a network efficiently is also a particular concern for network providers, who must share limited resources among many users. For example, wireless WAN providers share very expensive and limited spectrum among all of their data and voice subscribers. Thus, efficient use of the spectrum is imperative. Furthermore, in a wireless WAN environment, data transmission is more susceptible to interference and noise than in a wired environment. Interference and noise delay the data transmission process and, more importantly, cause variability and unpredictability in the delay. A web site that downloads its objects in 50 seconds the first time may take 100 seconds to download the same objects the next time. Thus, in order to address these concerns, network providers must use existing network infrastructure efficiently to provide users with the most efficient downloading of content.
Furthermore, the manner in which information is transferred on a network plays an important role in the network's efficiency. On the World Wide Web (WWW), the Hypertext Transfer Protocol (HTTP) sets forth the rules for transferring content, such as files or objects, on the web. This protocol uses requests and responses for transferring content. For example, a user agent (e.g., a web browser) sends a request to the content server for a particular file or object of a web page, and the server of the web page queries a database for the object and sends the object back to the user agent as part of a response. This process continues until every object in the web page has been downloaded to the user agent.
As web pages have become more complex, a common website may contain hundreds of objects on its web pages. Such objects may include text, graphics, images, sound, etc. The web pages may also have objects located across multiple servers. That is, one server may provide dynamic content (e.g., content that remembers the last books ordered by a user) for a web page, whereas other servers may provide static but rotating content such as an advertisement, and still others may provide the static content of the site. As such, before a user can view a web page, hundreds of objects may require downloading from multiple servers. Each server, however, may take a different amount of time to service a request for an object, which contributes to latency. Thus, the latency for each server may vary by orders of magnitude, e.g., one server may respond in milliseconds whereas another server may respond in seconds.
Latency constraints, however, should not be confused with bandwidth constraints. FIG. 1 illustrates the retrieval sequence for objects on a bandwidth constrained network using HTTP over TCP/IP. In this illustration, each request for an object requires a connection to be established between a client and a server with an exchange of “SYN” and “ACK” messages necessary for TCP/IP. Due to the relatively small latency of the network and the responsiveness of the server, the ACK message is sent back to the client quickly. However, because the network is bandwidth constrained, a response back to the client takes a relatively long time. This is exacerbated if the object for the request is large in nature and must be broken into many packets as shown in FIG. 1. As a result, the overall download time for each request/response is dominated by the time it takes to download all the packets of the individual objects on a network link. Such download time can be calculated by adding the size of each of the individual objects and dividing the aggregate size by the link bandwidth.
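As a rough illustration of the calculation described above, the following sketch sums the object sizes and divides the aggregate by the link bandwidth. The function name, the units (bytes for sizes, bits per second for bandwidth), and the sample values are assumptions chosen for illustration. The sketch ignores protocol overhead and connection setup, which the bandwidth constrained case treats as negligible.

```python
def bandwidth_bound_download_time(object_sizes_bytes, link_bandwidth_bps):
    """Estimate download time in seconds when link bandwidth dominates.

    object_sizes_bytes: sizes of the individual objects, in bytes (assumed unit).
    link_bandwidth_bps: link bandwidth, in bits per second (assumed unit).
    """
    if link_bandwidth_bps <= 0:
        raise ValueError("link bandwidth must be positive")
    # Add the size of each object, then convert the aggregate to bits.
    total_bits = sum(object_sizes_bytes) * 8
    # Divide the aggregate size by the link bandwidth.
    return total_bits / link_bandwidth_bps


# Hypothetical page: three objects totaling 100,000 bytes on a 40 kbit/s link.
sizes = [20_000, 50_000, 30_000]
print(bandwidth_bound_download_time(sizes, 40_000))  # 20.0
```

Under this model, reducing the aggregate size shortens the download proportionally, while the number of objects has no effect.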
FIG. 2 illustrates the retrieval sequence for objects on a latency constrained network using HTTP over TCP/IP. In this illustration, the network is not limited by bandwidth, but instead by the latency, or the time it takes to send a packet from the client to the server through the network. In particular, when a user agent requests small objects on a network affected by high latency, the overall download time is dominated by the time it takes a request to travel to the server, the time the server takes to process the request, and the time it takes for a response to travel back to the user agent. The download time of a web page with many objects can be calculated by adding the round trip time (RTT), i.e., the time for the request to travel to the server and the response to travel back to the client, to the response time of the server and multiplying that sum by the number of objects on the web page.
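The latency constrained calculation can be sketched in the same way. The function and parameter names and the sample values below are assumptions for illustration. The sketch further assumes that objects are requested one at a time and that every request sees the same round trip time and server response time.

```python
def latency_bound_download_time(num_objects, round_trip_time_s, server_response_time_s):
    """Estimate download time in seconds when latency dominates.

    Assumes sequential requests with a uniform round trip time and a
    uniform server response time, both in seconds.
    """
    if num_objects < 0:
        raise ValueError("number of objects must not be negative")
    # Per-object cost: round trip time plus server response time.
    per_object_s = round_trip_time_s + server_response_time_s
    # Multiply the per-object cost by the number of objects on the page.
    return num_objects * per_object_s


# Hypothetical page: 100 small objects, 0.5 s RTT, 0.25 s server response time.
print(latency_bound_download_time(100, 0.5, 0.25))  # 75.0
```

Under this model, object size does not appear at all: the estimate grows with the number of objects and with the per-request delay.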
Unfortunately, user agents are themselves a source of latency when downloading an object. This latency is a result of the user agent processing the downloaded objects and attempting to display these objects in the manner the web page designers intended. Web page designers use a multitude of different standards to instruct user agents how a web page is supposed to look once rendered. The number of standards is increasing over time and includes markup languages (e.g., Hyper Text Markup Language (HTML), Extensible HTML (XHTML), Wireless Markup Language (WML)), objects that define the overall style of the page (e.g., Cascading Style Sheets (CSS)), objects that are executed by the user agent (e.g., JavaScript), and image objects (e.g., JPEG, GIF, PNG). After downloading each object, the user agent needs time to process the object and determine its impact on the displayed web page. The processing time of each object may delay the download of subsequent objects. For CPU constrained devices (e.g., phones), the latency from browser processing time can contribute significantly to the overall download time of a web page. Also, for poorly implemented user agents, certain objects may significantly increase the time to render a web page. Even over a high bandwidth and low latency network, the implementation of the user agent can result in these object processing times severely impacting the download time of the web page.