As the Internet has matured, the characteristics of the available content on the Internet have changed. Sound and video content is now included with the traditional textual content. However, this new content on the Internet requires a greater connection speed (i.e., bandwidth) than was commonly available a few years ago.
FIG. 1 illustrates an example of a typical Internet configuration. It includes a server (such as media server 20), which is coupled to the Internet 30. The server typically includes one or more physical server computers 22 with one or more physical storage devices and/or databases 24. On the other side of an Internet transmission is a client 90, which is connected via one of many available Internet Service Providers (ISPs) 80. Herein, a server is a network entity that sends data and a client is a network entity that receives data.
Cloud 30 is labeled the Internet, but it is understood that this cloud represents that portion of the Internet that only includes that which is illustrated therein. Inside such cloud are the routers, transmission lines, connections, and other communication devices that more-often-than-not successfully transmit data between clients and servers. Inside exemplary Internet cloud 30 are routers 32-44; two satellite dishes 46 and 50; and a satellite 48. The links between these devices represent the possible paths that a data packet may take on its way between the server and the client.
In general, a communication device on a network (such as the Internet) is any device that facilitates communication over the network between two entities, and includes the two entities. Examples of such entities include the server 20 and the client 90.
The Layers of the OSI Model
The Open System Interconnection (OSI) model is an ISO standard for worldwide communications that defines a networking framework for implementing protocols in seven layers. Control is passed from one layer to the next, starting at the application layer in one station, proceeding to the bottom layer, over the channel to the next station and back up the hierarchy. A person of ordinary skill in the art is familiar with the OSI model.
Most of the functionality in the OSI model exists in all communications systems, although two or three OSI layers may be incorporated into one. These layers are also called “levels.”
Generally, the hardware implements the physical layer. Such hardware may include a network card, a modem, or some other communications device. Typically, the kernel of an operating system (OS) implements the transport layer.
The top of the stack is the applications in the application layer. This includes any application that communicates with entities outside of the computer, such as a Web browser, a media player, and an email program. The application layer has the least control of details of communication between entities on a network, such as the Internet.
Bandwidth
Bandwidth is the amount of data that can be transmitted in a fixed amount of time. For example, bandwidth between media server 20 in FIG. 1 and media client 90 is calculated by the amount of data (e.g., 1000 bits) that may be transmitted between them in a unit of time (e.g., one second). More specifically, data may be transmitted between devices at a rate of approximately 56,000 bits per second. That may be called 56 kilo-bits per second (Kbps).
As shown in FIG. 1, a transmission over the Internet travels across multiple links before it reaches its destination. Each link has its own bandwidth. Like a chain being only as strong as its weakest link, the maximum bandwidth between server 20 and client 90 is that of the link therebetween with the lowest bandwidth. Typically, that is the link between the client 90 and its ISP 80. That lowest bandwidth is the maximum de facto bandwidth.
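The weakest-link rule can be illustrated with a minimal sketch. The function name and the link speeds below are hypothetical and chosen only for illustration; they are not taken from FIG. 1.

```python
def path_bandwidth(link_bandwidths_bps):
    """Return the de facto bandwidth of a path: that of its slowest link."""
    if not link_bandwidths_bps:
        raise ValueError("a path needs at least one link")
    return min(link_bandwidths_bps)


# Hypothetical link speeds, in bits per second, along one path from
# a server to a client. The last entry stands for the client-to-ISP link.
links = [10_000_000, 1_544_000, 45_000_000, 56_000]
bottleneck = path_bandwidth(links)
```

Under these assumed values, the client-to-ISP link determines the result, which matches the typical case described above.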
Herein, unless otherwise apparent from the context, references to bandwidth between network entities (such as server 20 and client 90) are assumed to be to the maximum de facto bandwidth therebetween.
Bandwidth may also be called “connection speed”, “speed”, or “rate”. In references to bandwidth measured by bits per second, it may also be called “bit rate” or “bitrate.”
Streaming Media
Streaming is a technique for transferring multimedia data such that it can be processed as a steady and continuous stream. Streaming technologies are becoming increasingly important with the growth of the Internet because most users do not have fast enough access to download large multimedia files quickly. With streaming, the client browser or plug-in can start displaying the data before the entire file has been transmitted.
For streaming to work, the client side receiving the data must be able to collect the data and send it as a steady stream to the application that is processing the data and converting it to sound or pictures. This means that if the streaming client receives the data more quickly than required, it needs to save the excess data in a buffer. If the data doesn't come quickly enough, however, the presentation of the data will not be smooth.
Within the context of an audio and/or visual presentation, “media” and “multimedia” are used interchangeably herein. Media refers to the presentation of text, graphics, video, animation, and/or sound in an integrated way.
“Streaming media” is an audio and/or visual presentation that is transmitted over a network (such as the Internet) to an end-user. Such transmission is performed so that the presentation is relatively smooth and not jerky. Long pauses while additional frames are being downloaded to the user are annoying to the user. These annoyances encourage a user to avoid viewing future streaming media.
Smoothly Transmitting Streaming Media
Since the bandwidth determines the rate at which the client will receive data, a streaming media presentation may only be presented at a rate no greater than what the bandwidth allows. For example, assume media server 20 needs to send data at 50 Kbps to the client 90 in order to smoothly “play” a streaming media presentation. However, the bandwidth between the client and server is only 30 Kbps. The result is a jerky and jumpy media presentation.
In an effort to alleviate this problem, streaming media presentations are often encoded into multiple formats with differing degrees of qualities.
The formats with the lowest quality (e.g., small size, low resolution, small color palette) have the least amount of data to push to the client over a given time. Therefore, a client over a slow link can smoothly present the streaming media presentation, but the quality of the presentation suffers.
The formats with the highest quality (e.g., full screen size, high resolution, large color palette) have the greatest amount of data to push to the client over a given time. Therefore, the client with a fast link can smoothly present the streaming media presentation and still provide a high quality presentation.
Select-a-Bandwidth Approach
When a server sends streaming media to a client, it needs to know what format to use. Thus, in order to select the proper format, the server must know the bandwidth between the server and the client.
The easiest way to accomplish this is to ask the user of the client what their bandwidth is. Since a client's link to the Internet is typically the bandwidth bottleneck, knowing the bandwidth of this link typically indicates the actual bandwidth.
FIG. 2 shows a cut-away 100 of a Web page displayed on a client's computer. Inside the cut-away 100 is a typical user-interface 110 that may be used to ask a user what their connection speed is. The user clicks on one of the three buttons 112, 114, and 116 provided by the user-interface 110. If the user clicks on button 112, the server delivers data from a file containing streaming media in a format designed for transmission at 28.8 Kbps. Likewise, if the user clicks on button 114, the server delivers data from a file containing streaming media in a format designed for transmission at 56.6 Kbps. If the user clicks on button 116, the server delivers data from a file containing streaming media in a format designed for transmission at a rate greater than 56.6 Kbps and up to the typical speed of a T1 connection.
However, the primary problem with the “select-a-bandwidth” approach is that it requires a thoughtful selection by a user. This approach invites selection errors.
It requires that a user care, understand, and have knowledge of her connection speed. Often, a user does not pay particular attention to which button to press. The user may only know that a media presentation will appear if the user presses one of these buttons. Therefore, they press any one of them.
Often, a user does not understand the concept of bandwidth. A user may choose button 116 because she may want to see the presentation at its highest quality. This user does not realize that seeing the presentation at its highest quality may result in a non-smooth presentation because her Internet connection cannot handle the rate that the data is being sent through it.
If she does understand the concept of bandwidth, then the user may not know her bandwidth. A user may simply be ignorant of her bandwidth. In addition, varying degrees of noise may cause varying connection speeds each time a user connects to the Internet. Furthermore, some types of connections (such as a cable modem) can have wide degrees of connection speed depending upon numerous factors.
Moreover, the user needs to understand the implications of an incorrect choice. A user needs to be educated so that she understands that she needs to select an option that is equal to or less than her bandwidth to get a smooth presentation. But she should not choose one that is significantly less than her bandwidth. If she does, then she will be seeing a smooth presentation at a lower quality than she could otherwise see at her available bandwidth.
As can be seen by the above discussion, this manual approach is often confusing and intimidating to many users. Therefore, it often results in incorrect selections.
What's more, maintaining multiple files (one for each bandwidth) at the media server adds to the overhead of maintaining a Web site.
Automatic Bandwidth Detection
To overcome these problems, media servers may use a single file containing subfiles for multiple bandwidths. In addition, media servers may automatically detect the bandwidth.
This single file is called an MBR (multiple bit rate) file. MBR files typically include multiple differing “bands” or “streams.” These bands may be called “subfiles.” A user only clicks on one link. Automatically, behind the scenes, the server determines the right speed band to send to the client.
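Once a bandwidth has been determined, choosing a band reduces to picking the highest-rate band that does not exceed it. The following is a minimal sketch of that selection, not a description of any particular server. The band rates and the function name are hypothetical, and falling back to the lowest band when none fits is an assumption made here for illustration.

```python
def select_band(band_rates_bps, measured_bps):
    """Pick the highest-rate band that fits within the measured bandwidth.

    Assumption: if no band fits, fall back to the lowest-rate band.
    """
    if not band_rates_bps:
        raise ValueError("an MBR file needs at least one band")
    fitting = [rate for rate in band_rates_bps if rate <= measured_bps]
    return max(fitting) if fitting else min(band_rates_bps)


# Hypothetical band rates, in bits per second, inside one MBR file.
BANDS = [28_800, 56_000, 300_000]
chosen = select_band(BANDS, 100_000)
```

With a measured bandwidth of 100,000 bits per second, the sketch chooses the 56,000 band: the 300,000 band would not play smoothly, and the 28,800 band would give up quality needlessly.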
This automatic speed detection may take a long time. This means that an additional five seconds to a minute (or more) is added to the user's wait for the presentation to begin. This delay for existing automatic speed detection is because of long “handshaking” times while the speed determination is going on.
One existing automatic detection technique involves sending multiple data packets for measuring the speed between the server and client. This technique is described further below in the section titled, “Multiple Measurement Packets Technique.”
Bandwidth Measurement Packets
Typically, automatic bandwidth detection techniques measure bandwidth between entities on a network by sending one or more packets of a known size.
FIG. 3 shows a time graph tracking the transmission of two such packets (Px and Py) between a sender (e.g., server) and a receiver (e.g., client). The server and client sides are labeled as such. On the graph, time advances downward.
Time ta indicates the time at the server at which the transmission of Px begins. Time tb indicates the time at the server at which the transmission of Px ends. Similarly, time t0 indicates the time at which the client begins receiving Px. Time t1 indicates the time at which the client completes reception of Px. At t1, the network hardware presumably passes the packet up the communication layers to the application layer.
Packet Py is similarly labeled on the time graph of FIG. 3. tc is the server time at which the transmission of Py begins. td is the server time at which the transmission of Py ends. Similarly, t2 is the client time at which it begins receiving Py, and t3 is the client time at which it completes reception of Py. At t3, the network hardware presumably passes the packet up the communication layers to the application layer.
Bandwidth measurement using a single packet. In a controlled, laboratory-like environment, measuring bandwidth between two entities on a network is straightforward. To make such a calculation, send a packet of a known size from one entity to the other and measure the transmission latency, which is the amount of time it takes a packet to travel from source to destination. Given this scenario, one must know the time that the packet was sent and the time that the packet arrived.
This technique is nearly completely impractical outside of the laboratory setting. It cannot be used in an asynchronous network (like the Internet) because it requires synchronization between the client and server. Both must be using the same clock.
Alternatively, the client may track the time it begins receiving a packet (such as t0 for Px) and the time the packet is completely received (such as t1 for Px).
FIG. 3 shows packet Px being sent from a server to a client. Px has a known size in bits of PS. The formula for calculating bandwidth (bw) is
$$bw(P_x) = \frac{PS}{t_1 - t_0} \qquad \text{Formula 1 (Single Packet)}$$
This technique works in theory, but unfortunately does not work in practice. Only the hardware knows when a packet is initially received. Therefore, only the hardware knows when t0 is.
The other communication layers (such as the transport layer and the application layer) can only discover the time when the packet is completely received by the hardware. That is when the hardware passes it up to them. This completion time for packet Px is t1. It is not possible to calculate bandwidth knowing only one point in time.
Packet-pair. A technique called packet-pair is used to overcome these problems in asynchronous networks. With packet-pair, two identical packets are sent back-to-back. The server sends a pair of packets, one immediately after the other. Both packets are identical; thus, they have the same size (PS). The bandwidth is determined by dividing the packet size by the time difference in reception of each packet.
Each packet has specific measurable characteristics. In particular, these characteristics include its packet size (PS) and the measured time at which such a packet arrives (e.g., t0 through t3 in FIG. 3). Some characteristics (such as packet size) may be specified rather than measured, but they may be measured if so desired.
As shown in FIG. 3, the server sends packet Px. The client's hardware begins receiving the packet at t0. When reception of the packet is complete at t1, the hardware passes it up the communication layers. Ultimately, it is received by the destination layer (e.g., application layer) at presumably t1.
After the server sends Px (which completed at tb), it immediately sends packet Py at tc. It is important that there be either 1) absolutely no measurable delay between tb and tc or 2) a delay of a known length between tb and tc. Herein, to simplify the description, it will be assumed that there is no measurable delay between tb and tc.
The client's hardware begins receiving Py at t2. When reception of the packet is complete at t3, the hardware passes it up the communication layers. Ultimately, it is received by the destination layer (e.g., application layer) at presumably t3.
FIG. 3 shows no delay between t1 (the time of completion of reception of Px) and t2 (the time reception of Py begins). Theoretically, this will always be the case if Px and Py are transmitted under identical conditions. In practice, this is often the case because Py is sent immediately after Px.
Using packet-pair, the formula for calculating bandwidth (bw) is
$$bw(P_x P_y) = \frac{PS}{t_3 - t_1} \qquad \text{Formula 2 (Packet-Pair)}$$
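Formula 2 can be sketched as a small function. The function name and the sample values below are hypothetical; the only inputs are the packet size PS and the two completion times t1 and t3 that the receiving layer can observe.

```python
def packet_pair_bandwidth(packet_size_bits, t1, t3):
    """Formula 2: bandwidth in bits per second.

    t1 and t3 are the times (in seconds, on the receiver's clock) at which
    reception of Px and Py, respectively, completed.
    """
    interval = t3 - t1
    if interval <= 0:
        raise ValueError("Py must finish arriving after Px")
    return packet_size_bits / interval


# Hypothetical measurement: two 8000-bit packets whose receptions
# complete 0.25 seconds apart.
bw = packet_pair_bandwidth(8000, t1=0.5, t3=0.75)
```

Because only the difference between two receiver-side timestamps is used, no clock synchronization with the sender is needed.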
This technique works in theory and in practice. However, it only works well over a network that is relatively static.
For example, in FIG. 1, assume the network consists of only the server 20; routers 32, 34, and 36; a specific ISP of ISPs 80; and client 90. Further, assume that the links between the nodes on this static network are fixed and have a consistent bandwidth. In this situation, the packet-pair technique provides an accurate and effective measurement of bandwidth.
Issues related to using Packet-pair over the Internet. However, the packet-pair technique does not work well over a dynamic network, like the Internet. A dynamic network is one where there is a possibility that a packet may be handled in a manner different from an earlier packet or different from a later packet. In particular, there are problems with a TCP network.
FIG. 1 illustrates examples of handling differences found on a dynamic network. Assume that all packets are traveling from the server to the client (from left to right in FIG. 1). Assume that packets 60-68 were sent back-to-back by the server 20 to the client 90.
Notice, as illustrated in FIG. 1, that packets may take different routes. In addition, some routes may significantly delay the packet transmission. This is especially true if the packet is transmitted via an apparently unusual (but not necessarily uncommon) route, such as wireless transmission, overseas via an underwater cable, satellite transmission (as shown by dishes 46 and 50 and satellite 48), etc. A router (such as router 42) may delay one or more packets (such as 63 and 64) more than another may by temporarily storing them in a memory (such as buffer 43).
Multiple Measurement Packets Technique
To overcome these problems, conventional automatic bandwidth measurement techniques use multiple packets. A server sends several (many more than two) packets and calculates the speed of each. Conventional wisdom on bandwidth measurement indicates that in order to get accurate measurements several pairs of packets must be sent repeatedly over several seconds to several minutes. Herein, this technique is called “multiple-packets” to distinguish it from the above-described “packet-pair” technique.
Typically, the ultimate bandwidth is determined by finding the average of the many bandwidth measurements. This averaging smooths out variances in delays for each packet; however, it does not compensate for packet compression during transmission. One or two extremely incorrect measurements will skew the average.
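The sensitivity of a plain average to a single bad measurement can be shown with a short sketch. All sample values are hypothetical; the large outlier stands in for a measurement distorted by packet compression.

```python
from statistics import mean

# Hypothetical per-measurement bandwidths, in bits per second.
typical_samples = [55_000, 56_000, 57_000, 56_000]

# The same samples plus one extremely incorrect measurement.
samples_with_outlier = typical_samples + [560_000]

clean_average = mean(typical_samples)
skewed_average = mean(samples_with_outlier)
```

Here the clean average is 56,000 bits per second, while the single outlier pulls the average to 156,800 bits per second, nearly three times the typical value.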
Unfortunately, this technique takes a long time relative to the existing wait for the user between click and media presentation. A long time may be five seconds to several minutes depending on the data and the situation. Such a delay adds to the annoyance factor for the user who wishes to experience the media presentation. This is not an acceptable delay. Since there are no other options available using conventional techniques, the user has been forced to endure these delays.
No existing automatic bandwidth measurement can nearly instantaneously measure bandwidth across the Internet using a pair of packets. No existing automatic bandwidth measurement can make such measurements at the application layer, which would avoid modifying the operating system. No existing automatic bandwidth measurement addresses measurement distortion caused by packet compression.
Transport Layer Implementation
The conventional approaches typically modify the kernel of the operating system (OS) to perform automatic bandwidth measurements. More specifically, these approaches modify the transport layer of the OSI model, and such layer is often located within the kernel of the OS. In general, such modifications are undesirable because they are generally less stable and more expensive than implementations that do not modify the OS.
If these approaches could be implemented within an application (thus, at the application layer), such modifications would not be necessary. However, no existing packet-pair approach measures bandwidth at the application layer. This is because the application layer has less control over the details of the actual communication over the network. In particular, an application has even less control using TCP than it would with UDP (User Datagram Protocol).
TCP and UDP are discussed below in the section titled “TCP and UDP.” The transport and application layers are part of the seven layers of the OSI model discussed above.
TCP and UDP
Over the Internet (and other networks), packets of data are usually sent via the TCP or UDP protocols. TCP is universally accepted and understood across the Internet.
TCP (Transmission Control Protocol) is one of the main protocols in TCP/IP networks (such as the Internet). Whereas the IP protocol deals only with packets, TCP enables two hosts to establish a connection and exchange streams of data. TCP guarantees delivery of data and guarantees that packets will be delivered in the same order in which they were sent.
UDP (User Datagram Protocol) is a connectionless protocol that (like TCP) runs on top of IP networks. Unlike TCP/IP, UDP/IP provides very few error recovery services, offering instead a direct way to send and receive packets (i.e., datagrams) over an IP network.
A packet is a chunk of data provided by the application program. UDP typically sends a single “application-level packet” as a single UDP packet. However, TCP may break a single application-level packet into multiple smaller TCP “segments”, each of which is treated as a separate “packet” at the TCP layer. The Nagle Algorithm (discussed below) does the opposite: It takes multiple small application packets and combines them into a single larger TCP segment.
Nagle TCP/IP Algorithm
The Nagle Algorithm was designed to avoid problems with small TCP segments (sometimes called “tinygrams”) on slow networks. The algorithm says that a TCP/IP connection can have only one outstanding tinygram that has not yet been acknowledged. The defined size of a tinygram depends upon the implementation. However, it is generally a size smaller than the size of typical TCP segments.
The Nagle Algorithm states that under some circumstances, there will be a waiting period of about 200 milliseconds (msec) before data is sent. The Nagle Algorithm uses the following parameters for traffic over a switch:
    Segment size = MTU or tcp_mssdflt or MTU path discovery value.
    TCP Window size = smaller of tcp_sendspace and tcp_recvspace values.
    Data size = application data buffer size.
The following are the specific rules used by the Nagle Algorithm in deciding when to send data:
    If a packet is equal to or larger than the segment size (or MTU), and the TCP window is not full, send an MTU size buffer immediately.
    If the interface is idle, or the TCP_NODELAY flag is set, and the TCP window is not full, send the buffer immediately.
    If there is less than half of the TCP window in outstanding data, send the buffer immediately.
    If sending less than a segment size buffer, and if more than half the window is outstanding, and TCP_NODELAY is not set, wait up to 200 msec for more data before sending the buffer.
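The four rules listed above can be sketched as a decision function. This is only an illustration of the rules as stated here, not the code of any actual TCP stack. The function name, parameter names, and the "unspecified" result (returned for combinations that the listed rules do not cover, such as a full window) are assumptions introduced for this sketch.

```python
def nagle_decision(data_size, segment_size, window_size, outstanding,
                   interface_idle, nodelay):
    """Apply the four listed rules in order; sizes are in bytes.

    Returns "send" (send immediately), "wait" (wait up to 200 msec for
    more data), or "unspecified" when none of the listed rules applies.
    """
    window_full = outstanding >= window_size

    # Rule 1: a full-sized segment goes out if the window has room.
    if data_size >= segment_size and not window_full:
        return "send"
    # Rule 2: idle interface or TCP_NODELAY, and the window has room.
    if (interface_idle or nodelay) and not window_full:
        return "send"
    # Rule 3: less than half of the window is outstanding.
    if outstanding < window_size / 2:
        return "send"
    # Rule 4: small buffer, more than half the window outstanding, Nagle active.
    if (data_size < segment_size and outstanding > window_size / 2
            and not nodelay):
        return "wait"
    return "unspecified"  # assumption: not covered by the listed rules


# Hypothetical case: a 10-byte write while 6000 of 8192 window bytes
# are still unacknowledged, with TCP_NODELAY not set.
decision = nagle_decision(10, 1460, 8192, 6000,
                          interface_idle=False, nodelay=False)
```

In the hypothetical case, the small write is held back, which is exactly the behavior that can distort the spacing of two back-to-back measurement packets.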
Setting TCP_NODELAY on the socket of the sending side deactivates the Nagle Algorithm. All data sent will go immediately, no matter what the data size.
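As an illustration, an application can set this option through the standard sockets interface. The sketch below uses Python's `socket` module; the helper name `make_nodelay_socket` is hypothetical, and the socket is created but never connected.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with the Nagle Algorithm deactivated."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A nonzero value for TCP_NODELAY turns off tinygram buffering.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back to confirm that it took effect.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

Note that the option affects only the side that sets it; an intermediate device on the path may still apply its own buffering.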
The Nagle Algorithm may be generically called the “tinygram-buffering” function because it buffers tinygrams.
TCP Slow Start Algorithm
On TCP networks that don't use “slow start,” a sender starts a connection by injecting multiple packets into the network, up to the window size advertised by the receiver. While this is acceptable when the two hosts are on the same LAN (local area network), problems may arise if there are routers and slower links between the sender and the receiver. Since some intermediate router is likely to queue the packets, it is possible that such a router will have insufficient memory to queue them. Therefore, this naive approach is likely to reduce the throughput of a TCP connection drastically.
The algorithm to avoid this is called “slow start.” It operates by observing that the rate at which new packets should be injected into the network is the rate at which the acknowledgments are returned by the other end.
The Slow Start Algorithm adds another window to the sender's TCP: a congestion window, called “cwnd”. When a new connection is established with a host on another network, the congestion window is initialized to one packet. Each time an acknowledgement (i.e., “ACK”) is received, the congestion window is increased by one packet. The sender can transmit up to the minimum of the “congestion window” and the “advertised window.” The “congestion window” is flow control imposed by the sender. The “advertised window” is flow control imposed by the receiver. The former is based on the sender's assessment of perceived network congestion. The latter is related to the amount of available buffer space at the receiver for this connection.
The sender starts by transmitting one packet and waiting for its ACK (acknowledgement). When that ACK is received, the congestion window is incremented from one to two. Now, two packets can be sent. When each of those two packets is acknowledged, the congestion window is increased to four. And so forth.
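The doubling described above can be traced with a small simulation. It is a simplified sketch that assumes no packet loss and that every packet sent in a round trip is acknowledged before the next round begins. The function name and the advertised window of 16 packets are hypothetical.

```python
def slow_start_rounds(advertised_window, rounds):
    """Return the number of packets sent in each round trip.

    Assumptions: no loss, and every packet of a round is ACKed
    before the next round starts.
    """
    cwnd = 1  # congestion window starts at one packet
    sent_per_round = []
    for _ in range(rounds):
        # The sender may transmit up to the smaller of the two windows.
        in_flight = min(cwnd, advertised_window)
        sent_per_round.append(in_flight)
        # Each ACK received increases the congestion window by one packet.
        cwnd += in_flight
    return sent_per_round


history = slow_start_rounds(advertised_window=16, rounds=6)
```

Under these assumptions the sender transmits 1, 2, 4, 8, and 16 packets in successive round trips and then stays at 16, the advertised window. The first two packets of a new connection are therefore not sent back-to-back, which matters for a packet-pair measurement.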
At some point, the capacity of the connection between the sender and receiver may be reached. At that point, some intermediate router will start discarding packets. This tells the sender that its congestion window has reached its limit.
Proxy
A proxy (i.e., proxy server) is a device that sits between a client application (such as a Web browser) and a real server. Generally, it intercepts all requests to and from the real server to see if it can fulfill the requests itself. If not, it forwards the request to the real server. A proxy is employed for two main purposes: Improve performance and filter requests.
Since the proxy server is often a central point of communication for a number of clients, it attempts to make its communications as efficient as possible. Thus, it typically implements a form of the Nagle Algorithm. Every new TCP connection starts with Slow Start. When there is a proxy between the client and the server, Slow Start is run in the two connections: server-proxy and proxy-client. Therefore, the proxy adds new complexity to the packet-pair experiment.