A computer or communication network administrator is a person who manages a computer or communication network through which information flows. Companies typically hire network administrators to oversee the operations of the networks or parts of networks that the companies own. Key functions performed by these network administrators include: 1) routing the flow of electronic information (which may be referred to as "traffic") between two points; 2) determining when more network links (electronic connections) are needed between two points; 3) determining when a higher capacity processor is needed at a particular location in the system; and 4) determining when there is more equipment serving the network than is required.
In performing these functions, a network administrator continually tries to measure the demand that users of the network place on network resources. For example, the network administrator may attempt to identify those parts of the network that are experiencing more demand than can be efficiently supported. Traditionally, a network administrator estimates the demand for the resources of the network simply by correlating demand to the traffic observed at those resources. This approach, however, can provide a highly inaccurate picture of the actual demand, as discussed more fully below.
The problem recognized by the present invention is that present measures of actual traffic carried by a network resource may not truly indicate how large a demand is placed on that resource. This may result, in part, from the functioning of the "transport protocol" typically used by a network to manage electronic traffic. An example of a transport protocol is Transmission Control Protocol (TCP) (developed around 1980), which is used to manage much of the traffic on the Internet.
Traffic
A primary example of traffic within a network is the transfer of files among users. During a file transfer, the users want a reliable stream delivery service. A reliable stream delivery service sends a single and complete copy of the file in the proper order to the transfer destination. Transport protocols govern how users send files across a single network or across many connected networks.
To accomplish reliable service, a transmitter following the protocol breaks the file into smaller packets of information. This transmitter arranges these packets in the proper order and begins sending them to their electronic destination in a "stream." Each information packet includes some overhead; part of this overhead provides routing information, sequencing information, and a cyclic redundancy check to recognize corrupted packets. The receiver reassembles the packets to form a complete copy of the file being transferred. To serve their purpose, the packets are small enough to make it unlikely they will be corrupted during transmission. If the packets do become corrupted, their small size also makes it convenient to re-transmit the missing information without the need for re-transmitting the entire file.
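By way of illustration only, the segmentation and reassembly described above can be sketched as follows. The packet layout (a dictionary with `seq`, `crc`, and `payload` fields), the four-byte payload size, and the function names `packetize` and `reassemble` are assumptions made for this sketch and are not taken from any particular protocol; routing information is omitted.

```python
import zlib

def packetize(data, payload_size=4):
    """Split data into ordered packets, each with a sequence number and a CRC."""
    packets = []
    for seq, start in enumerate(range(0, len(data), payload_size)):
        payload = data[start:start + payload_size]
        # Overhead: sequencing information plus a cyclic redundancy check.
        packets.append({"seq": seq, "crc": zlib.crc32(payload), "payload": payload})
    return packets

def reassemble(packets):
    """Rebuild the file from packets that may have arrived out of order."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)

file_bytes = b"reliable stream delivery"
packets = packetize(file_bytes)
# Simulate out-of-order arrival by reversing the stream.
rebuilt = reassemble(list(reversed(packets)))
```

In this sketch the sequence number alone is sufficient to restore the original order at the receiver, which is why the packets may take different routes or arrive out of order without harming the transfer.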
As the packets are received, the receiver checks on their accuracy with the help of the cyclic redundancy check. If an arriving packet is accurate, its information is used, and if the packet contains data, the receiver eventually returns an acknowledgment to the transmitter. By monitoring acknowledgments from the receiver, the transmitter can infer whether the data packets being sent are arriving completely and accurately, so that the transmitter can decide whether to continue sending new data packets or to retransmit old ones.
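The check-and-acknowledge exchange described above might be sketched as follows. This is a simplified illustration under stated assumptions: each packet is acknowledged individually, and the transmitter compares sent and acknowledged sequence numbers directly, whereas an actual transport protocol would typically rely on cumulative acknowledgments and timers. The names `make_packet`, `receive`, and `to_retransmit` are hypothetical.

```python
import zlib

def make_packet(seq, payload):
    """Build a packet carrying a sequence number and a CRC over its payload."""
    return {"seq": seq, "crc": zlib.crc32(payload), "payload": payload}

def receive(packet):
    """Return the sequence number to acknowledge, or None if the CRC check fails."""
    if zlib.crc32(packet["payload"]) != packet["crc"]:
        return None  # corrupted packet: no acknowledgment is returned
    return packet["seq"]

def to_retransmit(sent_seqs, acked_seqs):
    """Infer which packets must be sent again from the missing acknowledgments."""
    return sorted(set(sent_seqs) - set(acked_seqs))

sent = [make_packet(i, f"chunk-{i}".encode()) for i in range(4)]

# Simulate corruption of packet 2 while in transit.
in_transit = [dict(p) for p in sent]
in_transit[2]["payload"] = b"chunk-X"

acks = [a for a in (receive(p) for p in in_transit) if a is not None]
missing = to_retransmit([p["seq"] for p in sent], acks)
```

Here the transmitter never learns directly that packet 2 was corrupted; it infers the problem only from the absence of the corresponding acknowledgment.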
Congestion
Congestion of a network resource occurs when many users simultaneously try to send files across the network. Typically, a network owner installs enough equipment to carry the traffic the network is expected to receive, but expectations are not 100% accurate. Thus, there are certain times when a network is under-utilized and not carrying much traffic, and there are other times when there is more demand for the network resources than the network is designed to carry.
In the case of a network with insufficient capacity to serve its customers well (i.e., more demand than the network is designed to carry), the network administrator should respond by increasing the capacity of the network. For example, he or she may purchase more or better equipment. However, because of cost, a network administrator is unlikely to invest the funds necessary to upgrade a network unless the administrator is relatively certain of the need for doing so.
As mentioned, the traditional measure of the demand on a network, upon which an administrator may rely in making an upgrade decision, is simply the amount of traffic arriving at the network resources. If the amount of traffic in the oversubscribed network does not exceed expectations, an administrator following the traditional method would have no reason to invest in a network with greater capacity. Using the volume of arriving traffic as the estimate of demand on a network resource is problematic, however, because some transport protocols are configured to sharply limit the number of packets being sent and received whenever there is the slightest hint of a problem. When any disruption of the traffic occurs (e.g., an information packet is lost or inaccurately transmitted), these transport protocols respond by drastically reducing the volume of the traffic being sent. Consequently, when the traffic carried by the resource is measured at such a time, the network administrator will see reduced traffic even though many users may be trying to use the resource (i.e., even though demand is high). A simple measure of traffic at these times therefore does not reflect actual demand.
The present invention addresses this problem, which by its nature has gone unrecognized. The name given to the problem by the inventor is "camouflaged congestion." As noted above, transport protocols, especially TCP, respond to any problems they encounter in transmitting information by reducing the number of packets being sent (i.e., by reducing the traffic). A resource may therefore experience both excessive demand and a deceptively low traffic flow, so that the congestion is "camouflaged": the network administrator, who is looking only at the volume of arriving traffic, will be unaware of the need to increase the capacity of the network.
As noted, a situation during which traffic disruption may occur is when a large number of users are trying to send data over a single link between different parts of a network. The many information packets that are generated for each of the files being transferred can't all be sent simultaneously. As a result, the packets are lined up in a queue to be sent as soon as the resource can accommodate them. If too many packets are waiting to be sent at the same time, the memory buffer of the network resource cannot contain them and the queue overflows. Packets being lined up when the queue overflows may be lost completely.
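The overflow behavior described above can be illustrated with a small sketch of a bounded queue that discards arrivals once its buffer is full. The buffer capacity of five packets and the helper name `offer` are assumptions chosen for the example; an actual network resource may use a different discard policy.

```python
from collections import deque

def offer(queue, capacity, packet):
    """Enqueue a packet if the buffer has room; otherwise the packet is lost."""
    if len(queue) >= capacity:
        return False  # queue overflow: the arriving packet is dropped
    queue.append(packet)
    return True

queue = deque()
capacity = 5  # assumed size of the resource's memory buffer, in packets

# Eight packets arrive before the resource can send any of them.
dropped = [p for p in range(8) if not offer(queue, capacity, p)]
```

In this sketch the first five packets wait in the queue, while the three packets that arrive after the buffer is full are lost completely.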
The problem may be made worse in a network that internally handles only smaller packets, called cells, and that segments and reassembles users' larger packets as they enter and leave the network. Specifically, if a queue for an internal resource overflows in such a network, it is natural to drop arriving traffic on a cell-by-cell basis. In such a case, the resource will often be serving cells that can no longer be reassembled into whole packets. Even more serious than the obvious waste of the congested resource's time on useless cells are the traffic-reducing reactions of the users' protocols to damaged packets. For an individual packet, the reaction to damage is the same as the reaction to loss. However, because the dropped cells may be dispersed among many packets, cell-by-cell dropping damages more packets, and the cumulative reaction of the users' protocols is correspondingly more severe.
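The following sketch illustrates why dispersed cell drops damage more packets than the same number of drops confined to one packet. The figures (four packets of four cells each, four consecutive arriving cells dropped) and the two arrival orders are assumptions chosen to make the contrast visible; real cell streams fall somewhere between these extremes.

```python
def damaged_packets(cell_stream, dropped_positions):
    """Return the ids of packets that lose at least one cell."""
    return {cell_stream[i][0] for i in dropped_positions}

cells_per_packet = 4
num_packets = 4

# Interleaved arrival: cell k of every packet arrives before cell k+1 of any packet.
interleaved = [(pkt, cell) for cell in range(cells_per_packet)
               for pkt in range(num_packets)]
# Back-to-back arrival: all cells of one packet arrive together.
back_to_back = [(pkt, cell) for pkt in range(num_packets)
                for cell in range(cells_per_packet)]

overflow = range(4, 8)  # the same four consecutive arrivals are dropped in both cases

hurt_interleaved = damaged_packets(interleaved, overflow)
hurt_back_to_back = damaged_packets(back_to_back, overflow)

# Cells still carried by the resource that can no longer form a whole packet.
useless_cells = sum(1 for i, (pkt, _) in enumerate(interleaved)
                    if i not in overflow and pkt in hurt_interleaved)
```

Under these assumptions, the same four dropped cells damage all four packets when the cells are interleaved but only one packet when they arrive back to back, and in the interleaved case the twelve surviving cells are carried to no purpose.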
Overflows
To help avoid continuing overflows, TCP and similar transport protocols were designed to detect network congestion when packets are lost or corrupted during transmission. As noted above, in responding to the congestion, the transport protocols reduce the traffic being sent: TCP maintains a window which limits the number of information packets the TCP source allows itself to have outstanding.
For example, suppose the user of a TCP connection wants to transfer a large file of a thousand packets. At the point in the connection when the file transfer begins, the TCP source may have a window size of ten packets. This means TCP will send packets 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. TCP will then wait for an acknowledgment from the receiver that at least packet 1 has arrived intact before sending packet 11. If the receiver acknowledges having received packets 1 and 2, TCP will send packets 11 and 12. In this way, TCP never has more than 10 packets in transmission simultaneously.
Now suppose the packets encounter congestion during transmission and become lost or corrupted. Perhaps after TCP sends packets 11 and 12, an unusually long time passes and no acknowledgment of packet 3 is received. TCP will eventually assume that packet 3 was lost in transmission. TCP will then retransmit packet 3, but it will also reduce the size of its window; that is, TCP will reduce the number of packets that can be in transmission at any one time. TCP is typically configured to cut its window size in half when adjusting for congestion. Thus, whereas before it allowed 10 packets to be in transmission simultaneously, TCP will now allow only 5 packets to be in transmission at any one time. This reduces the amount of traffic the network is being asked to carry.
Conversely, to compensate for such window reductions, TCP is also configured to increase the size of the window if no congestion is being encountered. However, when the window size is increased, it is not doubled, but merely incremented by one. For example, if congestion had caused TCP to decrease the window size to five packets in transmission, then, after receiving acknowledgments that a window's worth (5) of consecutive packets were delivered successfully, TCP will increase the size of the packet transmission window to six. This procedure repeats, allowing the window size to continue to grow over time as long as no problems are encountered in transmitting information packets.
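The window behavior in this example can be sketched as follows. The class `WindowedSender` and its method names are hypothetical and model only the bookkeeping of outstanding packets, not an actual TCP implementation; acknowledgments are assumed to be cumulative.

```python
class WindowedSender:
    """Tracks which packets a fixed-size window allows to be outstanding."""

    def __init__(self, total_packets, window):
        self.total = total_packets
        self.window = window
        self.next_to_send = 1     # packets are numbered from 1
        self.oldest_unacked = 1

    def send_allowed(self):
        """Send every packet the window currently permits; return their numbers."""
        sent = []
        while (self.next_to_send <= self.total
               and self.next_to_send - self.oldest_unacked < self.window):
            sent.append(self.next_to_send)
            self.next_to_send += 1
        return sent

    def acknowledge_through(self, n):
        """Record that packets up to and including n arrived intact."""
        self.oldest_unacked = max(self.oldest_unacked, n + 1)

sender = WindowedSender(total_packets=1000, window=10)
first_burst = sender.send_allowed()    # packets 1 through 10
sender.acknowledge_through(2)          # receiver acknowledges packets 1 and 2
second_burst = sender.send_allowed()   # packets 11 and 12
```

The sketch reproduces the figures in the text: ten packets go out initially, and the acknowledgment of packets 1 and 2 frees room for exactly two more.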
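The asymmetry between the halving of the window on loss and its growth by one packet at a time can be sketched as follows. The function names `on_loss` and `on_ack` are hypothetical, and the sketch covers only the two rules stated in the text; actual TCP implementations include further phases and refinements that are not modeled here.

```python
def on_loss(window):
    """Halve the window when a packet is presumed lost (never below one packet)."""
    return max(1, window // 2)

def on_ack(window, acks_in_row):
    """Grow the window by one after a full window's worth of consecutive acks.

    Returns the new window and the updated count of consecutive acks.
    """
    acks_in_row += 1
    if acks_in_row >= window:
        return window + 1, 0
    return window, acks_in_row

# A loss cuts a window of 10 down to 5; five clean acks then raise it to 6.
window, streak = on_loss(10), 0
for _ in range(5):
    window, streak = on_ack(window, streak)

# Count the consecutive acknowledgments needed to climb from 5 back to 10.
w, s, acks_to_recover = on_loss(10), 0, 0
while w < 10:
    w, s = on_ack(w, s)
    acks_to_recover += 1
```

Under these rules a single loss removes five packets from the window at once, whereas regaining them takes 5 + 6 + 7 + 8 + 9 = 35 consecutive successful acknowledgments.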
In such a case, the traffic pattern may begin to "cycle." As packet windows increase, the amount of traffic will slowly increase until the resource reaches capacity. For a brief period thereafter, the resource will operate at its capacity, but not for long, because as packet windows continue to increase, the extra packets outstanding have the effect merely of filling the queue. At that point, the queue overflows, packets are lost or corrupted, and the transport protocol reduces the window size and, consequently, the traffic volume. With the traffic reduced, the windows will begin increasing again until the network resource queue overflows and the transport protocol reduces window size again. Thus, a cycle is exhibited.
Because transport protocols drastically reduce their window size in response to the slightest indication of congestion, and only increase the diminished window size through small, periodic increments, there may be many file transfers that users are attempting to make (i.e., a heavy demand for the network), but the network may appear uncongested because the transport protocols have reduced the traffic allowed. The network resource thus operates below capacity a majority of the time while the window sizes are growing, since a resource will be carrying traffic at its capacity for only a short time before congestion is indicated to the transport protocol and the window size is reduced. If the administrator monitors only the volume of traffic through the resource, it will appear that the resources are operating below capacity a majority of the time. This will almost certainly be misinterpreted by the network administrator to mean that the network is adequately meeting the demands placed on it.
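The cycle and its effect on measured traffic can be illustrated with a deliberately crude simulation. It treats all connections sharing the resource as one aggregate window that grows by a fixed step each round and is halved whenever it exceeds the resource's capacity plus its queue. The function `simulate` and all parameter values (capacity of 100 packets per round, queue of 10, step of 5) are assumptions made for this sketch; real connections are not synchronized in this way.

```python
def simulate(capacity, queue_size, step, rounds, start):
    """Return the traffic carried by the resource in each round."""
    window = start
    carried = []
    for _ in range(rounds):
        # The resource can never carry more than its capacity in a round.
        carried.append(min(window, capacity))
        if window > capacity + queue_size:
            window //= 2      # queue overflow: protocols halve their windows
        else:
            window += step    # no loss seen: windows creep upward
    return carried

carried = simulate(capacity=100, queue_size=10, step=5, rounds=200, start=55)

rounds_below_capacity = sum(1 for c in carried if c < 100)
fraction_below = rounds_below_capacity / len(carried)
mean_utilization = sum(carried) / (100 * len(carried))
```

With these assumed parameters the resource carries less than its capacity in more than half of the rounds, even though the simulated demand is sufficient to overflow the queue in every cycle. An administrator looking only at `carried` would see a resource that usually has capacity to spare.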
Other Consequences of Camouflaged Congestion
Another aspect of camouflaged congestion is user frustration. When the network becomes congested and the transport protocol starts cutting information packet window size, a network user may wait an unreasonable amount of time to complete a file transfer. This typically leads to the user becoming frustrated and abandoning his or her attempts to use the network. Frustrated users represent demand for the network that cannot be measured by looking at the amount of traffic the network is carrying. Again, by simply measuring the past traffic volume through a network resource, a system administrator does not get an accurate picture of how much traffic there might have been if the network owner had actually supplied resources with greater capacity.
For example, a user may be browsing the World Wide Web. A Web user typically clicks on a button to request a home page, which is basically a rather large file, from another location. The requested page gets transferred from where the page resides to the screen of the user. The information flow required to transmit the page is governed by TCP.
If the Web user were located in Philadelphia and requested the home page of the White House located in Washington, the White House home page would be transferred across the network to the requesting user. If that transfer were to encounter congestion, the transport protocol could reduce the speed of the transfer. If this were to occur, the user could wait several minutes instead of seconds to find out what the White House home page was saying that day.
Experiencing such a delay in response to each request, a user may get frustrated and stop using the network. This is thus an effect of camouflaged congestion. The network administrator looking at the network traffic report would see one screen transferred from the White House to the user and may conclude that there was no other user demand. In reality, if the response time had been shorter, allowing the user to continue without undue delays, he or she would likely have continued to use the network for an indefinite period by accessing additional screens and information. Therefore, if the network administrator equates traffic volume with demand, he or she will be seriously misled.
Accordingly, the present invention is intended to address these problems and provide the network administrator with the best possible information about the actual demand for the network.