The protocols used at the various logical levels of a network partially determine the treatment that packets receive as they traverse the network. For example, at the network layer (Layer 3) of the ISO Open Systems Interconnection (OSI) model, a popular protocol is the Internet Protocol, commonly referred to as IP. IP enables interoperation between networks that use different protocols at the lower logical layers and has become hugely successful, as evidenced by the global Internet. IP-based networks were originally designed to provide “best-effort” service, in which all packets receive similar transport treatment, and to provide a connectionless architecture that relies on transport-layer protocols, such as the Transmission Control Protocol (TCP), to ensure that network resources are shared fairly among all packet flows.
Best-effort service makes no performance guarantees as to the transport time of packets (delay) or the throughput between hosts. Nor does IP guarantee that packets will arrive at their destinations: switches and routers are permitted to discard, or drop, packets if the packet queues servicing the communications links become full. Many popular applications, e.g., file transfer and e-mail, typically do not have strict requirements on transport delay and throughput. Such applications and their associated packet flows are commonly referred to as elastic applications and elastic flows, respectively, because they can tolerate variances in delay and throughput. Often, however, these applications cannot tolerate packet loss, so transport-layer protocols such as TCP are used to detect and retransmit dropped packets and to control the packet throughput such that congestion and packet-drop probabilities are reduced.
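The drop behavior described above can be illustrated with a minimal sketch of a finite packet queue that discards arrivals once it is full (a policy commonly called tail drop). The class name TailDropQueue, its methods, and the capacity of three packets are hypothetical and chosen only for illustration; real switches and routers may use other queueing and drop policies.

```python
from collections import deque


class TailDropQueue:
    """Finite FIFO packet queue that discards arrivals when full (tail drop)."""

    def __init__(self, capacity):
        self.capacity = capacity   # maximum number of queued packets (assumed value)
        self.packets = deque()
        self.dropped = 0           # number of packets discarded so far

    def enqueue(self, packet):
        """Queue the packet and return True, or drop it and return False if full."""
        if len(self.packets) >= self.capacity:
            self.dropped += 1
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Remove and return the oldest packet, or None if the queue is empty."""
        return self.packets.popleft() if self.packets else None


queue = TailDropQueue(capacity=3)
# Five packets arrive before the link transmits any of them.
accepted = [queue.enqueue(n) for n in range(5)]
print(accepted, queue.dropped)
```

In this sketch the last two arrivals are lost, which is the kind of loss that a protocol such as TCP would have to detect and repair by retransmission.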
The technological successes of IP networking, as well as compelling business and financial factors, have driven a migration of many other applications from non-IP networks to IP networks, a phenomenon called convergence. Many of these migrating applications, such as voice and video teleconferencing, are inelastic, i.e., they have strict requirements on the transport performance, or Quality-of-Service (QoS), provided by the network with respect to delay, packet loss probabilities, and throughput. These requirements must be met, but they may not be met because of the lack of guarantees provided by a best-effort service. Differentiated services are needed to augment best-effort service. In a converged environment, however, mixing elastic and inelastic applications on a shared IP infrastructure and providing differentiated services for different classes of traffic with different QoS performance needs has proven to be difficult, for several reasons.
One of the primary reasons is that conventional IP networking results in stochastic packet transport processes, and furthermore in a class of stochastic processes commonly referred to as “bursty.” Analysis methods for bursty processes are complex and not as well understood as those for some other common stochastic processes, such as the Poisson processes observed in some non-IP networks. The practical consequence of bursty processes is that it is difficult for designers and operators of IP networks to efficiently allocate networking resources to support differentiated services, e.g., to select packet buffer sizes in switches and routers that will meet delay and drop probability requirements, or to select links with sufficient capacities to handle anticipated traffic loads while maintaining required QoS performance metrics.
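The buffer-sizing difficulty can be illustrated with a small slotted-queue sketch. It is an illustration under stated assumptions, not an analysis method: the two arrival patterns are deterministic stand-ins for smooth and bursty traffic (real bursty traffic is stochastic), and the function simulate, the buffer size of 5 packets, and the service rate of one packet per slot are all hypothetical. Both patterns offer the same average load, equal to the link capacity.

```python
def simulate(arrivals, buffer_size, service_per_slot=1):
    """Run a slotted single-link queue; return (dropped, max_queue_length)."""
    queued = 0
    dropped = 0
    max_queued = 0
    for arriving in arrivals:
        # Accept as many arrivals as the buffer has room for; drop the rest.
        room = buffer_size - queued
        accepted = min(arriving, room)
        dropped += arriving - accepted
        queued += accepted
        max_queued = max(max_queued, queued)
        # The link transmits up to service_per_slot packets in this slot.
        queued -= min(queued, service_per_slot)
    return dropped, max_queued


slots = 100
# Smooth pattern: one packet in every slot.
smooth = [1] * slots
# Bursty pattern: ten packets in every tenth slot, none otherwise.
bursty = [10 if t % 10 == 0 else 0 for t in range(slots)]

smooth_result = simulate(smooth, buffer_size=5)
bursty_result = simulate(bursty, buffer_size=5)
print(smooth_result, bursty_result)
```

Although both patterns carry 100 packets over 100 slots, the smooth pattern never queues more than one packet, while the bursty pattern fills the buffer and loses half of its packets. A buffer sized for the average load is therefore inadequate once the traffic is bursty.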
A simple and often-used approach to overcoming these difficulties is to overprovision the links, switches, and routers, i.e., to purchase more-than-sufficient link capacity, and to purchase switches and routers with sufficient processing power, non-blocking architectures, and buffer resources to drive packets through the links at the line rate and with low packet drop probability. Often, Moore's Law makes switch/router overprovisioning economically feasible. In some cases, e.g., wired Local Area Networks (LANs), such as 1G or 10G Ethernet LANs, it is also economically feasible to overprovision communications links. For wide-area network (WAN) interconnection and access links, however, overprovisioning is often not economically feasible and possibly not available from interconnectivity providers. For example, the capacity of access links is often lower by several orders of magnitude than the capacities of the LANs and carrier/service provider networks that they interconnect. Similarly, wireless LANs (WLANs) and wireless WANs (WWANs) have capacity restrictions and are not readily overprovisioned. In the IP network context, access links and wireless links often behave as bottleneck links, which means that the instantaneous packet traffic load placed on them often exceeds their capacity. During such episodes, packets are queued, thereby incurring delay and loss and an associated reduction in the QoS provided to applications by the network.
When overprovisioning of network links is not feasible, one approach to providing sufficient QoS is to limit the traffic load placed on the link. Because of bursty traffic processes and the behavior of conventional IP networks, however, it is again difficult to enforce load limiting without adversely affecting QoS and without aggravating endpoint users. Furthermore, it is economically difficult to justify reserving large amounts of spare capacity. Accordingly, those skilled in the art often measure the efficiency of links as the ratio of the maximum allowable load to the link capacity, relative to some QoS performance metric, and commonly refer to this ratio as the density. Sometimes the density is expressed in terms of a target application, e.g., in a voice call networking environment such as that of an enterprise IP Private Branch Exchange (IP PBX), the call density measures the ratio of the maximum number of high-QoS calls supported by a link to the theoretical maximum obtained by dividing the link capacity by the required bandwidth of a call. In any case, ideally a density measure is 1, or 100%, but in practice density values of 0.2 to 0.5 are typical. Even with low density values, QoS guarantees are difficult to make because of the burstiness of the packet traffic processes.
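The call density defined above can be written out as a short calculation. The function name call_density and the numeric values (a 2 Mb/s link, 100 kb/s per call, and 8 calls supportable at the required QoS) are assumptions chosen only to produce a value inside the typical range quoted above; they are not measurements.

```python
def call_density(max_quality_calls, link_capacity_bps, call_bandwidth_bps):
    """Ratio of calls supportable at the required QoS to the theoretical maximum."""
    if link_capacity_bps <= 0 or call_bandwidth_bps <= 0:
        raise ValueError("capacity and per-call bandwidth must be positive")
    # Theoretical maximum: link capacity divided by the bandwidth of one call.
    theoretical_max_calls = link_capacity_bps / call_bandwidth_bps
    return max_quality_calls / theoretical_max_calls


# Assumed example values, not taken from any measurement.
density = call_density(
    max_quality_calls=8,
    link_capacity_bps=2_000_000,
    call_bandwidth_bps=100_000,
)
print(f"call density = {density:.2f}")
```

With these assumed values the theoretical maximum is 20 calls, so supporting 8 calls at the required QoS corresponds to a density of 0.4; the remaining 60% of the link capacity is held in reserve to absorb bursts.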
Thus, convergence to IP networks has created a need for differentiated transport services. Providing differentiated services while maintaining required QoS performance is difficult, primarily because of the burstiness of packet traffic processes. Overprovisioning, the often-used method for supporting differentiated services and QoS, results in poor efficiency and low densities, and still does not provide QoS guarantees.
A need therefore exists for methods and systems that structure and shape packet traffic such that aggregate traffic in a converged IP networking environment appears to the switching infrastructure as a non-bursty, near-deterministic process. The network could then behave as if it were switching homogeneous traffic, which is known both analytically and empirically to afford better densities and QoS performance than heterogeneous traffic. A further need exists for methods and systems for scheduling packet service times at bottleneck links that are simple and readily implemented in existing switches and routers, as well as in wireless links and wireless access networks, such as those used in 802.11 WLANs.