A service provider's network often serves several customers who use a variety of services delivered over individual transmission media. Often, the individual connections, or downlinks, between the customer premises and the service provider are combined into one or more uplinks. The uplinks form the physical connection to the service provider's network, over which traffic is transmitted to and received from the customers. Traffic congestion is a common challenge for service providers in networks that use uplinks. Congestion occurs when the rate and/or volume of the aggregate incoming customer traffic exceeds the egress capacity of the network equipment. Using rate limiting, a service provider can limit the aggregate bandwidth at the network ingress. By setting a maximum allowed traffic rate and/or volume entering a specific port, the service provider ensures that each customer has access to the agreed-upon bandwidth stated in its service level agreement (SLA).
The SLA executed between the service provider and the customer establishes the terms of the relationship between the two parties. The SLA describes the services to be provided and the manner in which those services will be delivered. Prior to provisioning a service, the service provider and the customer mutually define the data transmission rate for that service. The SLA typically defines data transmission parameters that govern the customer's transmission of data to the service provider, such as committed information rate (CIR), committed burst size (CBS), and excess burst size (EBS). If the customer transmits data according to the CIR, CBS, and EBS guidelines set forth in the SLA, the service provider will attempt to deliver the information according to its obligations. In a network with several customers, each with a different SLA, a service provider must ensure that it complies with the data transmission requirements of each agreement. Therefore, a service provider must have the ability to track the rate and volume of traffic entering and exiting its network at any given port in order to ensure that each customer receives no more than the agreed-upon bandwidth. Rate limiting is one approach used to enforce limits on bandwidth consumption. Traditional rate limiting provides a mechanism to determine whether a customer is conforming to the agreed-upon bandwidth consumption requirements and a process to determine what actions need to be taken if a customer violates those requirements.
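As an illustration of how CIR, CBS, and EBS can be used to judge conformance, the following is a minimal sketch of a color-blind single-rate meter in the spirit of the single rate three color marker described in RFC 2697. It is not taken from this document; the class name `SingleRateMeter`, the color labels, and the choice to pass timestamps explicitly are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SingleRateMeter:
    """Hypothetical color-blind meter driven by CIR, CBS, and EBS."""
    cir_bps: float   # committed information rate, bits per second
    cbs_bytes: int   # committed burst size, bytes
    ebs_bytes: int   # excess burst size, bytes

    def __post_init__(self):
        self.tc = float(self.cbs_bytes)  # committed bucket starts full
        self.te = float(self.ebs_bytes)  # excess bucket starts full
        self.last = 0.0                  # time of the previous update, seconds

    def _refill(self, now):
        # Tokens accrue at the CIR. They fill the committed bucket first;
        # whatever does not fit spills into the excess bucket.
        tokens = (now - self.last) * self.cir_bps / 8.0
        self.last = now
        add_c = min(tokens, self.cbs_bytes - self.tc)
        self.tc += add_c
        self.te = min(self.ebs_bytes, self.te + (tokens - add_c))

    def mark(self, size_bytes, now):
        """Return 'green' (conforming), 'yellow' (excess), or 'red' (violating)."""
        self._refill(now)
        if self.tc >= size_bytes:
            self.tc -= size_bytes
            return "green"
        if self.te >= size_bytes:
            self.te -= size_bytes
            return "yellow"
        return "red"
```

In this sketch, a "green" packet is within the committed profile, a "yellow" packet is within the excess burst allowance, and a "red" packet violates the profile. What action is taken for each color (forward, remark, or drop) is a policy decision that the sketch leaves open.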
A common solution for rate limiting uses a traffic bucket for a given port. A traffic bucket operates by placing the incoming network traffic in a queue. The queue delays the incoming traffic and releases it into the service provider's network at a fixed rate. Often, a bucket is assigned to a specific port at the ingress of the service provider's network and is used to monitor traffic at an aggregate level.
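The queue-based bucket described above can be sketched as follows. This is a simplified model under stated assumptions, not an implementation from this document: the class name `LeakyBucketQueue`, the byte-based capacity, and the tail-drop behavior when the queue is full are all hypothetical choices. Instead of storing packets, the model tracks the time at which the output becomes idle, which is enough to compute each packet's release time.

```python
class LeakyBucketQueue:
    """Hypothetical per-port bucket that releases traffic at a fixed rate."""

    def __init__(self, rate_bps, capacity_bytes):
        self.rate_bps = rate_bps              # fixed release rate, bits per second
        self.capacity_bytes = capacity_bytes  # maximum backlog the queue may hold
        self.next_free = 0.0                  # time at which the output is idle again

    def backlog_bytes(self, now):
        # Bytes still waiting to be released at time `now`.
        return max(0.0, self.next_free - now) * self.rate_bps / 8.0

    def enqueue(self, size_bytes, now):
        """Return the packet's release time, or None if it is dropped."""
        if self.backlog_bytes(now) + size_bytes > self.capacity_bytes:
            return None  # queue full: tail drop (an assumed policy)
        start = max(now, self.next_free)
        self.next_free = start + size_bytes * 8.0 / self.rate_bps
        return self.next_free
```

For example, with a release rate of 8000 bits per second and a capacity of 1500 bytes, three 500-byte packets arriving together are released at 0.5, 1.0, and 1.5 seconds, and a fourth packet arriving at the same instant is dropped.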
However, the types of Internet services available are expanding beyond traditional data services. Traditional data services are largely insensitive to delivery time. Real-time services such as Voice over IP (VoIP), IPTV, and gaming, by contrast, are extremely sensitive to delay and service interruptions. Network congestion can create interruptions to video conferencing and VoIP services that are very noticeable to the end user. Moreover, interruptions to real-time services can have a proportionately larger impact on the quality of the service as compared to traditional data services. The network equipment must be able to distinguish the type of information entering the network in order to deliver real-time services with a reasonable quality of service. Thus, real-time applications require network equipment capable of intelligent, application-aware rate limiting schemes that prioritize the delivery of specific classes of traffic.
It is well recognized by those skilled in the art that Layer 4 of the OSI protocol stack defines the transport layer. The transport layer serves as the primary communication mechanism between the actual customer application and the lower-level, hardware-centric layers. Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Real-time Transport Protocol (RTP, which typically runs on top of UDP) are examples of transport protocols that directly interface with applications such as File Transfer Protocol (FTP) clients, streaming media applications, and VoIP applications.
There are, however, fundamental operational differences between the various layer 4 protocols. UDP, for example, is a connectionless protocol. TCP, on the other hand, is a connection-oriented protocol, which is often regarded as more reliable than UDP. In the case of TCP, the receiver acknowledges the data it has received by sending acknowledgements back to the sender. These acknowledgement packets, referred to as TCP-ACK packets, create interesting traffic dynamics in the context of port-based rate limiting. In order to achieve the expected data transmission throughput for applications using TCP and other layer 4 protocols, a rate limiting scheme that is aware of the traffic patterns and the distribution of data and control packets is essential.
A service provider often faces the challenge of controlling the information rate received by the customer. For example, traffic exceeding the bandwidth agreed upon in the SLA between the service provider and the customer can be managed based on the TCP port number. This intelligent method of performing an action on traffic that exceeds an SLA is sometimes referred to as hierarchical rate limiting.
In other cases, hierarchical rate limiting is based on traffic priority. For example, a service provider may allow 2 Mbit per second of Priority3 traffic, 700 Mbit per second of Priority2 traffic, 1 Mbit per second of Priority1 traffic, and 512 Kbit per second of best effort traffic. It is well recognized by those skilled in the art that it is common for customers to expect that a given information stream has more best effort traffic than Priority3, Priority2, or Priority1 traffic.
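The per-priority allowances in the example above can be expressed as a simple lookup table with an independent check per class. The sketch below is illustrative only: the names `LIMITS_BPS` and `police_one_second` are hypothetical, the figures are copied as stated in the text, decimal units are assumed (1 Kbit = 1000 bits), and conformance is checked over a single one-second window rather than with a continuously refilled bucket.

```python
# Per-class allowances in bits per second, as stated in the example above.
# Decimal prefixes are an assumption made for this sketch.
LIMITS_BPS = {
    "priority3": 2_000_000,
    "priority2": 700_000_000,
    "priority1": 1_000_000,
    "best_effort": 512_000,
}


def police_one_second(packets, limits_bps=LIMITS_BPS):
    """Decide 'forward' or 'drop' for packets seen within one second.

    `packets` is an iterable of (traffic_class, size_bytes) tuples.
    Each class is checked only against its own allowance.
    """
    used_bits = {cls: 0 for cls in limits_bps}
    decisions = []
    for cls, size_bytes in packets:
        bits = size_bytes * 8
        if used_bits[cls] + bits <= limits_bps[cls]:
            used_bits[cls] += bits
            decisions.append("forward")
        else:
            decisions.append("drop")
    return decisions
```

Because each class is policed in isolation here, capacity left unused by one class is not available to any other class.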
Intelligent rate limiting schemes may help service providers ensure real-time applications are delivered without delay. However, some intelligent rate limiting schemes do not provide a mechanism for lower priority traffic classes to use available bandwidth capacity allocated to real-time services. In situations where a subscriber uses a mix of real-time and traditional data services, a rate limit hierarchy can be used to enable lower priority traffic to use unused bandwidth allocated to real-time functions when real-time traffic is not flowing. It would be desirable to provide a method and apparatus that adds intelligence to a service provider's network by rate limiting the ingress ports using a hierarchy of rate buckets to apply a common rate limit to several classes of service, thus enabling them to share available bandwidth in order to achieve the final information rate expected by the customer.
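One possible way to arrange such a hierarchy is sketched below. It is an assumption-laden illustration, not the method or apparatus of this document: the class name `HierarchicalLimiter`, the spill-over rule, and the result labels are hypothetical. Each class has its own token bucket, and tokens that a class does not use overflow into a shared bucket from which any class may borrow.

```python
class HierarchicalLimiter:
    """Hypothetical two-level bucket hierarchy with a shared spill-over bucket."""

    def __init__(self, class_rates_bps, class_burst_bytes, shared_burst_bytes):
        self.rates = dict(class_rates_bps)       # per-class rate, bits per second
        self.burst = class_burst_bytes           # depth of each class bucket
        self.shared_burst = shared_burst_bytes   # depth of the shared bucket
        self.tokens = {c: float(class_burst_bytes) for c in self.rates}
        self.shared = 0.0                        # shared bucket starts empty
        self.last = 0.0                          # time of the previous update

    def _refill(self, now):
        elapsed = now - self.last
        self.last = now
        for cls, rate in self.rates.items():
            filled = self.tokens[cls] + elapsed * rate / 8.0
            # Tokens a class cannot hold spill into the shared bucket.
            overflow = max(0.0, filled - self.burst)
            self.tokens[cls] = min(filled, self.burst)
            self.shared = min(self.shared_burst, self.shared + overflow)

    def admit(self, cls, size_bytes, now):
        """Return 'own', 'borrowed', or 'drop' for one packet."""
        self._refill(now)
        if self.tokens[cls] >= size_bytes:
            self.tokens[cls] -= size_bytes
            return "own"
        if self.shared >= size_bytes:
            self.shared -= size_bytes
            return "borrowed"
        return "drop"
```

In this arrangement, borrowing never draws from another class's own bucket, so a real-time class keeps its full allowance the moment its traffic resumes, while data traffic can consume whatever the real-time class left unused in the meantime.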