The present invention relates to a bandwidth allocation method and apparatus and, more particularly, but not exclusively, to such allocation in cases of oversubscription, for bottleneck management.
Network service providers typically provide network access to a large number of service subscribers. Each subscriber expects to receive a minimal bandwidth allocation for purposes such as Internet browsing, file downloads and uploads, or interactive communications such as voice or video calls. On the other hand, network service providers intentionally allocate or provision a limited bandwidth resource to serve all subscribers. In many situations, this bandwidth resource is less than the maximal aggregate bandwidth that all subscribers may theoretically consume simultaneously. This conduct is known in the art as oversubscription.
Oversubscription is based on the statistical knowledge that at any given time only a subset of subscribers actually consume network bandwidth to the extent they are entitled to. As a result, when all or most of the subscribers conform to the statistical model, they typically perceive the bandwidth allocated to them as sufficient to deliver the bandwidth they are entitled to. However, sometimes, and more often in recent years, a small number of subscribers consume all the bandwidth they are entitled to over extended periods of time. Typically, such peak consumption is due to the use of peer-to-peer file sharing applications such as BitTorrent or eMule. The extensive consumption of bandwidth by a subset of subscribers violates the statistical model which is the foundation of oversubscription and degrades the service that service providers can offer to the majority of their subscribers, who still conform to the statistical model of usage. As a result, network service providers require means for limiting the excessive consumption of bandwidth by any given subscriber.
Until recently, the common measures used for limiting overconsumption of bandwidth by subscribers were based on classifying network traffic consumed by subscribers, identifying traffic related to applications known for their heavy bandwidth consumption, such as the above-mentioned peer-to-peer file sharing applications, and limiting the bandwidth that such applications are allowed to consume. In addition, service providers are known to have administratively limited overconsumption of bandwidth by terminating or threatening to terminate service contracts with such subscribers.
Following federal rulings in the USA, the practice of traffic classification as the basis for overconsumption control has become illegal. As a result, new means for management are required that are agnostic to the applications consuming the bandwidth. Such means are required to limit overconsumption of bandwidth by ensuring fair use of bandwidth by subscribers. There are various definitions of fair use, but all of them attempt to define fair bandwidth usage by subscribers, namely a subscriber's entitlement to consume bandwidth in a way and to a limit that does not impair the ability of other subscribers to have fair access to bandwidth.
The typical goal of a fair access policy enforcement mechanism is to ensure that each subscriber has the ability to consume a minimal transmission rate while the remaining bandwidth is fairly divided between all subscribers actively transmitting traffic into the access network, possibly taking into account the maximal rate each subscriber is entitled to. Such a mechanism may drop, forward or mark traffic as associated with a given traffic priority. The act of dropping or marking traffic as a means for managing the bandwidth consumed by a subscriber is known in the art as traffic policing. It is customary in the art to implement traffic policing by using token bucket rate limiters for estimating the actual rate of traffic flowing and enforcing a maximal rate.
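The goal stated above can be illustrated with a minimal sketch that first grants each active subscriber a minimal rate and then divides the remaining capacity evenly, capped by each subscriber's maximal rate. The function name `fair_allocation`, its parameters and the progressive filling approach are assumptions made for illustration only; the text does not prescribe a particular algorithm.

```python
def fair_allocation(capacity, min_rate, max_rates):
    """Hypothetical sketch: give each active subscriber min_rate, then share
    the remaining capacity evenly, never exceeding a subscriber's max rate.

    capacity  -- total link rate (e.g. in Mbit/s)
    min_rate  -- minimal rate guaranteed to every active subscriber
    max_rates -- dict mapping subscriber id to its maximal entitled rate
    """
    n = len(max_rates)
    if n == 0:
        return {}
    if min_rate * n > capacity:
        raise ValueError("capacity cannot cover the minimal rates")

    alloc = {sub: min(min_rate, cap) for sub, cap in max_rates.items()}
    remaining = capacity - sum(alloc.values())
    unsatisfied = {sub for sub in max_rates if alloc[sub] < max_rates[sub]}

    # Progressive filling: repeatedly offer an equal share to every
    # subscriber still below its maximum; a subscriber that reaches its
    # maximum leaves its unused share in the pool for the next round.
    while unsatisfied and remaining > 1e-9:
        share = remaining / len(unsatisfied)
        for sub in list(unsatisfied):
            grant = min(share, max_rates[sub] - alloc[sub])
            alloc[sub] += grant
            remaining -= grant
            if max_rates[sub] - alloc[sub] <= 1e-9:
                unsatisfied.discard(sub)
    return alloc
```

For example, with a capacity of 100, a minimal rate of 10 and maximal rates of 20, 100 and 100, the first subscriber is capped at 20 and the other two receive 40 each.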
As is known in the art, token bucket rate limiters may store tokens representing data to be transmitted, one token per unit of data. Whenever a unit of data, such as a byte, is transmitted, a token is “cashed in”. The rate limiter has a maximum bucket size (which corresponds to a maximum burst size), and the number of tokens available at a given time corresponds to the number of data units that may currently be transmitted. If there are no tokens in the bucket, no packets may be transmitted, and an arriving packet may either be dropped or marked as eligible for dropping. Tokens are replenished based on the time that has passed since the last transmission of the rate limiter and the average rate at which it is allowed to transmit. Typically, the number of tokens added to the token bucket is the minimum of the maximum bucket size and the product of the time elapsed since the last replenishment and the allowed limiter rate, expressed in data units per time unit.
An algorithm for a token bucket system may be conceptually understood as follows:
A token is added to the bucket every 1/r seconds.
The bucket can hold at most b tokens. If a token arrives when the bucket is full, it is discarded.
When a packet (network layer PDU) of n bytes arrives, n tokens are removed from the bucket, and the packet is sent to the network.
If fewer than n tokens are available, no tokens are removed from the bucket, and the packet is considered to be non-conforming.
The algorithm allows bursts of up to b bytes, but over the long run the output of conformant packets is limited to the constant rate, r. Non-conforming packets can be treated in various ways:
They may be dropped.
They may be enqueued for subsequent transmission when sufficient tokens have accumulated in the bucket.
They may be transmitted, but marked as being non-conforming, possibly to be dropped subsequently if the network is overloaded.
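The conceptual algorithm above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name `TokenBucket`, the explicit `now` argument (used instead of a real clock so the behavior is reproducible) and the choice to start with a full bucket are all assumptions. Tokens are added lazily on each packet arrival, which is equivalent to adding one token every 1/r seconds.

```python
class TokenBucket:
    """Hypothetical sketch of the token bucket described above.

    rate  -- r, tokens (bytes) added per second
    burst -- b, maximum number of tokens the bucket can hold
    """

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = burst      # assumption: the bucket starts full
        self.last = now          # time of the last replenishment

    def _replenish(self, now):
        # Adding one token every 1/r seconds is equivalent to adding
        # elapsed * r tokens lazily; tokens beyond b are discarded.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def conforms(self, n, now):
        """Return True and remove n tokens if a packet of n bytes conforms;
        otherwise leave the bucket untouched and return False."""
        self._replenish(now)
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


bucket = TokenBucket(rate=1000, burst=1500)
print(bucket.conforms(1500, now=0.0))  # full burst is allowed
print(bucket.conforms(1000, now=0.5))  # only 500 tokens have accumulated
print(bucket.conforms(1000, now=1.0))  # 1000 tokens are now available
```

The caller decides what to do with a non-conforming packet (drop it, queue it, or mark it), which keeps the three treatments listed above outside the bucket itself.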
Token buckets may be incorporated into equipment provided, say, at the user premises, or for a link as a whole.
It is customary in the art to prioritize different types of traffic, typically associated with different services such as voice, video or Internet access provided by a network service provider. In case of congestion in a network, high priority traffic such as voice is typically allowed to consume bandwidth before lower priority traffic such as Internet traffic may consume it. Thus, in a case of insufficient bandwidth, low priority traffic may be partially or entirely dropped while high priority traffic is forwarded. The above notion is known as traffic prioritization. Even a network that enforces fair use of bandwidth must still take traffic prioritization into account.
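As a rough illustration of traffic prioritization, the sketch below serves traffic classes in strict priority order until the link capacity is exhausted. The function name `strict_priority`, the tuple layout and the convention that a lower number means a higher priority are assumptions of this sketch, not part of the text.

```python
def strict_priority(capacity, demands):
    """Hypothetical sketch: serve traffic classes in priority order.

    capacity -- available link rate
    demands  -- list of (name, priority, offered_rate); a lower priority
                number means higher priority (an assumption of this sketch)
    Returns a dict mapping class name to the rate actually forwarded.
    """
    forwarded = {}
    remaining = capacity
    for name, _prio, offered in sorted(demands, key=lambda d: d[1]):
        granted = min(offered, remaining)  # whatever is left may be partial or zero
        forwarded[name] = granted
        remaining -= granted
    return forwarded
```

With a capacity of 10 and offered rates of 2 (voice), 5 (video) and 8 (Internet), voice and video are forwarded in full while Internet traffic is limited to 3.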
As is known in the art, pairs of token buckets may be combined to form what is known as dual token bucket rate limiters (or shapers) in order to mark packets with priority markings. For example, given one token bucket B1 which rate limits to a rate R1 and another token bucket B2 which rate limits to a rate R2, where R1<R2, a packet may be assigned a high priority if it arrives at B1 when B1 holds enough tokens to allow its transmission. If B1 does not hold enough tokens to allow transmission but B2 does, the packet may be assigned a low priority. If neither B1 nor B2 holds enough tokens to allow transmission, the packet may be dropped or marked as eligible for dropping. Hence, each packet traversing a dual token bucket may be marked with one of three markings: High, Low or Eligible to Drop. Such a mechanism is also known in the art as a tri-color marker.
Mechanisms for traffic prioritization may make use of tri-color markers to enable forwarding of assured rates while dropping traffic exceeding an assured rate in the case of congestion.
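The dual token bucket marking described above can be sketched as follows. The names `Bucket` and `mark`, the explicit `now` argument and the marking strings are hypothetical. The sketch also assumes that a High packet draws tokens from B1 only and a Low packet from B2 only; the text does not say how tokens are charged, and other readings are possible.

```python
class Bucket:
    """Minimal token bucket used by the marker (illustrative only)."""

    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = burst, 0.0  # assumption: starts full at t=0

    def take(self, n, now):
        # Replenish lazily, capped at the bucket size, then try to spend n.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


def mark(packet_len, now, b1, b2):
    """Return one of three markings for a packet of packet_len bytes."""
    if b1.take(packet_len, now):   # B1 (rate R1) has enough tokens
        return "HIGH"
    if b2.take(packet_len, now):   # B1 is short, but B2 (rate R2) is not
        return "LOW"
    return "DROP_ELIGIBLE"         # neither bucket holds enough tokens


b1 = Bucket(rate=100, burst=100)   # R1 < R2, as in the text
b2 = Bucket(rate=200, burst=200)
print([mark(100, 0.0, b1, b2) for _ in range(4)])
```

In this run the first packet is marked High, the next two Low, and the fourth Eligible to Drop, because both buckets are empty by then.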
Fair share use may be enforced at various points in service provider networks. Typically, one enforces fair use on traffic before it reaches a congestion point in the network. However, in some cases, it is problematic to position such enforcement mechanisms before congestion points (upstream of congestion points) due to cost or manageability issues. For instance, in access networks where subscribers are connected via a wireless medium to a service provider fixed network, the wireless medium may be a congestion bottleneck, but enforcing fair share at every subscriber's premises may be infeasible due to an inability to communicate with other subscribers in order to detect or calculate the fair share.
Most Internet traffic, including all TCP traffic, performs some form of congestion control. The goal of this congestion control is to maximize utilization of the possibly limited bandwidth resources while reacting to congestion by reducing bandwidth consumption. One way of reducing consumption is, for example, to reduce the TCP window size to ensure fair access to these resources. Such protocols are known to probe the resources for available bandwidth by increasing their consumption gradually until packet loss is detected. For instance, TCP traffic does this by increasing its window size. When packet loss is detected, such protocols typically reduce bandwidth consumption.
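The probing behavior described above can be illustrated with a simplified additive-increase, multiplicative-decrease loop. This is a sketch under stated assumptions rather than a model of any real TCP implementation: the function name `aimd`, its parameters and the rule that loss occurs whenever the window exceeds a fixed capacity are all hypothetical, and mechanisms such as slow start and retransmission timers are omitted.

```python
def aimd(capacity, rounds, increase=1.0, decrease=0.5, start=1.0):
    """Hypothetical sketch of a sender probing for bandwidth.

    capacity -- window size beyond which packet loss is assumed to occur
    rounds   -- number of round trips to simulate
    Returns the window size used in each round.
    """
    window = start
    history = []
    for _ in range(rounds):
        history.append(window)
        if window > capacity:
            # Packet loss detected: cut consumption multiplicatively.
            window = max(1.0, window * decrease)
        else:
            # No loss: probe for more bandwidth by growing gradually.
            window += increase
    return history


print(aimd(capacity=4, rounds=8))
```

The resulting sequence rises steadily, drops after the capacity is exceeded, and then rises again, which is the sawtooth pattern commonly associated with this kind of congestion control.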