Packet data traffic is growing very quickly in mobile operator networks; in many cases it grows much faster than the rate at which the operator can expand its network capacity. This leads to more frequent network congestion, where the offered traffic is higher than what the RAN (radio access network) is able to fulfill. Also, new services appear often, which may lead to a situation in which a new Quality of Experience (QoE) requirement has to be introduced into the network quickly. In this situation, operators need efficient and flexible tools to control how the bottleneck RAN capacity is shared so that they can maximize the QoE of their users.
The current 3GPP QoS is based on the bearer mechanism, e.g. as described in 3GPP TS 23.401 section 4.7.2. Traffic that requires differentiated QoS treatment is classified into bearers. For each bearer, the QoS Class Identifier (QCI) parameter determines the basic QoS treatment. A few other parameters, such as the Maximum Bitrate (MBR), Guaranteed Bitrate (GBR), UE or Access Point Name (APN) specific Aggregate Maximum Bitrate (AMBR) and Allocation and Retention Priority (ARP) parameters can further influence the quality of service applied to the bearer traffic.
The bearer-based QoS has some limitations which have so far prevented its wide adoption. One limitation is that for 3G, the network-based QoS mechanism requires Release-7 Network Initiated Dedicated Bearer (NIDB) support, which has not yet materialized in terminal equipment. Even though new NIDB-enabled terminals may come out, it may take a few years before they reach a sufficiently high penetration for operators to make efficient use of the feature.
As another limitation, the currently defined QoS parameters do not provide a predictable QoE in congestion situations. The GBR and MBR parameters only apply to GBR bearers while most of the traffic currently goes over non-GBR bearers. The AMBR parameter only allows enforcement of a maximum over several bearers which is not flexible enough to specify congestion behavior.
Moreover, in the context of the 3GPP UPCON (User plane congestion management) work item, a new type of solution has recently been put forward which utilizes congestion feedback from the RAN to the CN (core network). This has been documented, e.g., in 3GPP TR 23.705 version 0.3 section 6.1. When the RAN indicates congestion to the CN, the CN can take actions to mitigate the congestion, such as limiting some classes of traffic. Congestion feedback as proposed so far is based on measuring the RAN load, i.e., resource utilization, and providing congestion feedback when the average load over a period of time exceeds a pre-defined threshold level. A main characteristic of such load-based congestion feedback is that it cannot differentiate between congestion states once the RAN is fully loaded. Load-based congestion feedback is illustrated in FIG. 10. Load-based congestion feedback considers all packets to be equal when it comes to congestion reporting.
Possible examples for load-based congestion feedback can be:
• whether the radio resource utilization in the air interface exceeds 90% over a 10 sec averaging period;
• whether the total sum of buffer lengths for all users averaged over 10 sec exceeds a pre-defined threshold.
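The first example above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of any 3GPP specification: the class name `LoadCongestionDetector`, the one-sample-per-second assumption, and the rule of staying silent until a full window has been collected are all hypothetical choices made for the illustration.

```python
from collections import deque


class LoadCongestionDetector:
    """Hypothetical load-based detector (illustration only).

    Reports congestion when the mean radio resource utilization over a
    sliding window exceeds a pre-defined threshold.
    """

    def __init__(self, threshold=0.90, window_samples=10):
        self.threshold = threshold
        # With one sample per second, 10 samples model a 10 sec window.
        self.samples = deque(maxlen=window_samples)

    def add_sample(self, utilization):
        """Record one utilization sample in the range 0.0 to 1.0."""
        self.samples.append(utilization)

    def average(self):
        """Mean utilization over the samples currently in the window."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def congested(self):
        """True only when a full window averages above the threshold."""
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and self.average() > self.threshold


if __name__ == "__main__":
    detector = LoadCongestionDetector()
    for utilization in [0.95] * 10:
        detector.add_sample(utilization)
    print("congested:", detector.congested())
```

Note that the detector sees only utilization samples; nothing about which users or services generated the load enters the decision.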
FIG. 11 is a schematic illustration of a functional split between RAN and CN with load-based congestion feedback. Functional steps such as “Load measurement” and “RAN QoS re-modeling” etc. are indicated without considering which logical node these entities map to. In the CN, the load information cannot be used directly as a basis for CN action, because very different QoS levels can all lead to similar load levels. Hence the CN re-models the RAN QoS based on the load-based congestion feedback and other information available in the CN about the actual QoS status in the RAN. The CN re-modeling tries to approximate the RAN congestion status, but it can never be fully accurate as the RAN fluctuations are very quick and unpredictable, e.g. due to changes in the radio channel quality and varying traffic mix. Thus, there is no direct relationship in the RAN between the realized RAN QoS and the congestion feedback based on load measurements. The result of the RAN QoS re-modeling is a measure of the QoS aware congestion level, or QoS level in short, which can then form a basis for potential CN action in accordance with operator QoS policy.
As already mentioned in the “Background” section, the bearer-based QoS has some limitations which have so far prevented its wide adoption. For example, the network-based QoS mechanism requires Release-7 Network Initiated Dedicated Bearer (NIDB) support, which has not yet materialized in terminal equipment. Another limitation is that the currently defined QoS parameters do not provide a predictable QoE in congestion situations. The GBR and MBR parameters only apply to GBR bearers, while most of the traffic currently goes over non-GBR bearers. The Aggregate Maximum Bitrate (AMBR) parameter only allows enforcement of a maximum over several bearers, which is not flexible enough to specify congestion behavior.
Moreover, using load-based congestion feedback from the RAN to the CN has a number of disadvantages. Load-based congestion feedback considers all packets to be equal when it comes to congestion reporting. Even though two different packets may be part of traffic flows that represent highly different values to the user and the operator, they play an equal role in load-based congestion feedback reporting. Load-based congestion feedback ignores the quality of experience observed by the user for a given traffic flow, and ignores the quality of service requirements that the operator sets.
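A small Python sketch can make this point concrete. It is a toy model under stated assumptions, not a description of any real scheduler: the `Flow` fields, the equal-share split of capacity, the 20 Mbps cell capacity, and the per-flow rate requirements are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    user_class: str       # hypothetical label, e.g. "premium"
    offered_mbps: float   # rate the flow would use if unconstrained
    required_mbps: float  # rate needed for the user to be satisfied


def load_feedback(flows, capacity_mbps, threshold=0.9):
    """Load-based feedback: looks only at aggregate utilization."""
    offered = sum(f.offered_mbps for f in flows)
    utilization = min(offered / capacity_mbps, 1.0)
    return utilization > threshold


def satisfied_fraction(flows, capacity_mbps):
    """Fraction of flows meeting their requirement, assuming the
    capacity is split equally among flows (a simplifying assumption)."""
    share = capacity_mbps / len(flows)
    ok = sum(1 for f in flows
             if min(share, f.offered_mbps) >= f.required_mbps)
    return ok / len(flows)


CAPACITY = 20.0  # hypothetical cell capacity in Mbps

# Two greedy file sharing flows versus forty premium flows.
file_sharing = [Flow("file_sharing", 50.0, 2.0) for _ in range(2)]
premium = [Flow("premium", 1.0, 1.0) for _ in range(40)]

if __name__ == "__main__":
    for name, mix in [("file sharing", file_sharing), ("premium", premium)]:
        print(name,
              "congested:", load_feedback(mix, CAPACITY),
              "satisfied:", satisfied_fraction(mix, CAPACITY))
```

In this toy model both traffic mixes saturate the cell and therefore produce the same load-based feedback, although the fraction of satisfied users differs completely between them.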
There are a number of consequences of this simplistic congestion feedback reporting, some of which are summarized in the bullets below.
• Unclear interpretation. The same congestion feedback is reported no matter whether the high load comes from a few happy file sharing users only, or from many unhappy premium users.
• Complex processing in the CN. As a result of the unclear interpretation, the core network has to perform significant processing functions to arrive at possible actions that may be meaningful to the users and the operator. The CN has to re-model the congestion level experienced in the RAN to determine the QoS aware congestion level before it can determine whether any meaningful action could be taken.
• Inaccurate QoS level. A CN based approximation of the RAN congestion status and consequent QoS aware congestion level estimation can never be fully accurate, as the load-based congestion feedback carries delayed and filtered information, and the changes in the radio channel and traffic mix can only be followed with a limited precision in the CN.
• Heavy signalling. The load status in the RAN can change very frequently: even a small number of users can frequently cause high load due to traffic intensive applications (such as video on larger screens or big downloads) that can cause the RAN to be fully utilized for a period of time, after which the load can again drop. The load-based congestion feedback is therefore expected to result in a very high congestion feedback signalling load due to the frequent changes in load status. One way to decrease that signalling would be to use longer averaging periods, but that makes the solution inefficient, since the congestion feedback would be delayed and filtered to only very long congestion events. Also, oscillation problems are intensified with more delay in the feedback.
• RAN under-utilization. A system which acts on a load-based indication to mitigate the load is likely to oscillate around the threshold at which the RAN reports congestion. Upon a congestion indication from the RAN, the CN takes action to reduce the traffic, which lowers the RAN load and eventually leads to a status change, so that the congestion feedback from the RAN ends. That triggers the CN to stop or lessen the traffic reduction, which goes on until the RAN eventually reports congestion once again and the process starts all over. The congestion reporting load threshold cannot be at 100% utilization, because an averaged load value would in most cases not reach 100%. But if the threshold is somewhat lower (e.g. at 90%) and the system targets that threshold, it leads to a loss of capacity (e.g. 10%) due to under-utilization. Even if the threshold is higher, the oscillations due to CN mitigation actions that lead to the eventual avoidance of the congestion feedback would cause RAN resource under-utilization.
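The oscillation and under-utilization described in the last bullet can be reproduced with a toy closed-loop simulation. The sketch below is illustrative only: the function name `simulate`, the fixed step by which the CN tightens or relaxes its rate limit, the 5-sample averaging window, and the normalized demand of 1.2 are hypothetical assumptions, not behavior defined by any specification.

```python
def simulate(demand=1.2, threshold=0.9, window=5, step=0.05, ticks=400):
    """Toy control loop between a CN rate limiter and a RAN load report.

    All quantities are normalized to the RAN capacity (1.0). Returns the
    per-tick RAN utilization and the number of changes in the reported
    congestion status.
    """
    limit = demand          # traffic rate the CN currently admits
    loads = []              # RAN utilization per tick
    status_changes = 0
    previous = None
    for _ in range(ticks):
        # The RAN cannot be utilized above 100% of its capacity.
        load = min(1.0, limit)
        loads.append(load)

        # Load-based feedback: average of the most recent samples.
        recent = loads[-window:]
        congested = sum(recent) / len(recent) > threshold
        if previous is not None and congested != previous:
            status_changes += 1
        previous = congested

        # CN reaction: reduce traffic while congestion is reported,
        # relax the reduction once the report ends.
        if congested:
            limit = max(0.0, limit - step)
        else:
            limit = min(demand, limit + step)
    return loads, status_changes


if __name__ == "__main__":
    loads, status_changes = simulate()
    print("mean utilization:", sum(loads) / len(loads))
    print("congestion status changes:", status_changes)
```

In this model the offered demand always exceeds the RAN capacity, yet the mean utilization settles near the reporting threshold rather than at full capacity, and the congestion status keeps toggling, each toggle corresponding to a feedback message towards the CN.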