The demand for higher data rates in wireless communications is ever increasing. Thus, one has to find ways to use the given resources even more efficiently. Gains can be achieved by exploiting temporal variations in the channels due to fading that is independent among the users, so-called multi-user diversity. Opportunistic resource allocation (scheduling) was introduced in [1]. Well recognized work in this field is [2] and [3]; an overview can be found in [4]. The drawback of these schemes, purely aiming at increasing throughput, is the unfairness and starvation of users. So one seeks a balance between maximizing throughput and having a fair resource allocation among the users.
Proportional fairness offers an attractive trade-off between resource efficiency, achieved by opportunistically exploiting time-variant channels, and the satisfaction of the users. Proportional fair sharing (PFS) was introduced in [5, 6] for the Qualcomm High Data Rate (HDR) system.
PFS is designed for a single-channel network with a TDMA constraint, that is, only one user is allowed to transmit at a time. An extension to a system with multiple channels, with equal power per carrier, is introduced in [7]. A similar but less general approach, specifically designed for the 3GPP LTE uplink, is given in [8]. In the following, systems that allow only a single user per resource block are called orthogonal access systems.
A further increase in spectral efficiency for future-generation networks is achieved by advanced physical layer techniques, for example multi-user MIMO. In multi-user systems with adaptive modulation and coding, the data rates of the users are coupled and, in theory, infinitely many rate configurations can be provided. These systems are referred to as advanced multi-user systems. The complex interdependence of the user rates is a significant difference, and unfortunately there is no straightforward extension of the PFS rule to advanced multi-user systems. A step towards designing opportunistic and fair resource allocation for multi-user systems is the formulation as an optimization problem; for proportional fairness this is the maximization of the sum of the logarithms of the average user rates [9]. For the PFS algorithm, the interpretation as utility maximization and a proof of asymptotic optimality can be found in [10]. To formulate the utility maximization, some assumptions and definitions are introduced to describe the system model.
System Model: Slotted time-varying wireless channels are assumed, where the channel is static within one time-slot. The channel state H is a random process and H[T] is the channel state realization at time-slot T. A peak power constraint is assumed, which implies that power budgets cannot be exchanged among the time-slots, as would be possible under an average power constraint. Depending on the capabilities of the hardware, the set of achievable data rates for the set of users $\mathcal{K}$, with $K = |\mathcal{K}|$, at time-slot T is given by the rate region R(H[T])=R[T]. The instantaneous rates established in time-slot T are r[T] ∈ R[T]. The weighted sample mean of the data rates is
$$\bar{r}[T] = \sum_{t=0}^{T} w_t \, r[T-t].$$
The weights can be used to establish various definitions of the average throughput, see FIG. 10c. The long-term average rate is
$$\bar{r} = \lim_{T \to \infty} \bar{r}[T] = \lim_{T \to \infty} \sum_{t=0}^{T} w_t \, r[T-t],$$
provided the weights and the stochastic process of the channel states are such that the limit exists. This allows one to define the region of long-term average rates supported by the physical layer:
$$\bar{R} = \left\{ \bar{r} : \bar{r} = \lim_{T \to \infty} \sum_{t=0}^{T} w_t \, r[T-t], \;\; r[t] \in R[t] \;\; \forall t \right\}.$$
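To illustrate how the weights shape the average, one common choice (an assumption here, not taken from the text) is the exponential window w_t = (1-ρ)ρ^t with a forgetting factor 0<ρ<1. The minimal sketch below checks numerically that the weighted sample mean then obeys the recursion r̄[T] = ρ r̄[T-1] + (1-ρ) r[T], so that only the previous average has to be stored. The names `weighted_mean`, `recursive_mean`, and `rho`, as well as the rate values, are illustrative.

```python
def weighted_mean(rates, weights):
    """Direct evaluation of r_bar[T] = sum_{t=0}^{T} w_t * r[T - t]."""
    T = len(rates) - 1
    return sum(weights[t] * rates[T - t] for t in range(T + 1))


def recursive_mean(rates, rho):
    """Same quantity for w_t = (1 - rho) * rho**t, computed slot by slot."""
    r_bar = 0.0  # average before the first slot
    for r in rates:
        r_bar = rho * r_bar + (1.0 - rho) * r
    return r_bar


# Hypothetical per-slot rates of one user (arbitrary units).
rates = [1.0, 0.2, 3.5, 0.0, 2.0]
rho = 0.9  # assumed forgetting factor
weights = [(1.0 - rho) * rho**t for t in range(len(rates))]

direct = weighted_mean(rates, weights)
recursive = recursive_mean(rates, rho)
```

Both values agree up to floating-point rounding. Other weight choices, such as a sliding window, generally require storing more than one value per user.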
With these definitions one can state opportunistic and fair resource allocation as maximizing a utility of the long-term average throughput:
$$\underset{\bar{r}}{\text{maximize}} \;\; U(\bar{r}) \quad \text{subject to} \quad \bar{r} \in \bar{R}, \qquad (1.1)$$
where the utility associated with proportional fairness is $U(\bar{r}) = \sum_{k \in \mathcal{K}} \log(\bar{r}_k)$.
The optimal long-term average throughput $\bar{r}^*$ is the weighted sample mean of the optimal rate allocations r*[t] ∈ R[t] in each time-slot. Problem (1.1) is convex in the rate space and can be solved by suitable algorithmic methods. However, at time-slot t one has to decide on r[t] while the future rate regions R[τ], τ>t, are not known and the previously made decisions cannot be altered, i.e., the rate vectors r[τ], τ<t, are fixed. Thus, one cannot calculate the optimal average throughput $\bar{r}^*$ to find the optimal rate allocation r*[t] for the current time-slot.
This means that one cannot optimize the average throughput directly. Instead, one decides on a rate allocation r[t] in each time step, which then automatically results in a certain average throughput.
The goal is to find a close-to-optimal causal scheduling strategy for any time-slot t that only utilizes information about previously made decisions and previous channel state information, which defines the rate regions. Under certain conditions the following policies are asymptotically (T→∞) optimal:
Gradient Method [11, 12]: The rate configuration for the current time-slot t is based on maximizing a linear approximation of the utility:
$$r[t] = \underset{r \in R[t]}{\arg\max} \; \nabla U(\bar{r}[t])^{T} r. \qquad (1.2)$$
For the proportional fairness utility we have
$$\nabla U(\bar{r}_k[t]) = \frac{\partial \log(\bar{r}_k[t])}{\partial \bar{r}_k[t]} = \frac{1}{\bar{r}_k[t]}, \qquad (1.3)$$
which leads to the well-known PFS rule [5, 6] in case of a TDMA constraint, where a single user needs to be selected. Therefore, the gradient method can be considered a generalization of proportional fair sharing for orthogonal access systems to proportional fair resource allocation for advanced multi-user systems.
Stochastic Subgradient Method: Another approach to solving problem (1.1) causally is the stochastic subgradient method.
The rate configuration of the current time-slot is        
$$r[t] = \underset{r \in R[t]}{\arg\max} \; \lambda[t]^{T} r,$$
where λ[t] are the dual variables, updated as follows:
$$a[t] = \underset{a}{\arg\max} \; H(a) - \lambda[t]^{T} a, \qquad \lambda[t+1] = \left[ \lambda[t] - \alpha \left( r[t] - a[t] \right) \right]^{+}, \qquad (1.3)$$
with a fixed constant α.
Methods from Queuing Theory: The task of optimizing a network utility is also considered in the area of queuing networks [13, 14], and virtual queues can be used for allocating resources in a way that leads to an optimal solution with respect to the network utility.
The rate configuration for the current time-slot is
$$r[t] = \underset{r \in R[t]}{\arg\max} \; u[t]^{T} r,$$
where u[t] is the virtual queue, updated as follows:
$$a[t] = \underset{a}{\arg\max} \; \beta U(a) - u[t]^{T} a, \qquad u[t+1] = \left[ u[t] - r[t] + a[t] \right]^{+}, \qquad (1.4)$$
with a fixed constant β.
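The update rules (1.2) and (1.4) can be sketched for a toy setting as follows. Everything specific in this sketch is an illustrative assumption rather than part of the cited methods: each rate region is represented by a finite list of candidate rate vectors, the average in the gradient rule uses an exponential window with forgetting factor `rho`, the auxiliary variable a[t] is capped at `a_max`, the channel statistics are invented, and all function and variable names are hypothetical.

```python
import random


def best_rate(region, weights):
    """Return argmax over r in region of weights^T r (region: list of rate tuples)."""
    return max(region, key=lambda r: sum(w * x for w, x in zip(weights, r)))


def gradient_step(region, r_bar, rho=0.95, eps=1e-6):
    """Rule (1.2) with the proportional fair gradient 1 / r_bar_k from (1.3)."""
    grad = [1.0 / max(x, eps) for x in r_bar]
    r = best_rate(region, grad)
    # Exponential-window update of the average (an assumed choice of weights).
    new_r_bar = [rho * a + (1.0 - rho) * x for a, x in zip(r_bar, r)]
    return r, new_r_bar


def queue_step(region, u, beta=1.0, a_max=10.0):
    """Rule (1.4) for U(a) = sum_k log(a_k)."""
    r = best_rate(region, u)
    # Per user, beta*log(a_k) - u_k*a_k is maximized at a_k = beta / u_k.
    # The cap a_max is an assumption that keeps a[t] bounded when u_k is near 0.
    a = [min(beta / q, a_max) if q > 0 else a_max for q in u]
    new_u = [max(q - x + y, 0.0) for q, x, y in zip(u, r, a)]
    return r, new_u


def tdma_region(rng, mean_rates):
    """Toy TDMA rate region: exactly one user is served per slot."""
    peaks = [rng.expovariate(1.0 / m) for m in mean_rates]
    users = range(len(mean_rates))
    return [tuple(peaks[k] if j == k else 0.0 for j in users) for k in users]


rng = random.Random(0)
mean_rates = [1.0, 4.0]  # user 2 has the better channel on average
r_bar, u = [1e-3, 1e-3], [1.0, 1.0]
served_gradient, served_queue = [0, 0], [0, 0]
for _ in range(5000):
    region = tdma_region(rng, mean_rates)
    r, r_bar = gradient_step(region, r_bar)
    served_gradient[r.index(max(r))] += 1
    r, u = queue_step(region, u, beta=50.0)
    served_queue[r.index(max(r))] += 1
```

In this toy TDMA setting the gradient step reduces to picking the user with the largest ratio of instantaneous rate to average rate, and both rules are expected to keep serving the user with the weaker channel. If the function written H(a) in the subgradient update is read as the utility U(a) (an assumption), that update corresponds to the queue update with λ[t] = u[t]/β and α = 1/β, so `queue_step` can serve as a sketch of both.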
The work in [15] specifically treats multi-user MIMO, but it assumes an average power constraint and therefore cannot be applied to the present scenario without major modifications.
The algorithms above are memoryless in the sense that they do not require keeping track of past rate allocations or channel states. Instead, they track a single variable per user (the current average rate, a dual variable, or the queue length), which is cheap to store and simple to update. They assume that the mobile services have a high tolerance for delay and that user positions and user activity vary only slowly. Establishing long-term fairness by means of the methods described may lead to unacceptable periods without service for some users.
An extreme way to avoid this is to establish fairness in each of the time-slots, for example proportional fairness
$$r[t] = \underset{r[t] \in R[t]}{\arg\max} \left\{ \sum_{k \in \mathcal{K}} \log\left( r_k[t] \right) \right\}, \qquad (1.5)$$
or max-min fairness
$$r[t] = \underset{r[t] \in R[t]}{\arg\max} \left\{ \min_{k \in \mathcal{K}} \left\{ r_k[t] \right\} \right\}. \qquad (1.6)$$
As the current rate region is known, the maximization can be efficiently solved by suitable methods.
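As a small illustration of how rules (1.5) and (1.6) can select different operating points within one time-slot, the sketch below evaluates both on a finite list of candidate rate vectors. The candidate list and the function names are hypothetical, and a real rate region would typically be a continuous set that calls for a proper solver.

```python
import math


def per_slot_proportional_fair(region):
    """Rule (1.5): maximize sum_k log(r_k) over the candidates."""
    def utility(r):
        # A user with zero rate makes the log-utility minus infinity.
        return sum(math.log(x) if x > 0 else -math.inf for x in r)
    return max(region, key=utility)


def per_slot_max_min(region):
    """Rule (1.6): maximize the smallest user rate over the candidates."""
    return max(region, key=min)


# Hypothetical candidate rate vectors (user 1, user 2) for one time-slot.
region = [(6.0, 0.0), (4.0, 1.0), (2.0, 1.8), (1.0, 1.0)]

pf_choice = per_slot_proportional_fair(region)   # (4.0, 1.0)
mm_choice = per_slot_max_min(region)             # (2.0, 1.8)
throughput_choice = max(region, key=sum)         # (6.0, 0.0), for comparison
```

For these candidates the pure throughput maximizer starves user 2, the max-min rule gives up the most sum rate, and the proportional fair choice lies in between.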
However, establishing a fair resource allocation in each time-slot independently may be too restrictive and lead to a loss in efficiency. Depending on the application, several consecutive time-slots without service might be acceptable, but service needs to be provided within a fixed time window. A possible solution is predictive scheduling [16-22].
The idea is that estimates of future channel states, although they might be erroneous, can be beneficial. The resulting schedulers are no longer memoryless and in general consider a certain horizon of past rate allocations (look-behind) and predictions of future channel states (look-ahead). For this time frame they maximize a utility or the expectation of the utility over several subsequent (potentially overlapping) time frames. The gain of predictive scheduling thus comes at the price of higher computational complexity.
For orthogonal access systems there is a direct connection between the data rate of the user and the channel state. This is no longer true for advanced physical layer techniques, for example MU-MIMO, where a trade-off between the user rates can be made by choosing the transmission strategies, for example transmission powers or beamformers. State-of-the-art methods for predictive scheduling [16-22] are intended for orthogonal access systems and do not generalize to advanced multi-user systems.
Hence, for complex systems, the well-known methods are either too complex or too computationally expensive or do not result in the optimum solution with respect to a certain utility, such as a fair allocation utility.