Optimised bit allocation is an important issue in video compression for increasing the coding efficiency, i.e. for making optimum use of the available data rate. Its target is to achieve the best perceptual quality within a limited bit rate. In the human visual system, a viewer usually pays more attention to some parts of a picture than to others. The ‘attention area’, i.e. the perceptually sensitive area in a picture, tends to catch more human attention, as described e.g. in L. Itti, Ch. Koch, E. Niebur, “A Model of Saliency-Based Visual Attention for Rapid Scene Analysis”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 11, November 1998. Therefore optimum bit allocation based on the different perceptual importance of different attention areas of a picture is a research topic in video compression technology.
For example, the macroblock-layer bit rate control in MPEG2 and MPEG4 selects the values of the quantisation step size Qstep (quantisation follows the transformation) for all the macroblocks in a current frame so that the sum of the bits used for these macroblocks is close to the frame target bit rate B. MPEG2 and MPEG4 support 31 different values of Qstep. In MPEG4 AVC a total of 52 Qstep values are supported by the standard, and these are indexed by a quantisation parameter value QP. Ch. W. Tang, Ch. H. Chen, Y. H. Yu, Ch. J. Tsai, “A novel visual distortion sensitivity analysis for video encoder bit allocation”, ICIP 2004, Volume 5, 24-27 Oct. 2004, pp. 3225-3228, propose a description of the visual distortion sensitivity, namely the capability of human vision to detect distortion in moving scenes. Their bit allocation scheme is very simple and uses the formula:
$$QP_N = QP + \left(1 - \frac{VDS_{i,j}}{255}\right) \times \Delta q \tag{1}$$
where $QP$ is the initial quantisation parameter assigned by the rate control, $\Delta q$ is a parameter for limiting the modification of $QP$, $VDS_{i,j}$ is the visual distortion sensitivity value of the $(i,j)$th macroblock (denoted MB) in a picture, and $QP_N$ is the refined quantisation parameter. This bit allocation scheme is coarse and provides neither accurate bit rate control nor control of the distortion distribution.
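The refinement of formula (1) can be illustrated by the following minimal Python sketch. The function name `refine_qp`, the rounding to the nearest integer and the clamping to the range 0 to 51 (the 52 QP values of MPEG4 AVC) are assumptions made for illustration and are not taken from the cited paper.

```python
def refine_qp(qp, vds, delta_q, qp_min=0, qp_max=51):
    """Refine a macroblock QP according to formula (1).

    qp      -- initial quantisation parameter from the rate control
    vds     -- visual distortion sensitivity of the macroblock, 0..255
    delta_q -- maximum amount by which qp may be raised
    Rounding and clamping to [qp_min, qp_max] are assumptions of this sketch.
    """
    if not 0 <= vds <= 255:
        raise ValueError("vds must lie in the range 0..255")
    # A fully sensitive macroblock (vds = 255) keeps its QP; an insensitive
    # one (vds = 0) is quantised more coarsely by the full delta_q.
    qp_n = qp + (1.0 - vds / 255.0) * delta_q
    return max(qp_min, min(qp_max, int(round(qp_n))))
```

With these assumptions, a macroblock of maximum sensitivity keeps the QP assigned by the rate control, while less sensitive macroblocks receive a larger QP and therefore fewer bits.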
In S. Sengupta, S. K. Gupta, J. M. Hannah, “Perceptually motivated bit-allocation for H.264 encoded video sequences”, ICIP 2003, a picture is divided into a foreground and a background area, and a target distortion $D_{tar}$ is pre-decided for the foreground quality, without any guarantee for the background quality. The quality of the background varies as a function of the distance from the foreground. The rate of degradation is controlled by a visual sensitivity factor
$$S = e^{-d/a} \tag{2}$$
where $a$ is a constant controlling the rate of fall of the background degradation and $d$ is the distance of a background pixel from the nearest foreground pixel. This scheme tries to give a distortion distribution consistent with the human visual system, but its performance depends on the accuracy of the model used, and it explains neither how to obtain $D_{tar}$ nor how to keep the quality degradation in line with the given equations under a pre-determined bit budget.
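As an illustration of formula (2), the following Python sketch computes the sensitivity factor for every pixel of a small binary foreground mask. The function name `sensitivity_map`, the use of the Euclidean distance and the brute-force nearest-pixel search are assumptions of this sketch; the cited paper is not quoted here as specifying a distance metric.

```python
import math

def sensitivity_map(mask, a):
    """Return S = exp(-d / a) for every pixel of a binary foreground mask.

    mask -- list of rows, truthy entries mark foreground pixels
    a    -- positive constant controlling the rate of fall of S
    d is taken as the Euclidean distance to the nearest foreground pixel
    (an assumption); foreground pixels have d = 0 and therefore S = 1.
    """
    if a <= 0:
        raise ValueError("a must be positive")
    foreground = [(r, c) for r, row in enumerate(mask)
                  for c, v in enumerate(row) if v]
    if not foreground:
        raise ValueError("mask contains no foreground pixel")
    result = []
    for r, row in enumerate(mask):
        out_row = []
        for c in range(len(row)):
            # Brute-force search for the nearest foreground pixel.
            d = min(math.hypot(r - fr, c - fc) for fr, fc in foreground)
            out_row.append(math.exp(-d / a))
        result.append(out_row)
    return result
```

A larger value of `a` makes the factor fall off more slowly with the distance from the foreground.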
In order to solve the problem of optimised bit allocation under a bit budget constraint, typical bit allocation algorithms are based on Rate-Distortion optimisation with Lagrangian multiplier processing, which can be described as a constrained optimisation problem of minimising the total distortion D subject to the rate R being less than a target rate RT, using an expression like:
$$\min J:\quad J = D + \lambda \times R = \sum_{i=1}^{N} D_i + \lambda \times \sum_{i=1}^{N} R_i \tag{3}$$
where $D_i$ and $R_i$ are the distortion and the bit rate of each unit $i$ (MB or attention area).
Assuming that the rate and distortion of each MB are only dependent on the choice of the encoding parameters as described above, the optimisation of equation (3) can be simplified to minimise the cost of each MB separately:
$$\min J_i:\quad J_i = D_i + \lambda \times R_i \tag{4}$$
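The per-MB minimisation of equation (4) can be sketched in Python as a selection among candidate coding modes. The function name `best_mode`, the tuple layout and the example figures are hypothetical; in an encoder the distortion and rate of each candidate would be measured or estimated.

```python
def best_mode(candidates, lam):
    """Return the candidate with the smallest cost J_i = D_i + lam * R_i.

    candidates -- iterable of (name, distortion, rate) tuples
    lam        -- Lagrange multiplier trading rate against distortion
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])


# Hypothetical candidates for one macroblock: (mode, distortion, bits).
modes = [("intra", 100.0, 50.0), ("inter", 120.0, 20.0), ("skip", 300.0, 1.0)]
```

With these example figures, a multiplier of 0 selects the mode of lowest distortion, whereas larger multipliers increasingly favour modes that cost fewer bits.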
It has been proposed to use an optimised bit allocation scheme based on modifying the Lagrange multiplier in the coding mode decision of each MB according to formula:
$$\lambda' = \alpha \times \lambda \tag{5}$$
where $\alpha$ is a scaling factor for modifying the Lagrange multiplier according to different levels of perceptual importance.
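The effect of formula (5) on the mode decision can be sketched in Python as follows. The function name `choose_mode`, the candidate figures and the particular values of the scaling factor are hypothetical; how the scaling factor is derived from the perceptual importance is not specified here.

```python
def choose_mode(candidates, lam, alpha):
    """Mode decision with the scaled multiplier lam' = alpha * lam.

    candidates -- dict mapping mode name to (distortion, rate)
    lam        -- Lagrange multiplier from the rate control
    alpha      -- scaling factor for the perceptual importance of the MB
    """
    lam_scaled = alpha * lam  # formula (5)
    return min(candidates,
               key=lambda m: candidates[m][0] + lam_scaled * candidates[m][1])


# Hypothetical candidates for one macroblock: mode -> (distortion, bits).
modes = {"intra": (100.0, 50.0), "inter": (120.0, 20.0), "skip": (300.0, 1.0)}
```

Since the rate term is weighted by the scaled multiplier, a scaling factor below 1 makes bits cheaper in the cost function and tends to select modes of lower distortion, while a factor above 1 has the opposite effect.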
It has also been proposed to apply a different weighting factor $w_i$ to the distortion of each attention area in order to perform optimised bit allocation:
$$\min J:\quad J = \sum_{i=1}^{N} w_i \times D_i + \lambda \times \sum_{i=1}^{N} R_i \tag{6}$$
whereby the rate and distortion model can be deduced based on a ρ-domain bit rate control model like this one:
$$R_i = A\rho_i + B, \qquad D_i = 384\,\sigma_i^2\, e^{-\theta\sigma_i/384} \tag{7}$$
Equations (7) can be substituted into equation (6) to obtain the optimised solution for bit allocation.
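The weighted cost of equation (6) can be evaluated as in the following Python sketch, given per-area distortions and rates from any model. The function name `weighted_cost` and the example figures are hypothetical.

```python
def weighted_cost(weights, distortions, rates, lam):
    """Evaluate J = sum(w_i * D_i) + lam * sum(R_i) as in equation (6).

    weights     -- perceptual weighting factor w_i of each attention area
    distortions -- distortion D_i of each attention area
    rates       -- bit rate R_i of each attention area
    lam         -- Lagrange multiplier
    """
    if not (len(weights) == len(distortions) == len(rates)):
        raise ValueError("all sequences must have the same length")
    weighted_distortion = sum(w * d for w, d in zip(weights, distortions))
    return weighted_distortion + lam * sum(rates)
```

Because only the distortion term is weighted, raising the weight of an area increases the penalty for distortion in that area while the cost of its bits stays the same.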
It is also known to use a Rate-Distortion model based on Gaussian distribution to get an optimum result for bit allocation, as shown in formula (8):
$$D_i = \sigma_i^2 \times e^{-\gamma R_i} \tag{8}$$
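For the model of formula (8), minimising the sum of the distortions under a fixed total rate has a well-known closed form, obtained by setting the derivative of the Lagrangian cost with respect to each rate to zero, which makes all distortions equal. The following Python sketch illustrates this textbook result and is not necessarily the procedure of any particular prior-art scheme. The function names, the common value of γ for all units and the omission of a non-negativity constraint on the rates are assumptions.

```python
import math

def distortion(variance, rate, gamma):
    """Formula (8): D_i = sigma_i^2 * exp(-gamma * R_i)."""
    return variance * math.exp(-gamma * rate)

def allocate_bits(variances, total_rate, gamma):
    """Split total_rate so that the sum of the distortions is minimal.

    The stationarity condition gamma * D_i = lambda for every unit gives
    R_i = total_rate / N + (ln(sigma_i^2) - mean(ln(sigma_j^2))) / gamma.
    Negative rates are not prevented in this sketch (an assumption).
    """
    n = len(variances)
    logs = [math.log(v) for v in variances]
    mean_log = sum(logs) / n
    return [total_rate / n + (lv - mean_log) / gamma for lv in logs]
```

Under these assumptions a unit of larger variance receives more bits, and all units end up with the same distortion.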