The design of a video encoder must address how to specify the target coding bit rate. In conventional single view video coding scenarios, this is typically a trivial issue. For example, one usually just needs to specify the average and maximum target coding bit rates of the coded video, denoted Ravg and Rmax, respectively. However, in commercial video coding applications involving multi-view video coding (MVC), bit rate configuration is a design task that demands much more scrutiny.
Multi-view video coding (MVC) is the compression framework for encoding multi-view sequences. A multi-view video sequence is a set of two or more video sequences that capture the same scene from different viewpoints.
In multi-view video coding, one video view is referred to as the base view. The base view corresponds to the conventional two-dimensional (2D) video coding scenario, and serves conventional 2D video applications such as, for example, 2D movies and television. In addition, there are one or more views referred to as dependent views. Dependent views are shot of the same scene from different angles, to support multi-view video applications such as, for example, three-dimensional (3D) movies and television. In the MVC extension of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) Standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”), the base view is coded in the same way as in the 2D MPEG-4 AVC Standard coding case. However, dependent view coding can benefit from the coded base view video frames via a new macroblock (MB) coding mode called inter-view prediction, and hence can yield much better compression efficiency than base view coding.
Therefore, these new MVC scenarios raise bit rate configuration issues. For example, in practice, variable bit rate multi-view video coding is used in applications such as 3D movie compression onto Blu-ray Discs (BD). In this scenario, one first needs to decide whether to specify Ravg and Rmax for each individual view (i.e., with each view considered separately) or for the joint views (i.e., with all the views considered together). Enforcing bit rate requirements for each single view can guarantee the coding quality of each view in advance. However, such enforcement precludes the encoder from exploring the globally optimal bit allocation among all the views, which could have provided better overall coding performance. On the other hand, enforcing only the bit rate requirements for the joint views allows for globally optimal rate control across views. However, the result cannot guarantee that any particular view meets a particular quality constraint. In practice, an explicit target bit rate requirement for the base view is especially desirable, since it guarantees the quality of the base view coding and thus does not compromise the quality of existing conventional 2D video services, which rely only upon the base view.
As previously stated, in conventional single view (i.e., 2D) video coding scenarios, bit rate configuration is mostly a trivial issue in the encoder design. Typically, one just needs to specify the Ravg and Rmax of the coded video. In constant bit rate (CBR) coding situations, such as video broadcasting or streaming over networks, Ravg is often directly determined by the limited transmission channel bandwidth. In variable bit rate cases, such as video storage applications, Ravg can be easily derived from the total storage space and the total play-out time of the input video. Rmax is a constraint mainly for the purpose of multiplexing the coded video bitstream with other related data streams such as, for example, coded audio or other coded video streams, for the overall system output. In that case, the coded video needs to be properly constrained by Rmax such that all the data streams can be successfully multiplexed together into one single output stream of the whole application system. Apart from that purpose, Rmax is not a desirable coding constraint, as a limited Rmax in turn limits the capability of the encoder to achieve consistent coding quality across all the video frames, which may lead to a compromised overall subjective quality experience.
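As a rough illustration of the storage-based derivation of Ravg mentioned above, the following Python sketch divides a storage budget by the play-out time. The function name average_bit_rate_bps, the other_streams_bps parameter, and the example figures (a 25 GB budget and a two-hour program) are assumptions chosen for illustration only; they do not come from any particular product or specification.

```python
def average_bit_rate_bps(storage_bytes, duration_seconds, other_streams_bps=0.0):
    """Return the average video bit rate (bits per second) that fits a storage budget.

    storage_bytes     -- total storage space available, in bytes (assumed input)
    duration_seconds  -- total play-out time of the video, in seconds
    other_streams_bps -- hypothetical rate reserved for audio or other multiplexed streams
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")

    # Total rate the storage can sustain over the whole play-out time.
    total_bps = storage_bytes * 8 / duration_seconds

    # Whatever is not reserved for the other streams is left for the video.
    video_bps = total_bps - other_streams_bps
    if video_bps <= 0:
        raise ValueError("no bit rate left for video after reserving other streams")
    return video_bps
```

For example, average_bit_rate_bps(25_000_000_000, 7200) gives roughly 27.8 Mbit/s for the video before any rate is set aside for other multiplexed streams.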
However, with the presence of multiple views of video, the new multi-view video coding scenarios make bit rate configuration a more difficult and important problem for MVC encoder design, for which there is no single widely accepted solution. To account for multiple views, one choice is to specify Ravg and Rmax for each individual view. In this way, the encoder will strictly adhere to these requirements and thus render coded video at the guaranteed level of quality prescribed for each view. The problem with this scheme is that these requirements are determined before encoding. Thus, they are unlikely to be the globally optimal configuration, since that configuration can only be found during the encoding process. For example, after the necessary analysis of the whole video sequence, the encoder has enough information to estimate the coding complexity of each view. Then, the encoder may carry out a well designed rate control algorithm to determine the globally optimal bit allocation for all the views, which meets the joint view Ravg and Rmax requirements while at the same time maximizing the perceptual quality of the coded multi-view video. Therefore, for the sake of global optimality, the other possible choice of multi-view video coding bit rate configuration is to specify only the joint view Ravg and Rmax. However, this scheme loses all the single view quality guarantees. For the base view in particular, a quality guarantee via Ravg and Rmax is highly desirable, so that a multi-view video coding result with a globally optimal bit allocation across all the views does not significantly compromise the coding quality of the base view. In practice, this is important, because all the existing conventional 2D video consumers will see only the base view video. A service provider has to provide satisfactory performance of services for existing customers as well as new customers.
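The tension between a joint view budget and a base view guarantee can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not a rate control algorithm described in this text: the function name allocate_view_rates, the per-view complexity estimates, the complexity-proportional split, and the base_min_ravg floor are all hypothetical. The sketch splits a joint Ravg across views in proportion to estimated complexity, and falls back to a fixed base view rate when the proportional share would drop below the floor.

```python
def allocate_view_rates(joint_ravg, complexities, base_min_ravg):
    """Split a joint average bit rate across views (illustrative sketch only).

    joint_ravg    -- joint Ravg for all views together, in bits per second
    complexities  -- hypothetical per-view complexity estimates; index 0 is the base view
    base_min_ravg -- hypothetical minimum average rate guaranteed to the base view
    """
    if len(complexities) < 2:
        raise ValueError("need a base view and at least one dependent view")
    if any(c <= 0 for c in complexities):
        raise ValueError("complexity estimates must be positive")
    if base_min_ravg > joint_ravg:
        raise ValueError("base view floor exceeds the joint budget")

    # Unconstrained joint allocation: each view's share follows its complexity.
    total = sum(complexities)
    rates = [joint_ravg * c / total for c in complexities]

    # If the base view falls below its floor, pin it at the floor and
    # redistribute the remaining budget among the dependent views only.
    if rates[0] < base_min_ravg:
        remaining = joint_ravg - base_min_ravg
        dependent_total = sum(complexities[1:])
        rates = [base_min_ravg] + [
            remaining * c / dependent_total for c in complexities[1:]
        ]
    return rates
```

With a joint Ravg of 30 Mbit/s, equal complexity estimates for two views, and a 20 Mbit/s base view floor, the proportional split of 15 Mbit/s per view would violate the floor, so the sketch assigns 20 Mbit/s to the base view and the remaining 10 Mbit/s to the dependent view.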