Video data requires a large amount of storage space to store or a wide bandwidth to transmit. As resolutions and frame rates continue to increase, the storage or transmission bandwidth requirements would be formidable if the video data were stored or transmitted in an uncompressed form. Therefore, video data is often stored or transmitted in a compressed format using video coding techniques. Coding efficiency has been substantially improved by newer video coding standards such as H.264/AVC and the emerging HEVC (High Efficiency Video Coding) standard. In order to maintain manageable complexity, an image is often divided into blocks, such as macroblocks (MBs) or largest coding units/coding units (LCUs/CUs), to which video coding is applied. Video coding standards usually adopt adaptive Inter/Intra prediction on a block basis.
FIG. 1 illustrates an exemplary system block diagram for video decoder 100 supporting the HEVC video standard. High Efficiency Video Coding (HEVC) is a new international video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC). HEVC is based on the hybrid block-based motion-compensated DCT-like transform coding architecture. The basic unit for compression, termed coding unit (CU), is a 2N×2N square block. A CU may begin with a largest CU (LCU), which is also referred to as a coding tree unit (CTU) in HEVC, and each CU can be recursively split into four smaller CUs until the predefined minimum size is reached. Once the splitting of the CU hierarchical tree is done, each CU is further split into one or more prediction units (PUs) according to prediction type and PU partition. Each CU or the residual of each CU is divided into a tree of transform units (TUs) to apply two-dimensional (2D) transforms.
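For illustration only, the recursive CU splitting described above may be sketched as follows. The function name split_cu, the callback should_split and the toy decision rule are hypothetical; in an actual decoder the split decision would be obtained from syntax parsed from the bitstream rather than from a callback.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square CU, in samples


def split_cu(x: int, y: int, size: int, min_size: int,
             should_split: Callable[[int, int, int], bool]) -> List[Block]:
    """Return the leaf CUs obtained by recursively quartering a square block."""
    # Stop at the predefined minimum size, or when the (hypothetical)
    # split decision says this CU is not divided further.
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves: List[Block] = []
    # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(split_cu(x + dx, y + dy, half, min_size, should_split))
    return leaves


def toy_decision(x: int, y: int, size: int) -> bool:
    """Assumed rule: split the 64x64 LCU, then only its top-left 32x32 CU."""
    return size == 64 or (size == 32 and x == 0 and y == 0)


leaves = split_cu(0, 0, 64, 8, toy_decision)
for leaf in leaves:
    print(leaf)
```

With the toy rule, the 64×64 LCU yields four 16×16 CUs in its top-left quadrant and three 32×32 CUs elsewhere, and the leaf CUs tile the LCU without overlap.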
In FIG. 1, the input video bitstream is first processed by variable length decoder (VLD) 110 to perform variable-length decoding and syntax parsing. The parsed syntax may correspond to the Inter/Intra residue signal (the upper output path from VLD 110) or motion information (the lower output path from VLD 110). The residue signal is usually transform coded. Accordingly, the coded residue signal is processed by inverse scan (IS)/inverse quantization (IQ) block 112 and inverse transform (IT) block 114. The output from inverse transform (IT) block 114 corresponds to the reconstructed residue signal. The reconstructed residue signal is supplied to reconstruction block 116, where it is added to the Intra prediction from Intra prediction block 118 for an Intra-coded block or to the Inter prediction from motion compensation block 120 for an Inter-coded block. Inter/Intra selection block 122 selects the Intra prediction or the Inter prediction for reconstructing the video signal depending on whether the block is Inter or Intra coded. For motion compensation, the process accesses one or more reference blocks stored in decoded picture buffer or reference picture buffer 124 and uses motion vector information determined by motion vector (MV) generation block 126. In order to improve visual quality, deblocking filter 128 and Sample Adaptive Offset (SAO) filter 130 are used to process the reconstructed video before it is stored in decoded picture buffer 124. For the H.264/AVC standard, only the deblocking filter (DF) is used, without the sample adaptive offset (SAO) filter.
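As a minimal sketch of the reconstruction step described above, the selection between Intra and Inter prediction and the addition of the reconstructed residue may be written as follows. The function name reconstruct_block, the list-of-lists block representation and the bit_depth parameter are assumptions made for illustration; clipping the sum to the valid sample range is also assumed here, since the description above does not detail it.

```python
from typing import List

Block = List[List[int]]  # rows of integer sample values


def reconstruct_block(residual: Block, intra_pred: Block, inter_pred: Block,
                      is_intra: bool, bit_depth: int = 8) -> Block:
    """Add the reconstructed residue to the selected predictor."""
    # Inter/Intra selection (block 122 in FIG. 1): use the predictor that
    # matches the coding mode of the block.
    pred = intra_pred if is_intra else inter_pred
    max_val = (1 << bit_depth) - 1
    # Reconstruction (block 116 in FIG. 1): predictor plus residue,
    # clipped to [0, max_val] (clipping is an assumption of this sketch).
    return [[min(max(p + r, 0), max_val) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(pred, residual)]


residual = [[10, -20], [5, 0]]
intra_pred = [[250, 10], [128, 0]]
inter_pred = [[100, 100], [100, 100]]
print(reconstruct_block(residual, intra_pred, inter_pred, is_intra=True))
print(reconstruct_block(residual, intra_pred, inter_pred, is_intra=False))
```

The deblocking and SAO filtering stages that follow reconstruction are not modeled in this sketch.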
In addition to the H.264/AVC and HEVC video coding standards, other formats are also in use, such as Windows Media Video (WMV) developed by Microsoft™ and VP8/VP9 developed by Google™. Furthermore, AVS is a video coding standard developed by the Audio and Video Coding Standard Workgroup of China, and the format is widely used in China. The video coding tool set used for AVS is similar to that for H.264/AVC. Although the complexity of AVS is greatly reduced compared to the H.264/AVC standard, the coding performance of AVS is comparable to that of H.264/AVC.
Because compressed video coexists in various video coding formats, a video decoder may have to decode multiple video formats in order to allow a user to watch video contents coded in different video coding formats. Furthermore, there may be a need to simultaneously decode two compressed video bitstreams coded in different video coding formats. For example, a user may be watching two video sequences displayed on a TV screen in a main/sub-picture (i.e., picture-in-picture) or split screen arrangement, where one sequence is coded in one video coding format while the other sequence is coded in a different format.
FIG. 2 illustrates a typical TV system with a built-in audio/video decoder. As shown in FIG. 2, the system uses a CPU bus and a DRAM (dynamic random access memory) bus, where the CPU bus is used for CPU commands and communication in order to control other modules. The external memory storage (210) is used to store reference pictures for video decoding, decoded pictures for display and other data. The external memory is often implemented using DRAM, and the external memory access engine (220) is used to connect the external memory storage to the data bus. The system may include a CPU (230), a video decoder (240), an audio engine (250) and a display engine (260). The video decoder performs the task of video decoding for compressed video data. The audio engine performs the task of audio decoding for compressed audio data. The audio engine may also support other audio tasks such as generating audio prompts for the user interface. The display engine is responsible for processing video display and generating on-screen display information. For example, the display engine may generate graphic or text information for the user interface. The display engine is also responsible for scaling and combining two decoded video sequences for main window and sub-window display, or split screen display. The CPU may be used to initialize the system, control other sub-systems, or provide the user interface for the TV system.
In order to support simultaneous multi-standard video decoding and display, the video decoding system may be configured to decode a portion of one coded video bitstream and then switch to decode a portion of another coded video bitstream. For example, if the video decoder system needs to simultaneously decode a first video bitstream coded in the HEVC format and a second video bitstream coded in the AVS format, the decoder system may decode one HEVC slice and then switch to decode an AVS slice. The decoded HEVC slices and AVS slices can be temporarily stored in an output picture buffer. The display engine may access the pictures for picture-in-picture display or split screen display.
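The slice-level switching described above may be illustrated by a simple round-robin ordering, given here only as a sketch. The function name interleave_slices and the string labels standing in for coded slices are hypothetical; an actual decoder would also need to save and restore per-format decoding state at each switch, which is not modeled.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def interleave_slices(streams: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return the decode order when one slice is taken from each stream in turn."""
    queues: Dict[str, Deque[str]] = {
        fmt: deque(slices) for fmt, slices in streams.items()
    }
    order: List[Tuple[str, str]] = []
    # Keep cycling over the formats until every stream has been consumed.
    while any(queues.values()):
        for fmt, queue in queues.items():
            if queue:
                # "Switch" to this format and decode its next pending slice.
                order.append((fmt, queue.popleft()))
    return order


streams = {"HEVC": ["h0", "h1", "h2"], "AVS": ["a0", "a1"]}
for fmt, slice_id in interleave_slices(streams):
    print(fmt, slice_id)
```

When one stream runs out of slices, the remaining stream continues to be decoded without interruption.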
In various newer video standards, context-based entropy coding has been widely used. For example, Context-based Adaptive Binary Arithmetic Coding (CABAC) has been used for H.264/AVC, HEVC and AVS. The CABAC encoding process consists of three steps: binarization, context modeling, and binary arithmetic coding (BAC). During the binarization stage, the syntax elements (SEs) generated by the coding system, such as quantized transform coefficients or motion information, are binarized into bin strings (i.e., binary strings). Each bit position in the bin string is called a “bin”. Each bin is then processed according to either the regular coding mode or the bypass mode. During the context modeling stage, the statistics of the coded syntax elements are utilized to update the probability models (i.e., context models) of regular bins. For bins in the bypass mode, context modeling is skipped and the bin is passed directly to a bypass coding engine. In binary arithmetic coding, the value of the bin is used to update the context variable if applicable, and bits are output into the bitstream.
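The first two CABAC steps described above may be sketched as follows, for illustration only. Unary binarization is used as one simple binarization scheme. The class ContextModel is a toy exponential probability estimator and is not the table-driven state machine defined by any of the standards; a single context is shared by all bins, and the binary arithmetic coding step itself is omitted. All names and the adaptation rate of 1/16 are assumptions of this sketch.

```python
from typing import List, Tuple


def unary_binarize(value: int) -> List[int]:
    """Unary binarization: 'value' bins equal to 1 followed by one 0 bin."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return [1] * value + [0]


class ContextModel:
    """Toy adaptive estimate of the probability that a bin equals 1."""

    def __init__(self, p_one: float = 0.5, rate: float = 1.0 / 16.0) -> None:
        self.p_one = p_one
        self.rate = rate

    def update(self, bin_value: int) -> None:
        # Move the estimate a fraction of the way toward the observed bin.
        self.p_one += self.rate * (bin_value - self.p_one)


def model_bins(bins: List[int], ctx: ContextModel,
               bypass: bool = False) -> List[Tuple[int, float]]:
    """Pair each bin with the probability of 1 handed to the coding engine."""
    modeled: List[Tuple[int, float]] = []
    for b in bins:
        if bypass:
            # Bypass mode: no context is consulted or updated; the bin is
            # treated as equiprobable.
            modeled.append((b, 0.5))
        else:
            # Regular mode: use the current estimate, then adapt it.
            modeled.append((b, ctx.p_one))
            ctx.update(b)
    return modeled


ctx = ContextModel()
for bin_value, p_one in model_bins(unary_binarize(3), ctx):
    print(bin_value, p_one)
```

In this sketch the estimate of a 1 bin rises while the leading 1 bins of the unary string are processed and falls again at the terminating 0 bin, which shows how regular bins adapt the context while bypass bins leave it untouched.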
While context-based entropy coding is used for H.264/AVC, HEVC, VP8, VP9, AVS and the emerging AVS2, each video coding standard has its own variation of context-based entropy coding. In order to support multiple video coding standards, a straightforward approach would require a separate bin decoder for each standard, which may noticeably increase the system cost. Therefore, it is desirable to develop area-efficient (i.e., smaller silicon area) or high-performance bin decoders for a multi-standard video decoder.