Motion estimation is an effective inter-frame coding technique for exploiting temporal redundancy in video sequences. Motion-compensated inter-frame coding has been widely used in various international video coding standards. The motion estimation adopted in these standards is often block-based, where motion information such as coding mode and motion vector is determined for each macroblock (MB) or similar block configuration. In addition, intra-coding is also adaptively applied, where the picture is processed without reference to any other picture. The inter-predicted or intra-predicted residues are usually further processed by transformation, quantization, and entropy coding to generate a compressed video bitstream. During the encoding process, coding artifacts are introduced, particularly in the quantization process. In order to alleviate the coding artifacts, additional processing has been applied to reconstructed video to enhance picture quality in newer coding systems. The additional processing is often configured as an in-loop operation so that the encoder and decoder derive the same reference pictures, which improves system performance. Such a system structure has been widely used in various modern video coding systems such as H.264/AVC and HEVC (High Efficiency Video Coding).
FIG. 1 illustrates an exemplary system block diagram for a video encoder using adaptive Inter/Intra prediction. In the system, a picture is divided into multiple coding units. For Inter-prediction, Motion Estimation (ME)/Motion Compensation (MC) 112 is used to provide prediction data based on video data from another picture or pictures. Switch 114 selects Intra prediction data from Intra Prediction 110 or Inter-prediction data from ME/MC 112. The selected prediction data (136) is supplied to Adder 116 to be subtracted from the input video data in order to form prediction errors, also called residues. The prediction errors are then processed by Transformation (T) 118 followed by Quantization (Q) 120. The transformed and quantized residues are then coded by Entropy Encoder 122 to form a video bitstream corresponding to the compressed video data. The bitstream associated with the transform coefficients is then packed with side information such as motion, mode, and other information associated with the image area. The side information may also be subject to entropy coding to reduce the required bandwidth. Accordingly, the data associated with the side information are provided to Entropy Encoder 122 as shown in FIG. 1. When an Inter-prediction mode is used, a reference picture or pictures have to be reconstructed at the encoder end as well. Consequently, the transformed and quantized residues are processed by Inverse Quantization (IQ) 124 and Inverse Transformation (IT) 126 to recover the residues. The residues are then added back to prediction data 136 at Reconstruction (REC) 128 to reconstruct the video data. The reconstructed video data may be stored in Reference Picture Buffer 134 and used for prediction of other frames.
As shown in FIG. 1, incoming video data undergoes a series of processing steps in the encoding system. The reconstructed video data from REC 128 may be subject to various impairments due to these processing steps. Accordingly, in-loop processing 130 is applied to the reconstructed video data before the reconstructed video data are stored in the Reference Picture Buffer 134 in order to improve video quality. The in-loop filter information may have to be incorporated in the bitstream so that a decoder can properly recover the required information. Therefore, in-loop filter information from Sample Adaptive Offset (SAO) is provided to Entropy Encoder 122 for incorporation into the bitstream. For the High Efficiency Video Coding (HEVC) standard, the in-loop filter process 130 may correspond to Deblocking Filter (DF) and SAO. For the H.264/AVC video standard, the in-loop filter process 130 may correspond to Deblocking Filter (DF).
FIG. 2 illustrates a system block diagram of an exemplary video decoder corresponding to the video encoder in FIG. 1. Since the encoder also contains a local decoder for reconstructing the video data, some decoder components are also used in the encoder. For a decoder, entropy decoder 222 is used to parse and recover the coded syntax elements related to residues, motion information and other control data. The switch 214 selects intra-prediction or inter-prediction and the selected prediction data are supplied to reconstruction (REC) 228 to be combined with recovered residues. Besides performing entropy decoding on compressed video data, entropy decoder 222 is also responsible for entropy decoding of side information and provides the side information to the respective blocks. For example, intra mode information is provided to intra-prediction 210, inter mode information is provided to motion compensation 212, in-loop filter information may be provided to in-loop filter 230 and residues are provided to inverse quantization 224. The residues are processed by IQ 224, IT 226 and the subsequent reconstruction process to reconstruct the video data. Again, reconstructed video data from REC 228 undergo in-loop filtering 230 as shown in FIG. 2.
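The relationship between the encoder's local reconstruction path (FIG. 1) and the decoder's reconstruction path (FIG. 2) can be illustrated with a minimal sketch. This is a simplified illustration under stated assumptions, not the method of any particular standard: the transform, entropy coding, and in-loop filtering stages are omitted, a block is treated as a flat list of integer samples, and the function names and the uniform quantization step `step` are hypothetical.

```python
def quantize(residues, step):
    # Uniform scalar quantizer (assumed for illustration). This is the lossy
    # stage where coding artifacts are introduced.
    return [round(r / step) for r in residues]


def dequantize(levels, step):
    # Inverse quantization: scale each level back by the quantization step.
    return [level * step for level in levels]


def encode_block(block, prediction, step):
    # Form residues by subtracting the prediction from the input (Adder 116).
    residues = [x - p for x, p in zip(block, prediction)]
    # Quantize the residues (Q 120); the transform stage is omitted here.
    levels = quantize(residues, step)
    # Local decoder inside the encoder: IQ 124 followed by REC 128.
    recovered = dequantize(levels, step)
    reconstructed = [p + r for p, r in zip(prediction, recovered)]
    return levels, reconstructed


def decode_block(levels, prediction, step):
    # Decoder side: IQ 224 followed by REC 228, using the same prediction.
    recovered = dequantize(levels, step)
    return [p + r for p, r in zip(prediction, recovered)]


block = [100, 104, 98, 120]
prediction = [101, 101, 101, 101]
levels, encoder_recon = encode_block(block, prediction, step=4)
decoder_recon = decode_block(levels, prediction, step=4)
```

In this sketch the reconstructed block differs from the input block because of quantization, but the encoder and decoder arrive at the same reconstructed samples, which is why both sides can use the reconstruction as a common reference.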
Due to the Inter/Intra prediction process and various other processing (e.g. in-loop filtering) used in video coding, data dependency exists among coded data. If an error occurs in the compressed video data, its effect may propagate, for example from block to block, slice to slice, or picture to picture. To alleviate this issue, a video coding system often partitions video data into smaller video units and reduces data dependency among the video units. In more advanced coding standards such as H.264/AVC and HEVC, a picture is divided into slices or groups of coding units. The compressed video data from each slice or each group of coding units are packetized into a well-defined data structure.
While the well-defined data structure may also be referred to as a packet, the term “packet” here is different from the term “packet” widely used in switched networks or streaming networks. There are various potential benefits associated with packetized transmission over networks. For example, packet transmission allows easy integration of video, voice and data in the network environment. Nevertheless, the term “packet” in this disclosure refers to the data structure for slices or groups of coding units as specified in H.264/AVC and HEVC.
In practice, a maximum slice size or packet size may be imposed to avoid the need for a very large buffer. With this constraint imposed, a conventional encoder system may try to fit a group of coding units or macroblocks (MBs) into a size-limited slice/packet by processing the coding units or MBs one by one. For example, when the n-th coding unit or MB of a group of coding units or MBs is being encoded, the accumulated compressed data up to the n-th coding unit or MB will be checked. If the accumulated compressed data exceeds the maximum slice/packet size, the encoder state is set back to the end of the (n−1)-th coding unit or MB. The packetization for the current slice/packet is terminated (i.e., without including the compressed data from the n-th coding unit or MB). The encoding process then starts a new slice/packet and encodes the n-th coding unit or MB as the first coding unit or MB of the new slice/packet. The encoding process then checks the accumulated compressed data again, and continues in this manner until all coding units or MBs in a picture are done.
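The conventional one-by-one procedure described above can be sketched as follows. This is a minimal illustration under simplifying assumptions: each coding unit or MB is represented only by its compressed size, that size is assumed not to change when the unit is re-encoded as the first unit of a new slice/packet (in a real encoder it may change), and the function name `packetize` and its parameters are hypothetical.

```python
def packetize(unit_sizes, max_packet_size):
    """Group coding units into size-limited packets, one unit at a time.

    unit_sizes: compressed size of each coding unit or MB, in coding order.
    Returns a list of packets, each a list of unit indices.
    """
    packets = []
    current = []      # indices of units in the packet being built
    accumulated = 0   # accumulated compressed data in the current packet

    for n, size in enumerate(unit_sizes):
        if size > max_packet_size:
            # A single unit that cannot fit in any packet is not handled
            # by this sketch.
            raise ValueError(f"unit {n} exceeds the maximum packet size")

        if accumulated + size > max_packet_size:
            # Including unit n would exceed the limit: go back to the end
            # of unit n-1, terminate the current packet without unit n,
            # and start a new packet.
            packets.append(current)
            current = []
            accumulated = 0

        # Unit n is coded as the next (possibly first) unit of the packet.
        current.append(n)
        accumulated += size

    if current:
        packets.append(current)
    return packets


example_packets = packetize([40, 30, 50, 20, 60, 10], max_packet_size=100)
```

In the example, units 0 and 1 fill the first packet because adding unit 2 would bring the accumulated size to 120, so unit 2 starts the second packet. The sketch only tracks sizes; in an actual encoder the step of going back to the end of the previous unit also involves restoring the encoder state before re-encoding unit n.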
While the above method is intuitive and robust, it is hard to implement in hardware. Accordingly, it is desirable to develop a method and/or system that is more suited for hardware implementation.