Signals, such as audio or video signals, may be digitally encoded for transmission to a receiving device. Video signals may contain data that is divided into frames over time. Due to high bandwidth requirements, baseband video signals are typically compressed by video encoders prior to transmission or storage. Video encoders may employ a coding methodology to encode macroblocks within a frame using one or more coding modes. In many video encoding standards, such as MPEG-1, MPEG-2, MPEG-4, H.261, H.262, H.263, H.264, HEVC, etc., a macroblock denotes a square region of pixels, which may be, for example, 16×16 pixels in size. Most of the coding processes (e.g., motion compensation, mode decision, quantization decision, etc.) occur at this level.
In high macroblock rate decoding, the video decode time (e.g., the time required to process the video on a macroblock basis) may exceed the capability of current hardware. Thus, macroblock decoding is typically distributed over multiple processors that decode in parallel. A high pixel (or macroblock) rate video decode is required in high resolution or high frame rate scenarios, such as digital cinema or faster-than-real-time decode. The performance of a decoder, however, is ultimately limited by the sequential nature of the video codec standard.
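To give a sense of scale, the macroblock rate can be estimated from the frame dimensions and the frame rate. The following is a minimal sketch that assumes the 16×16 macroblock size mentioned above; the function names and the example resolutions are illustrative and are not taken from any standard.

```python
MB_SIZE = 16  # assumed macroblock edge in pixels (the 16x16 case)


def macroblocks_per_frame(width, height, mb_size=MB_SIZE):
    """Macroblocks needed to cover a frame, rounding partial blocks up."""
    mbs_wide = (width + mb_size - 1) // mb_size
    mbs_high = (height + mb_size - 1) // mb_size
    return mbs_wide * mbs_high


def macroblock_rate(width, height, fps, mb_size=MB_SIZE):
    """Macroblocks that must be decoded per second at the given frame rate."""
    return macroblocks_per_frame(width, height, mb_size) * fps


if __name__ == "__main__":
    # Illustrative inputs: 1080p at 30 frames/s versus 2160p at 60 frames/s.
    print(macroblock_rate(1920, 1080, 30))
    print(macroblock_rate(3840, 2160, 60))
```

Under these assumptions, a 1920×1080 frame contains 8,160 macroblocks and a 3840×2160 frame contains 32,400, so moving from 1080p at 30 frames per second to 2160p at 60 frames per second raises the required macroblock rate roughly eightfold.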
H.264 is a high complexity codec standard, which has both temporal and spatial dependencies. Conventional methods to accelerate the decode macroblock rate of H.264 are typically based on an assumption of multiple slices per frame to allow slice parallel decode, on frame parallel decode based on assumptions about the GOP structure and/or vertical motion vector component limits, or even on GOP (scene) parallel methods. However, these assumptions do not always hold, such as in faster-than-real-time decode or wide aspect ratio UHD video. Problems may also arise in slice parallel and picture parallel decode techniques. For example, in performing slice parallel decode on a CABAC bitstream, multiple slices per frame can cause video quality degradation. Parallel picture decode is also risky in that it requires the decoder to obtain information from every previous frame, which might not always be feasible. Therefore, it would be beneficial to provide systems and methods for performing information extraction and insertion on H.264 bitstreams, which would allow for wavefront parallel decode of the bitstreams.
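The ordering constraint behind wavefront parallel decode can be illustrated with a short sketch. It assumes that each macroblock depends on its left, upper-left, upper, and upper-right neighbors, that entropy decoding has already been performed separately, and that every macroblock takes one time step; the function name and the grid size are hypothetical and chosen only for illustration.

```python
def wavefront_steps(rows, cols):
    """Earliest time step at which each macroblock can be decoded.

    Assumption: macroblock (r, c) must wait for its left, upper-left,
    upper, and upper-right neighbors, and each macroblock takes one step.
    """
    steps = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = [(r, c - 1), (r - 1, c - 1), (r - 1, c), (r - 1, c + 1)]
            # A macroblock can start one step after its latest in-frame neighbor.
            ready = [steps[nr][nc] + 1
                     for nr, nc in neighbors
                     if 0 <= nr < rows and 0 <= nc < cols]
            steps[r][c] = max(ready, default=0)
    return steps


if __name__ == "__main__":
    for row in wavefront_steps(3, 5):
        print(row)
```

For a grid of 3 rows by 5 columns, each row starts two steps behind the row above it, so the 15 macroblocks complete in 9 steps rather than 15, with up to three macroblocks decoded concurrently.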
Turning to the HEVC standard, while it has a configuration setting for supporting wavefront parallel processing (Wpp=1), the setting requires that the quantization parameter and CABAC be reset at the start of each macroblock row. Thus, turning on this setting to support wavefront parallel decode may result in lost video quality, and may be undesirable from a system perspective (e.g., for systems that split entropy decode and macroblock decode). Therefore, it would be beneficial to provide systems and methods for performing information extraction and insertion on HEVC bitstreams, which would allow for and/or improve wavefront parallel decode of the bitstreams.