As display resolutions and screen sizes increase and the frame rates of encoded bitstreams rise, video decoders are increasingly required to provide both more processing cycles and faster processing in order to meet real-time decoding requirements for incoming linearly encoded bitstreams. One approach to meeting these real-time decoding requirements is to increase processing speed by using faster processing units. This approach is limited by current processor designs, which may not be fast enough to decode incoming bitstreams in real time and which may additionally dissipate large amounts of power. As the rates of linearly encoded bitstreams approach 240 Mbps, relying solely on increasing processing speed may not be practical or sustainable. For example, linearly scaling decoder processing cycles may not meet the real-time decoding requirements for larger resolutions such as a 4K×2K 120P decode, which may have a luma sample rate of more than 1 billion samples per second and may require over 2 GHz of processing capacity.
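The sample-rate figure above can be checked with a short calculation. The following is a minimal sketch, assuming that "4K×2K 120P" means 4096×2160 luma samples per frame at 120 progressive frames per second, and that the 2 GHz figure can be treated as a single clock rate. Neither assumption is stated in the text, and the function name is hypothetical.

```python
def luma_sample_rate(width, height, frames_per_second):
    """Luma samples per second for a progressive stream (one luma sample per pixel)."""
    return width * height * frames_per_second


# Assumption: "4K x 2K 120P" is read as 4096x2160 at 120 frames per second.
rate = luma_sample_rate(4096, 2160, 120)

# Assumption: the 2 GHz figure from the text is treated as a single clock rate.
cycles_per_sample = 2_000_000_000 / rate

print(f"{rate:,} luma samples/s")
print(f"{cycles_per_sample:.2f} cycles per luma sample at 2 GHz")
```

Under these assumptions the rate is roughly 1.06 billion luma samples per second, which leaves fewer than two clock cycles per luma sample at 2 GHz. With a 3840×2160 frame the same calculation gives 995,328,000 samples per second, just under 1 billion.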
Bitstreams encoded according to standards such as high efficiency video coding (HEVC) and H.264, and compressed using context-adaptive binary arithmetic coding (CABAC), may be encoded in such a way that a macroblock (MB) or a coding tree unit (CTU) depends on a respective previous neighboring MB or CTU. The feedback loop employed in HEVC and H.264 entropy decoding decisions may make decoding tasks indivisible and not parallelizable using conventional decoding techniques.
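The serial dependency can be illustrated with a toy adaptive model. The sketch below is not CABAC and is not taken from the HEVC or H.264 specifications. It is a hypothetical, simplified coder in which each coded flag only says whether a bin differs from the currently most probable value, and the probability state is updated after every bin. All names and the update rule are assumptions chosen for illustration.

```python
INITIAL_STATE = 128  # hypothetical estimate of P(bin == 1), scaled to 0..255


def update(state, bin_value):
    """Move the probability estimate a quarter of the way toward the observed bin."""
    target = 255 if bin_value else 0
    return state + (target - state) // 4


def most_probable(state):
    """Value the model currently considers more likely."""
    return 1 if state >= 128 else 0


def encode(bins):
    """Emit one flag per bin: 0 if the bin equals the most probable value, else 1."""
    state, flags = INITIAL_STATE, []
    for b in bins:
        flags.append(b ^ most_probable(state))
        state = update(state, b)
    return flags


def decode(flags, state=INITIAL_STATE):
    """Recover bins; each step needs the state left behind by the previous step."""
    bins = []
    for f in flags:
        b = f ^ most_probable(state)  # interpretation depends on the current state
        bins.append(b)
        state = update(state, b)      # feedback: the decoded bin changes the state
    return bins


bins = [0, 0, 0, 0, 1, 0, 1, 1]
flags = encode(bins)

# Decoding from the start reproduces the input.
assert decode(flags) == bins

# Decoding the second half on its own, without the state produced by the
# first half, yields different bins.
assert decode(flags[4:]) != bins[4:]
```

In this toy model the second half of the stream cannot be decoded correctly by an independent worker, because the meaning of each flag depends on state accumulated from every earlier bin. This is the kind of feedback loop that may prevent a straightforward split of entropy decoding across parallel units.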