It is desirable for a broadcast video application to provide support for diverse user devices, without incurring the bitrate penalty of simulcast encoding. Video decoding is a complex operation, and the complexity is dependent on the resolution of the coded video. Low power portable devices typically have very strict complexity restrictions and low resolution displays. Simulcast broadcast of two or more video bitstreams corresponding to different resolutions can be used to address the complexity requirements of the lower resolution devices, but requires a higher total bitrate than a complexity scalable system. Accordingly, there is a need for a solution that allows for complexity scalable CODECs while maintaining high video coding bitrate efficiency.
Many different methods of scalability have been widely studied and standardized, including SNR scalability, spatial scalability, temporal scalability, and fine grain scalability, in scalability profiles of the MPEG-2 and MPEG-4 standards. Most of the work in scalable coding has been aimed at bitrate scalability, where the low resolution layer has a limited bandwidth. As shown in FIG. 1, a typical spatial scalability system is indicated generally by the reference numeral 100. The system 100 includes a complexity scalable video encoder 110 for receiving a video sequence. A first output of the complexity scalable video encoder 110 is connected in signal communication with a low-bandwidth network 120 and with a first input of a multiplexer 130. A second output of the complexity scalable video encoder 110 is connected in signal communication with a second input of the multiplexer 130. An output of the low-bandwidth network 120 is connected in signal communication with an input of a low resolution decoder 140. An output of the multiplexer 130 is connected in signal communication with an input of a high-bandwidth network 150. An output of the high-bandwidth network 150 is connected in signal communication with an input of a demultiplexer 160. A first output of the demultiplexer 160 is connected in signal communication with a first input of a high resolution decoder 170, and a second output of the demultiplexer 160 is connected in signal communication with a second input of the high resolution decoder 170. Outputs of the low resolution decoder 140 and the high resolution decoder 170 are available externally from the system 100.
Scalable coding has not been widely adopted in practice, because of the considerable increase in encoder and decoder complexity, and because the coding efficiency of scalable encoders is typically well below that of non-scalable encoders.
Spatially scalable encoders and decoders typically require that the high resolution scalable encoder/decoder provide functionality beyond what would be present in a normal high resolution encoder/decoder. In an MPEG-2 spatial scalable encoder, a decision is made whether prediction is performed from a low resolution reference picture or from a high resolution reference picture. An MPEG-2 spatial scalable decoder must be capable of predicting either from the low resolution reference picture or the high resolution reference picture. Two sets of reference picture stores are required by an MPEG-2 spatial scalable encoder/decoder, one for low resolution pictures and another for high resolution pictures. FIG. 2 shows a block diagram for a low-complexity spatial scalable encoder 200 supporting two layers, according to the prior art. FIG. 3 shows a block diagram for a low-complexity spatial scalable decoder 300 supporting two layers, according to the prior art.
Turning to FIG. 2, a spatial scalable video encoder supporting two layers is indicated generally by the reference numeral 200. The video encoder 200 includes a downsampler 210 for receiving a high-resolution input video sequence. The downsampler 210 is coupled in signal communication with a low-resolution non-scalable encoder 212, which, in turn, is coupled in signal communication with low-resolution frame stores 214. The low-resolution non-scalable encoder 212 outputs a low-resolution bitstream, and is further coupled in signal communication with a low-resolution non-scalable decoder 220.
The low-resolution non-scalable decoder 220 is coupled in signal communication with an upsampler 230, which, in turn, is coupled in signal communication with a scalable high-resolution encoder 240. The scalable high-resolution encoder 240 also receives the high-resolution input video sequence, is coupled in signal communication with high-resolution frame stores 250, and outputs a high-resolution scalable bitstream.
Thus, a high resolution input video sequence is received by the low-complexity encoder 200 and down-sampled to create a low-resolution video sequence. The low-resolution video sequence is encoded using a non-scalable low-resolution video compression encoder, creating a low-resolution bitstream. The low-resolution bitstream is decoded using a non-scalable low-resolution video compression decoder. This function may be performed inside of the encoder. The decoded low-resolution sequence is up-sampled, and provided as one of two inputs to a scalable high-resolution encoder. The scalable high-resolution encoder encodes the video to create a high-resolution scalable bitstream.
Turning to FIG. 3, a spatial scalable video decoder supporting two layers is indicated generally by the reference numeral 300. The video decoder 300 includes a low-resolution decoder 360 for receiving a low-resolution bitstream, which is coupled in signal communication with low-resolution frame stores 362, and outputs a low-resolution video sequence. The low-resolution decoder 360 is further coupled in signal communication with an upsampler 370, which, in turn, is coupled in signal communication with a scalable high-resolution decoder 380.
The scalable high-resolution decoder 380 is further coupled in signal communication with high-resolution frame stores 390. The scalable high-resolution decoder 380 receives a high-resolution scalable bitstream and outputs a high-resolution video sequence.
Thus, both a high-resolution scalable bitstream and low-resolution bitstream are received by the low-complexity decoder 300. The low-resolution bitstream is decoded using a non-scalable low-resolution video compression decoder, which utilizes low-resolution frame stores. The decoded low-resolution video is up-sampled, and then input into a high-resolution scalable decoder. The high-resolution scalable decoder utilizes a set of high-resolution frame stores, and creates the high-resolution output video sequence.
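The two-layer data flow of FIGS. 2 and 3 can be illustrated with a minimal sketch. It is not the MPEG-2 algorithm: it assumes a one-dimensional "frame" of integer samples, uses a uniform quantizer as a stand-in for each video codec, uses pair averaging and sample repetition as the down-sampling and up-sampling filters, and always predicts the high-resolution layer from the up-sampled low-resolution reconstruction. All function names and step sizes are hypothetical.

```python
# Toy sketch of two-layer spatial scalability (assumptions: 1-D frames,
# a uniform quantizer standing in for each codec, no motion compensation).

def downsample(x):
    """Halve resolution by averaging adjacent sample pairs (assumed filter)."""
    return [(x[i] + x[i + 1]) // 2 for i in range(0, len(x) - 1, 2)]

def upsample(x):
    """Double resolution by repeating each sample (assumed filter)."""
    return [v for v in x for _ in (0, 1)]

def quantize(x, step):
    """Uniform quantizer with rounding; stands in for 'encode'."""
    return [(v + step // 2) // step for v in x]

def dequantize(levels, step):
    """Inverse of quantize; stands in for 'decode'."""
    return [q * step for q in levels]

def encode_two_layers(frame, base_step=8, enh_step=2):
    low = downsample(frame)                          # downsampler 210
    base_levels = quantize(low, base_step)           # low-resolution encoder 212
    base_recon = dequantize(base_levels, base_step)  # low-resolution decoder 220
    prediction = upsample(base_recon)                # upsampler 230
    residual = [s - p for s, p in zip(frame, prediction)]
    enh_levels = quantize(residual, enh_step)        # scalable encoder 240
    return base_levels, enh_levels

def decode_two_layers(base_levels, enh_levels, base_step=8, enh_step=2):
    base_recon = dequantize(base_levels, base_step)  # low-resolution decoder 360
    prediction = upsample(base_recon)                # upsampler 370
    enh_recon = dequantize(enh_levels, enh_step)
    high = [p + r for p, r in zip(prediction, enh_recon)]  # scalable decoder 380
    return base_recon, high

frame = [10, 12, 50, 54, 100, 98, 30, 34]
base_levels, enh_levels = encode_two_layers(frame)
low, high = decode_two_layers(base_levels, enh_levels)
print(low)   # low-resolution output, half as many samples as the input
print(high)  # high-resolution output, same length as the input
```

A low-resolution device in this sketch would need only `base_levels` and the first line of `decode_two_layers`, whereas the high-resolution path must run both decoding stages, which reflects the added decoder complexity noted above.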
Turning to FIG. 4, a non-scalable video encoder is indicated generally by the reference numeral 400. An input to the video encoder 400 is connected in signal communication with a non-inverting input of a summing junction (adder or other means for signal combination/comparison) 410. The output of the summing junction 410 is connected in signal communication with a transformer/quantizer 420. The output of the transformer/quantizer 420 is connected in signal communication with an entropy coder 440, where the output of the entropy coder 440 is an externally available output of the encoder 400.
The output of the transformer/quantizer 420 is further connected in signal communication with an inverse transformer/quantizer 450. An output of the inverse transformer/quantizer 450 is connected in signal communication with a first non-inverting input of a summing junction (adder or other means for signal combination) 488. An output of the summing junction 488 is connected in signal communication with an input of a deblock filter 460. An output of the deblock filter 460 is connected in signal communication with reference picture stores 470. A first output of the reference picture stores 470 is connected in signal communication with a first input of a motion estimator 480. The input to the encoder 400 is further connected in signal communication with a second input of the motion estimator 480. The output of the motion estimator 480 is connected in signal communication with a first input of a motion compensator 490. A second output of the reference picture stores 470 is connected in signal communication with a second input of the motion compensator 490. The output of the motion compensator 490 is connected in signal communication with an inverting input of the summing junction 410 and with a second non-inverting input of the summing junction 488.
Turning to FIG. 5, a non-scalable video decoder is indicated generally by the reference numeral 500. The video decoder 500 includes an entropy decoder 510 for receiving a video sequence. A first output of the entropy decoder 510 is connected in signal communication with an input of an inverse quantizer/transformer 520. An output of the inverse quantizer/transformer 520 is connected in signal communication with a first input of a summing junction (adder or other means for signal combination/comparison) 540.
The output of the summing junction 540 is connected in signal communication with a deblocking filter 590. An output of the deblocking filter 590 is connected in signal communication with reference picture stores 550. The reference picture stores 550 is connected in signal communication with a first input of a motion compensator 560. An output of the motion compensator 560 is connected in signal communication with a second input of the summing junction 540. A second output of the entropy decoder 510 is connected in signal communication with a second input of the motion compensator 560. The output of the deblocking filter 590 provides the output of the video decoder 500.
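The prediction loops of FIGS. 4 and 5 can be sketched in a few lines. The following is an illustrative simplification rather than any standardized codec: frames are one-dimensional lists of integers, the transform is omitted so that only quantization is applied to the residual, motion is a single integer shift per frame found by exhaustive search, and entropy coding and deblocking are left out. The function names, the search range, and the step size are hypothetical.

```python
# Toy sketch of a non-scalable hybrid coding loop (assumptions: 1-D frames,
# one integer shift per frame as "motion", no transform, no deblocking).

def quantize(x, step):
    return [(v + step // 2) // step for v in x]

def dequantize(levels, step):
    return [q * step for q in levels]

def motion_compensate(ref, shift):
    """Shift the reference picture, clamping at the picture edges."""
    n = len(ref)
    return [ref[min(max(i + shift, 0), n - 1)] for i in range(n)]

def motion_estimate(cur, ref, search=2):
    """Pick the shift with the lowest sum of absolute differences."""
    def sad(shift):
        pred = motion_compensate(ref, shift)
        return sum(abs(c - p) for c, p in zip(cur, pred))
    return min(range(-search, search + 1), key=sad)

def encode_sequence(frames, step=4):
    ref = [0] * len(frames[0])                    # reference picture stores 470
    coded = []
    for cur in frames:
        shift = motion_estimate(cur, ref)         # motion estimator 480
        pred = motion_compensate(ref, shift)      # motion compensator 490
        residual = [c - p for c, p in zip(cur, pred)]   # summing junction 410
        levels = quantize(residual, step)         # transformer/quantizer 420
        coded.append((shift, levels))             # would go to entropy coder 440
        recon = dequantize(levels, step)          # inverse transformer/quantizer 450
        ref = [p + r for p, r in zip(pred, recon)]      # summing junction 488
    return coded

def decode_sequence(coded, length, step=4):
    ref = [0] * length                            # reference picture stores 550
    out = []
    for shift, levels in coded:
        pred = motion_compensate(ref, shift)      # motion compensator 560
        ref = [p + r for p, r in zip(pred, dequantize(levels, step))]  # junction 540
        out.append(ref)
    return out

frames = [[0, 0, 10, 20, 10, 0, 0, 0],
          [0, 0, 0, 10, 20, 10, 0, 0]]            # second frame: content moved right
coded = encode_sequence(frames)
decoded = decode_sequence(coded, len(frames[0]))
print(coded[1])   # shift and quantized residual for the second frame
print(decoded[1])
```

The encoder rebuilds each picture from the quantized residual before storing it, so its reference pictures match those the decoder will hold; this is why the encoder of FIG. 4 contains an inverse transformer/quantizer and a second summing junction.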
It has been proposed that H.264/MPEG-4 AVC be extended to use a Reduced Resolution Update (RRU) mode. The RRU mode improves coding efficiency at low bitrates by reducing the number of residual macroblocks (MBs) to be coded, while performing motion estimation and compensation of full resolution pictures. Turning to FIG. 6, a Reduced Resolution Update (RRU) video encoder is indicated generally by the reference numeral 600. An input to the video encoder 600 is connected in signal communication with a non-inverting input of a summing junction (adder or other means for signal combination/comparison) 610. The output of the summing junction 610 is connected in signal communication with an input of a downsampler 612. An input of a transformer/quantizer 620 is connected in signal communication with an output of the downsampler 612 or with the output of the summing junction 610. An output of the transformer/quantizer 620 is connected in signal communication with an entropy coder 640, where the output of the entropy coder 640 is an externally available output of the encoder 600.
The output of the transformer/quantizer 620 is further connected in signal communication with an input of an inverse transformer/quantizer 650. An output of the inverse transformer/quantizer 650 is connected in signal communication with an input of an upsampler 655. A first non-inverting input of an adder (summing junction or other signal combining means) 688 is connected in signal communication with an output of the inverse transformer/quantizer 650 or with an output of the upsampler 655. An output of the adder 688 is connected in signal communication with an input of a deblocking filter 660. An output of the deblocking filter 660 is connected in signal communication with an input of reference picture stores 670. A first output of the reference picture stores 670 is connected in signal communication with a first input of a motion estimator 680. The input to the encoder 600 is further connected in signal communication with a second input of the motion estimator 680. The output of the motion estimator 680 is connected in signal communication with a first input of a motion compensator 690. A second output of the reference picture stores 670 is connected in signal communication with a second input of the motion compensator 690. The output of the motion compensator 690 is connected in signal communication with an inverting input of the summing junction 610 and with a second non-inverting input of the adder 688.
Turning to FIG. 7, a Reduced Resolution Update (RRU) video decoder is indicated generally by the reference numeral 700. The video decoder 700 includes an entropy decoder 710 for receiving a video sequence. An output of the entropy decoder 710 is connected in signal communication with an input of an inverse quantizer/transformer 720. An output of the inverse quantizer/transformer 720 is connected in signal communication with an input of an upsampler 722. An output of the upsampler 722 is connected in signal communication with a first input of a summing junction (adder or other means for signal combination/comparison) 740.
An output of the summing junction 740 is connected in signal communication with full resolution reference picture stores 750 and with a deblocking filter 790. The full resolution reference picture stores 750 is connected in signal communication with a motion compensator 760, which is connected in signal communication with a second input of the summing junction 740. An output of the deblocking filter 790 provides the output of the video decoder 700.
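The residual path of the RRU encoder and decoder of FIGS. 6 and 7 can be sketched as follows. This is a simplified illustration under stated assumptions, not the proposed H.264 extension: signals are one-dimensional integer lists, the transform is omitted, pair averaging and sample repetition stand in for the down-sampling and up-sampling filters, and the full-resolution prediction is supplied directly instead of being produced by motion compensation. The `reduced` flag and all function names are hypothetical; the flag models the choice between the down-sampled and the full-resolution residual path.

```python
# Toy sketch of Reduced Resolution Update (assumptions: 1-D signals, no
# transform, prediction given as input, simple resampling filters).

def downsample(x):
    """Halve the residual resolution by averaging adjacent pairs."""
    return [(x[i] + x[i + 1]) // 2 for i in range(0, len(x) - 1, 2)]

def upsample(x):
    """Restore full resolution by repeating each residual sample."""
    return [v for v in x for _ in (0, 1)]

def quantize(x, step):
    return [(v + step // 2) // step for v in x]

def dequantize(levels, step):
    return [q * step for q in levels]

def rru_encode_residual(cur, pred, step=4, reduced=True):
    residual = [c - p for c, p in zip(cur, pred)]   # summing junction 610
    if reduced:
        residual = downsample(residual)             # downsampler 612
    return quantize(residual, step)                 # transformer/quantizer 620

def rru_decode(levels, pred, step=4, reduced=True):
    residual = dequantize(levels, step)             # inverse quantizer/transformer 720
    if reduced:
        residual = upsample(residual)               # upsampler 722
    return [p + r for p, r in zip(pred, residual)]  # summing junction 740

cur = [20, 22, 40, 44, 10, 10, 0, 2]
pred = [16, 18, 32, 36, 10, 10, 0, 2]               # full-resolution prediction
levels = rru_encode_residual(cur, pred)
recon = rru_decode(levels, pred)
print(len(levels), len(cur))  # half as many residual values are coded
print(recon)
```

Only the residual is reduced in resolution here; the prediction and the reconstructed picture remain at full resolution, consistent with the full resolution reference picture stores 750 of FIG. 7.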