1. Field of the Invention
The invention relates to data decoding, and more particularly, to decoding progressively encoded compressed multimedia data.
2. Description of the Related Art
To decode compressed multimedia data, such as a still image or a video image, for display/playback on an electronic apparatus, such as a digital camera or a DV camcorder, a decoding/rendering flow may comprise procedures of reading and decompressing the compressed multimedia data, performing the decoding procedure and image processing steps, and displaying the final image. In general, Joint Photographic Experts Group (JPEG) compression and bit-plane compression are two popular coding methods, applied in many multimedia applications to still images and video images respectively.
JPEG defines how an image is compressed into a stream of data and decompressed back into an image. A JPEG progressive mode, available as part of the JPEG standard, compresses data in multiple passes of progressively higher detail. This quickly provides a rough approximation of the final image and refines the image in later passes, rather than slowly building an accurate image in a single pass. Standard JPEG image data is arranged with DC components and 8×8 discrete cosine transform (DCT) coefficient blocks running left to right and top to bottom through the image. The progressive mode allows the DC components to be sent first, followed by the DCT coefficients in a low-frequency to high-frequency order. This enables a decoder to reproduce a low-quality version of the image quickly, before successive (higher-frequency) coefficients are received and decoded.
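The coefficient ordering described above can be illustrated with a short sketch. The band boundaries below are illustrative assumptions, not values mandated by the JPEG standard, and `split_into_scans` is a hypothetical helper rather than part of any real codec API.

```python
# Illustrative sketch (not the JPEG standard's exact procedure): progressive
# spectral selection splits the 64 zigzag-ordered coefficients of one 8x8
# block into successive scans, DC first, then low-to-high AC frequency bands.
# The band boundaries below are assumed for illustration only.

def split_into_scans(zigzag_coeffs, bands=((0, 0), (1, 5), (6, 20), (21, 63))):
    """Return one coefficient list per scan: DC scan first, then AC bands."""
    scans = []
    for start, end in bands:
        scans.append(zigzag_coeffs[start:end + 1])
    return scans

block = list(range(64))            # stand-in for zigzag-ordered DCT coefficients
scans = split_into_scans(block)
# scans[0] holds only the DC coefficient; later scans hold ever-higher
# frequency bands, so a decoder can display a rough image after scan 0.
```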
FIG. 1 shows an embodiment of a conventional JPEG decoding apparatus 100.
The conventional progressive JPEG decoding apparatus 100 comprises a variable length decoding (VLD) unit 110, an image-sized coefficient memory buffer 120, an inverse quantization unit 130 and an inverse DCT (IDCT) unit 140. In the progressive mode, sample blocks of an image are typically encoded in multiple scans through the image. The VLD unit 110 performs a variable length decoding operation on the encoded JPEG bit stream, which comprises multiple progressively encoded scan data, and outputs variable-length-decoded coefficients to the image-sized coefficient memory buffer 120. The image-sized coefficient memory buffer 120 stores the variable-length-decoded coefficients generated by the VLD unit 110. When all the variable-length-decoded coefficients of a scan have been collected, the inverse quantization unit 130 performs an inverse quantization operation and the IDCT unit 140 then performs an inverse DCT operation upon these variable-length-decoded coefficients to generate a partially reconstructed image, whereby the partially reconstructed image can first be displayed. The partially reconstructed image can later be refined progressively when the variable-length-decoded coefficients of the other scans are also ready and processed by the IDCT unit 140.
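The flow of FIG. 1 can be sketched schematically as follows. The helper functions (`vld`, `inverse_quantize`, `idct`) are placeholders standing in for units 110, 130 and 140; they are assumptions for illustration, not a real codec API.

```python
# Schematic sketch of the conventional flow of FIG. 1: each scan's
# variable-length-decoded coefficients accumulate in an image-sized buffer,
# which is then inverse-quantized and inverse-transformed so the displayed
# image is refined scan by scan. Helper names are illustrative placeholders.

def decode_progressive(scans, vld, inverse_quantize, idct):
    coeff_buffer = {}                     # stands in for memory buffer 120
    for scan in scans:
        for pos, coeff in vld(scan):      # VLD unit 110
            coeff_buffer[pos] = coeff
        dequantized = inverse_quantize(coeff_buffer)   # unit 130
        partial_image = idct(dequantized)              # unit 140
        yield partial_image               # display now, refine on next scan
```

Note that the whole coefficient buffer is retained across scans; this is precisely the image-sized memory requirement criticized below.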
For the conventional progressive JPEG decoding apparatus, however, an image-sized coefficient memory buffer is needed. Once the image to be reconstructed becomes large (e.g. 65,535 by 65,535 pixels), decoding of the image fails in a decoding apparatus having a memory buffer smaller than the size of the image to be reconstructed.
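A back-of-the-envelope calculation illustrates the scale of the problem. The figure of one 16-bit coefficient per pixel per component is an assumption for illustration only; actual coefficient widths vary by implementation.

```python
# Rough sizing sketch for the image-sized coefficient buffer of FIG. 1,
# assuming (for illustration) 3 color components and 2 bytes per coefficient.

def coeff_buffer_bytes(width, height, components=3, bytes_per_coeff=2):
    """Bytes needed to hold one coefficient per pixel per component."""
    return width * height * components * bytes_per_coeff

size = coeff_buffer_bytes(65535, 65535)
# 65,535 x 65,535 pixels -> 25,769,017,350 bytes, i.e. tens of gigabytes,
# far beyond the memory budget of a typical embedded decoding apparatus.
```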
In addition to the JPEG progressive mode that divides the bitstream into multiple scans, video data can also be divided into multiple layers (hereinafter referred to as "layered video data"), such as one "base layer" and one or more "enhancement layers". The base layer includes a rough version of the video sequence and may be transmitted using relatively little bandwidth. Typically, the enhancement layers are transmitted at the same time as the base layer, and recombined with the base layer at the receiving end during the decoding process. The enhancement layers provide corrections to the base layer, permitting video quality improvement. In general, each enhancement layer is one bit-plane of the difference data. In such an arrangement, each enhancement layer for each picture consists of a series of bits. The enhancement layers are ordered in such a way that the first enhancement layer contains the most significant bits, the second enhancement layer contains the next most significant bits, and so on. Thus, the most significant correction is made by the first enhancement layer. Combining more enhancement layers continues to improve the output quality. Therefore, if each of the transform coefficients is represented by n bits, there are n corresponding bit-planes to be coded and transmitted. In this way, the quality of the output video can be "scaled" by combining different numbers of enhancement layers with the base layer. The process of using fewer or more enhancement layers to scale the quality of the output video is referred to as "Fine Granularity Scalability" or FGS. FGS may be employed to produce a range of output video quality.
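The bit-plane refinement described above can be sketched as follows. The function name, the 3-bit-plane depth, and the sample values are illustrative assumptions, not part of any FGS specification.

```python
# Minimal sketch of FGS bit-plane refinement: each enhancement layer
# contributes one bit-plane of the difference data, most significant plane
# first, so decoding more layers yields a finer correction. Names and the
# 3-plane depth are assumed for illustration.

def refine(base_value, bitplanes, depth):
    """Combine a base-layer value with successive enhancement bit-planes."""
    correction = 0
    for i, bit in enumerate(bitplanes):
        correction |= bit << (depth - 1 - i)   # first plane = most significant
    return base_value + correction

full = refine(40, [1, 0, 1], depth=3)    # all 3 planes: correction 0b101 = 5
coarse = refine(40, [1, 0], depth=3)     # LSB plane dropped: correction 0b100 = 4
# Dropping the least significant plane only slightly degrades the result,
# which is the "fine granularity" in Fine Granularity Scalability.
```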
FIG. 2 is a block diagram of a conventional FGS decoding apparatus.
The decoding apparatus 200 comprises a base layer (BL) decoder 210 and an enhancement layer (EL) decoder 230. The BL decoder 210 comprises a variable length decoding (VLD) unit 212, an inverse quantization (Q−1) unit 214, an inverse discrete cosine transform (IDCT) unit 216, a motion compensation unit 218, a frame memory 220 and an adder 222. The EL decoder 230 comprises a bit-planes VLD unit 232, a bit-planes shift unit 234, an IDCT unit 236 and an adder 238.
The VLD unit 212 receives a BL bitstream and performs a VLD operation thereon to provide decoded data and motion vectors. The decoded data and the motion vectors are sent to the inverse quantization (Q−1) unit 214 and the motion compensation unit 218 respectively. Then, the inverse quantization (Q−1) unit 214 outputs the DCT coefficient data to the IDCT unit 216. An IDCT operation is then performed by the IDCT unit 216 to generate video frames to the adder 222. The frame memory 220 receives the video frames from the adder 222 or a clipping unit 224 and stores each frame as a reference output. The reference output is then fed back into the motion compensation unit 218 for use in generating subsequent base layer video frames. The motion compensation unit 218 receives the motion vectors and BL frame data from the BL frame memory 220, and performs motion compensation on the BL frames in the memory 220 to provide additional frames to the adder 222. The decoded BL video frame is output from the adder 222 to the BL frame memory 220 and the EL decoder 230.
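One base-layer decoding step of FIG. 2 can be sketched roughly as follows. All helper functions are illustrative placeholders standing in for units 212 through 222, not a real decoder API.

```python
# Rough sketch of one base-layer decoding step in FIG. 2: the residual from
# the VLD -> inverse quantization -> IDCT path is added to a motion-compensated
# reference frame, and the result becomes the reference for the next frame.
# All helpers below are illustrative placeholders.

def decode_bl_frame(bitstream, ref_frame, vld, inverse_quantize, idct,
                    motion_compensate):
    data, motion_vectors = vld(bitstream)                     # VLD unit 212
    residual = idct(inverse_quantize(data))                   # units 214, 216
    predicted = motion_compensate(ref_frame, motion_vectors)  # unit 218
    frame = [r + p for r, p in zip(residual, predicted)]      # adder 222
    return frame          # would be stored in frame memory 220 as reference
```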
The bit-planes VLD unit 232 of the EL decoder 230 receives the enhancement layer bit stream and provides DCT coefficient data. The IDCT unit 236 performs the IDCT operation and outputs the EL frame data, which may subsequently be combined with the base layer video frame by the adder 238 to generate enhanced video, which may be stored in a reconstructed frame buffer or sent to a displaying unit. In the decoding apparatus 200, all received bit-planes are decoded. For example, if 7 bit-planes are received, 7 bit-planes are decoded. The decoding performed by the decoding apparatus 200, however, may be stopped after a specific number of bit-planes have been received and decoded, in order to reduce complexity. For example, if 7 bit-planes are received, the decoding can be stopped after 5 bit-planes have been decoded. However, discarding bit-planes may affect visual quality.
As shown in FIGS. 1 and 2, decoding progressively encoded multimedia data requires a decoding/rendering flow that comprises a variety of procedures performed in sequence, such as VLD, IDCT and scaling (i.e. scaling the decoded data to fit the display) procedures. Conventionally, the procedures of the decoding/rendering flow for decoding the multimedia data are arranged in a fixed order to save costs. Under different system conditions, however, the performance of decoding and displaying progressively encoded multimedia data may become poor, degrading overall system performance.
It is therefore desirable to provide methods and apparatus for rendering a progressively encoded image quickly and effectively under limited system resources, and to provide a way to dynamically change the rendering method according to the system environment, such as image size, display size, and storage requirements.