The invention relates to an image processing apparatus and method, and more particularly, to an image processing apparatus with hardware sharing for MPEG and JPEG encoding.
The Moving Picture Experts Group (MPEG) technology utilizes encoded Joint Photographic Experts Group (JPEG) images displayed in sequence in an arranged order to generate a motion effect. The MPEG-1/2/4 standards relate to video and audio encoding and decoding technologies, which are applied to digital camcorders, recorders, players, and similar devices. The JPEG standards encompass still image encoding and decoding technologies, and JPEG is a common file format for digital cameras. Encoding technologies of the MPEG and JPEG standards are further described in the following.
JPEG includes two classes of encoding and decoding processes, comprising a lossy process, which is DCT-based and is sufficient for many applications, and a lossless process, which is prediction-based. Further, JPEG includes four modes of operation, comprising a sequential DCT-based mode, a progressive DCT-based mode, a lossless mode, and a hierarchical mode.
With respect to sequential DCT-based mode, an image is first partitioned into blocks of 8×8 pixels, and the blocks are processed from left to right and top to bottom. An 8×8 2-D forward Discrete Cosine Transform (DCT) is applied to each block. The resulting 8×8 DCT coefficients are quantized, and the quantized DCT coefficients are encoded and output.
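The sequential steps described above (partitioning, forward DCT, quantization) can be sketched as follows. This is a minimal illustration, not an encoder: the function names are hypothetical, the image dimensions are assumed to be multiples of 8, the quantization table is supplied by the caller, and the level shift and entropy coding of a real JPEG encoder are omitted.

```python
import math

N = 8  # block size used by sequential DCT-based mode


def blocks_in_raster_order(image):
    """Yield 8x8 blocks left to right, top to bottom.

    Assumes the image is a list of rows whose width and height are
    multiples of 8 (an assumption made for this sketch).
    """
    for top in range(0, len(image), N):
        for left in range(0, len(image[0]), N):
            yield [row[left:left + N] for row in image[top:top + N]]


def forward_dct_8x8(block):
    """Return the 2-D forward DCT (DCT-II) of one 8x8 block."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * scale(u) * scale(v) * total
    return out


def quantize(coeffs, qtable):
    """Divide each coefficient by its quantizer step and round to an integer."""
    return [[round(coeffs[u][v] / qtable[u][v]) for v in range(N)]
            for u in range(N)]
```

For a flat block, all of the energy lands in the DC coefficient, which is why quantization followed by entropy coding compresses smooth image regions well.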
Progressive DCT-based mode is similar to sequential DCT-based mode, except that the quantized DCT coefficients are first stored in a buffer. The DCT coefficients in the buffer are then encoded by a multiple scanning process. In each scan, the quantized DCT coefficients are partially encoded by either spectral selection or successive approximation. In spectral selection, the quantized DCT coefficients are divided into multiple spectral bands according to a zigzag order, and a specified band is encoded in each scan. In successive approximation, a specified number of the most significant bits (MSB) of the quantized coefficients are encoded first, and the less significant bits (LSB) are encoded in subsequent scans.
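The two progressive techniques can be illustrated with a short sketch. The function names and the band boundaries passed by the caller are hypothetical, and the successive-approximation split shown here works on sign and magnitude for simplicity; the actual entropy coding of each scan is not modeled.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal; odd diagonals run top to bottom,
        # even diagonals run bottom to top.
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def spectral_bands(block, bands):
    """Spectral selection: split a quantized 8x8 block into zigzag bands.

    `bands` is a list of inclusive (start, end) zigzag index pairs; each
    band would be sent in its own scan.
    """
    zz = [block[r][c] for r, c in zigzag_order()]
    return [zz[start:end + 1] for start, end in bands]


def successive_approximation(coeff, low_bits):
    """Split one coefficient into a first-scan value and refinement bits.

    The first scan carries the most significant bits; each later scan
    carries one further bit, from more significant to less significant.
    """
    sign = -1 if coeff < 0 else 1
    magnitude = abs(coeff)
    first_scan = sign * (magnitude >> low_bits)
    refinements = [(magnitude >> b) & 1 for b in range(low_bits - 1, -1, -1)]
    return first_scan, refinements
```

As a usage example, `spectral_bands(block, [(0, 0), (1, 5), (6, 63)])` would separate the DC coefficient, five low-frequency AC coefficients, and the remaining AC coefficients into three scans.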
With respect to lossless coding mode, Differential Pulse Code Modulation (DPCM) coding is implemented in the spatial domain. With respect to hierarchical mode, an image is first spatially down-sampled into a multi-layer pyramid. The resulting sequence of hierarchical frames is encoded by predictive coding. Except for the first frame, the encoding process is applied to the differential frames. Hierarchical coding mode provides a progressive presentation similar to progressive DCT-based mode, but is useful in applications that have multiple resolution requirements. Hierarchical mode also enables progressive coding to a final lossless stage.
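The DPCM coding used in lossless mode can be sketched as below. This is a simplified illustration that assumes a single predictor (the previous sample in the same row) and an initial prediction of 128 for 8-bit samples; both choices and the function names are assumptions for this example, and the entropy coding of the residuals is omitted.

```python
def dpcm_encode(row, initial_prediction=128):
    """Encode a row of samples as differences from the previous sample."""
    residuals = []
    prediction = initial_prediction
    for sample in row:
        residuals.append(sample - prediction)
        prediction = sample  # the next sample is predicted from this one
    return residuals


def dpcm_decode(residuals, initial_prediction=128):
    """Rebuild the original samples exactly from the residuals."""
    row = []
    prediction = initial_prediction
    for residual in residuals:
        sample = prediction + residual
        row.append(sample)
        prediction = sample
    return row
```

Because neighboring pixels tend to be similar, the residuals cluster around zero and can be coded compactly, while decoding reproduces every sample exactly.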
A video stream is a sequence of video frames. Each frame is a still image. A video player displays one frame after another, usually at a rate close to 30 frames per second. Frames are divided into 16×16 pixel Macro Blocks (MB). Each MB consists of four 8×8 luminance blocks and two 8×8 chrominance blocks (1 U and 1 V). MBs are the units for motion-compensated compression. Blocks are used for DCT compression.
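The macroblock layout described above can be made concrete with a small sketch. The function names are hypothetical, and frame dimensions that are not multiples of 16 are assumed to be padded up to the next macroblock boundary.

```python
def split_luma_macroblock(macroblock):
    """Split a 16x16 luminance macroblock into four 8x8 blocks.

    Blocks are returned in the order top-left, top-right,
    bottom-left, bottom-right.
    """
    return [[row[x:x + 8] for row in macroblock[y:y + 8]]
            for y in (0, 8) for x in (0, 8)]


def macroblock_count(width, height):
    """Number of 16x16 macroblocks needed to cover a frame."""
    return ((width + 15) // 16) * ((height + 15) // 16)
```

For example, a 352×288 frame is covered by 22 × 18 = 396 macroblocks, each contributing four luminance blocks and two chrominance blocks to the DCT stage.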
Video data complying with the MPEG format is composed of three different types of frames, comprising intra-frames (I-frames), forward predicted frames (P-frames), and bidirectional predicted frames (B-frames). An I-frame is encoded as a single image, with no reference to any past or future frame. That is, the various lossless and lossy compression techniques are performed relative to information contained only within the current frame, and not relative to any other frame in the video sequence. In other words, no temporal processing is performed outside of the current frame. A P-frame is encoded relative to the closest preceding reference frame. A reference frame is a P- or I-frame. Each MB in a P-frame can be encoded as either an Intra or an Inter MB. An Intra MB is encoded just like an MB in an I-frame, that is, with no reference frame. A B-frame is encoded relative to the past reference frame, the future reference frame, or both frames. The future reference frame is the closest following reference frame (I or P). The encoding for B-frames is similar to that for P-frames, except that motion vectors may refer to areas in the future reference frame. For MBs that use both past and future reference frames, the two 16×16 areas are averaged.
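The prediction of a B-frame macroblock from one or both reference frames can be sketched as follows. The function names are hypothetical, motion vectors are assumed to be whole-pixel (dy, dx) offsets that stay inside the frame, and the averaged value is assumed to round half up; none of these details are stated above.

```python
def fetch_area(frame, top, left, size=16):
    """Copy a size x size area from a reference frame."""
    return [row[left:left + size] for row in frame[top:top + size]]


def predict_b_macroblock(past, future, mb_top, mb_left,
                         mv_past=None, mv_future=None):
    """Predict one 16x16 MB from the past frame, the future frame, or both.

    Each motion vector is a (dy, dx) offset from the MB position; pass
    None to leave that reference frame unused.
    """
    areas = []
    if mv_past is not None:
        areas.append(fetch_area(past, mb_top + mv_past[0], mb_left + mv_past[1]))
    if mv_future is not None:
        areas.append(fetch_area(future, mb_top + mv_future[0], mb_left + mv_future[1]))
    if not areas:
        raise ValueError("an Inter MB needs at least one motion vector")
    if len(areas) == 1:
        return areas[0]
    first, second = areas
    # Both references used: average the two 16x16 areas sample by sample.
    return [[(p + q + 1) // 2 for p, q in zip(row_p, row_q)]
            for row_p, row_q in zip(first, second)]


def residual(current_mb, prediction):
    """Difference between the current MB and its prediction."""
    return [[c - p for c, p in zip(row_c, row_p)]
            for row_c, row_p in zip(current_mb, prediction)]
```

The residual, rather than the MB itself, is what would then be passed to the DCT and quantization stages for an Inter MB.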
As described, MPEG and JPEG pictures have different resolutions and file formats, are encoded and decoded with different processing methods, and traditionally use separate memory buffers for encoding and decoding. The JPEG encoder normally encodes pictures with higher resolution than those encoded by the MPEG encoder, and performs some picture processing beyond that of the MPEG encoder, such as resolution change or special color mode.