The invention relates to digital video (DV), and more specifically, to methods and systems for DV encoding and decoding.
Digital video (DV) has become a popular technology and, as with most developing technologies, its products are now affordable to average consumers. DV markets have expanded rapidly, and digital cameras are among the most popular products. One inherent difficulty with the digital camera is the conversion and reconversion of the large amount of video data that represents the recorded digital images for transfer to a computer system, such that the user of the computer system can manipulate, transfer, or store the digital images. The process of converting video data into a digital format for online distribution or recording to disc is called encoding, and the recovery operation is called decoding. Sophisticated encoding techniques have been developed to encode and compress digital information into ever smaller space. Common digital image encoding techniques include JPEG, MPEG, and DV encoding. DV encoding is a more efficient method since it uses variable length coding (VLC) to pack as much data into as little space as possible while preserving detailed information.
Variable length coding (VLC) distributes coded data throughout a fixed encoded data structure. In the context of the DV specification, a PAL system employs a video frame containing 1620 macro blocks, whereas an NTSC system employs a video frame containing 1350 macro blocks. Each macro block comprises four luminance (Y) discrete cosine transform (DCT) blocks and two chrominance (Cr and Cb) DCT blocks. The picture elements in a video frame are divided into 60 super blocks for PAL and 50 super blocks for NTSC, and each super block consists of 27 macro blocks. Furthermore, a video segment comprises 5 macro blocks taken from various areas within the video frame, so that forming video segments shuffles the macro blocks of the video frame. This shuffling process averages out the frequency characteristics of the data and hence reduces the difficulty of compression. On average, each macro block is compressed from 384 bytes to 77 bytes.
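The frame-layout figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function and constant names are hypothetical, and the figure of 64 one-byte samples per DCT block is an assumption chosen because it reproduces the 384 bytes per uncompressed macro block quoted above.

```python
# Figures quoted in the text above.
SUPER_BLOCKS_PER_FRAME = {"PAL": 60, "NTSC": 50}
MACRO_BLOCKS_PER_SUPER_BLOCK = 27
MACRO_BLOCKS_PER_SEGMENT = 5
DCT_BLOCKS_PER_MACRO_BLOCK = 4 + 2   # 4 luminance + 2 chrominance
COMPRESSED_MACRO_BLOCK_BYTES = 77

# Assumption for illustration: 8 x 8 samples per DCT block, one byte each.
SAMPLES_PER_DCT_BLOCK = 8 * 8


def frame_layout(system):
    """Return macro block and video segment counts for 'PAL' or 'NTSC'."""
    macro_blocks = SUPER_BLOCKS_PER_FRAME[system] * MACRO_BLOCKS_PER_SUPER_BLOCK
    return {
        "macro_blocks": macro_blocks,
        "video_segments": macro_blocks // MACRO_BLOCKS_PER_SEGMENT,
    }


def macro_block_compression_ratio():
    """Ratio of uncompressed to compressed macro block size."""
    raw_bytes = DCT_BLOCKS_PER_MACRO_BLOCK * SAMPLES_PER_DCT_BLOCK
    return raw_bytes / COMPRESSED_MACRO_BLOCK_BYTES


for system in ("PAL", "NTSC"):
    print(system, frame_layout(system))
print("ratio: %.2f" % macro_block_compression_ratio())
```

Under these assumptions the super block counts reproduce the 1620 and 1350 macro blocks stated above, and the per-macro-block ratio comes out at roughly 5:1.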
Three well-known DV formats, MiniDV, DVCAM, and DVCPRO, all utilize DV encoding (or DV compression). The compression ratio is 5:1 and the data rate is fixed at 25 Mbps, such that DV compression is consistent and the amount of data per frame does not vary as a file is recorded or played back. DV compression uses intraframe discrete cosine transform (DCT) compression to reduce the size of the file being recorded. Each individual frame is compressed on its own, with no reliance on adjacent frames for color or other data. FIG. 1 depicts a simplified system block diagram for DV encoding. DV data retrieved from a storage module/device 12 is first analyzed by the DCT algorithm 13 and converted to frequency domain coefficients. The converted data is then passed to weighting (W) 14, wherein the direct current (DC) component is slightly shifted and the high frequency components are scaled to be less significant, since the human eye is less sensitive to high frequency components. After weighting, the data undergoes quantization (Q) 15 and scanning 16. The high frequency zones of the quantized data contain long runs of zeros, and the quantized data is read in a specific order in the scan module 16. Next, the data passes through a run-length coding (RLC) module 17 and a variable length coding (VLC) module 18. The VLC module 18 implements a multi-stage encoding scheme.
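The scanning and run-length coding steps can be illustrated with a short Python sketch. It assumes a conventional zigzag scan over an 8 x 8 block, which is only one possible "specific order", and it treats the DC coefficient like any other value for simplicity. The function names are hypothetical and are not taken from the system of FIG. 1.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    def key(position):
        row, col = position
        diagonal = row + col
        # Odd diagonals are walked top to bottom, even ones bottom to top.
        return (diagonal, row if diagonal % 2 else col)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def scan(block):
    """Read a square block of quantized coefficients in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def run_length_encode(coeffs):
    """Encode a coefficient list as (zero_run, value) pairs.

    Trailing zeros are dropped; the end of the list implies end of block.
    """
    pairs = []
    run = 0
    for value in coeffs:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


# A quantized block whose high frequency zone is entirely zero.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 12, 5, -3, 2

print(run_length_encode(scan(block)))
```

Because the scan visits low frequency positions first, the long run of zeros in the high frequency zone ends up at the tail of the list and costs nothing in the run-length output.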
FIG. 2 is a flowchart illustrating 3-stage (or 3-pass) VLC encoding, during which the variable length coded data is filled into a video segment with a constant size and a predefined data format. As previously described, each video segment comprises 5 macro blocks, and each macro block comprises 6 DCT blocks (4 luminance blocks and 2 chrominance blocks). As shown in FIG. 3a, the data is entered in each corresponding DCT block 311~316 of a macro block 31 at the pass 1 encoding stage, where the shaded area illustrates the amount of data filled in each DCT block 311~316. For example, there is excess data for the DCT block 311, while the DCT blocks 314, 315, and 316 have insufficient data to fill them. The excess data is temporarily stored in a buffer (identified by reference numeral 11 in FIG. 1), such as static random access memory (SRAM). At the pass 2 encoding stage, the excess data of a DCT block can be entered in any unfilled DCT block within the same macro block. As shown in FIG. 3b, the excess data of the DCT block 311 is entered in the DCT block 314. At the pass 3 encoding stage, any excess data that cannot find space within the same macro block during the pass 2 encoding stage can be placed in free space in other macro blocks within the same video segment. As shown in FIG. 3c, a video segment 3 contains 5 macro blocks 31~35, allowing excess data of the macro block 33 to be entered in the macro block 31 at the pass 3 encoding stage.
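The three passes can be modeled as a simple space-allocation problem. The Python sketch below tracks only how many bytes land in each DCT block area, not the actual bit patterns. The function name and the slot sizes (14 bytes per luminance area and 10 bytes per chrominance area, which with one header byte add up to the 77 bytes per macro block mentioned earlier) are assumptions made for illustration.

```python
# Assumed slot sizes in bytes: 4 luminance areas, then 2 chrominance areas.
SLOT_CAPACITY = [14, 14, 14, 14, 10, 10]


def three_pass_fill(demand, capacity=SLOT_CAPACITY):
    """Place coded data into fixed-size DCT block areas of one video segment.

    demand[m][b] is the number of coded bytes produced for DCT block b of
    macro block m. Returns (used, unplaced): the bytes stored in each area
    and the number of bytes that found no space in the segment.
    """
    # Pass 1: each DCT block fills its own area; the rest is excess.
    used = [[min(d, c) for d, c in zip(mb, capacity)] for mb in demand]
    excess = [sum(mb) - sum(row) for mb, row in zip(demand, used)]

    def pour(amount, row):
        """Put up to `amount` bytes into the free space of one macro block."""
        for i, cap in enumerate(capacity):
            take = min(amount, cap - row[i])
            row[i] += take
            amount -= take
        return amount

    # Pass 2: excess goes to unfilled areas within the same macro block.
    excess = [pour(e, row) for e, row in zip(excess, used)]

    # Pass 3: remaining excess goes to other macro blocks in the segment.
    for m in range(len(excess)):
        for row in used:
            excess[m] = pour(excess[m], row)

    return used, sum(excess)


segment = [
    [20, 14, 14, 6, 4, 4],      # overflows one area, has room in others
    [14, 14, 14, 14, 10, 10],   # fits exactly
    [30, 20, 14, 14, 10, 10],   # overflows the whole macro block
    [10, 10, 10, 10, 5, 5],     # leaves free space
    [14, 14, 14, 14, 10, 10],   # fits exactly
]
used, unplaced = three_pass_fill(segment)
for row in used:
    print(row)
print("unplaced bytes:", unplaced)
```

In this example the first macro block resolves its own overflow in pass 2, while the third macro block must borrow space from the first and fourth macro blocks in pass 3, mirroring the situation of FIG. 3c. How a real encoder handles data that still does not fit is outside the scope of this sketch.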
FIG. 4 is a simplified block diagram of DV decoding. The DV decoding system executes the inverse of the DV encoding process. Encoded DV data retrieved from a storage means 41 is provided to a variable length decoding (VLD) module 42 and a run-length decoding (RLD) module 43. The decoded data is then passed to an inverse scan (ISCAN) module 44, an inverse quantization (IQ) module 45, an inverse weighting (IW) module 46, and an inverse DCT (IDCT) module 47.
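As an illustration of the run-length decoding and inverse scan steps, the Python sketch below expands (zero_run, value) pairs back into a coefficient list and places the coefficients into an 8 x 8 block. It assumes a conventional zigzag scan order and a pair-based run-length representation; both are assumptions made for illustration, and the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    def key(position):
        row, col = position
        diagonal = row + col
        # Odd diagonals are walked top to bottom, even ones bottom to top.
        return (diagonal, row if diagonal % 2 else col)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def run_length_decode(pairs, length=64):
    """Expand (zero_run, value) pairs into a list of `length` coefficients."""
    coeffs = []
    for run, value in pairs:
        coeffs.extend([0] * run)
        coeffs.append(value)
    # Anything after the last pair is an implied run of zeros.
    coeffs.extend([0] * (length - len(coeffs)))
    return coeffs


def inverse_scan(coeffs, n=8):
    """Place scanned coefficients back at their block positions."""
    block = [[0] * n for _ in range(n)]
    for value, (row, col) in zip(coeffs, zigzag_order(n)):
        block[row][col] = value
    return block


pairs = [(0, 12), (0, 5), (0, -3), (1, 2)]
block = inverse_scan(run_length_decode(pairs))
for row in block[:3]:
    print(row)
```

The resulting block would then go on to inverse quantization, inverse weighting, and the inverse DCT, which are not modeled here.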
The VLD procedure performed in the VLD module 42 is illustrated by the flowchart of FIG. 5. The VLD module 42 can also perform a 3-stage (or 3-pass) decoding process, similar to the VLC encoding process discussed above. The encoded DV data can be correctly decoded from each video segment by executing pass 1, pass 2, and pass 3 decoding in sequence.