1. Field of the Invention
The present invention relates to a method and apparatus for decoding a digital video signal.
2. Description of the Related Art
Digital video signal processing is an area of science and engineering that has developed rapidly over the past decade. The maturity of the Moving Picture Experts Group (MPEG) video coding standard represents a very important achievement for the video industry and provides strong support for digital transmission of video signals. With digital compression and other techniques such as digital modulation and packetization, as well as VLSI technology, the fundamentals of television have been reinvented for the digital age.
The first U.S. digital television transmission standard, developed for broadcast of high and low definition television by a Grand Alliance of companies, has been proposed to the Federal Communications Commission (FCC). High definition digital television broadcasts are typically referred to as HDTV, while low definition digital television broadcasts are generally referred to as SDTV. These terms will be used throughout this application, but are not tied to a particular format or standard. Instead, these terms are used to cover the high and low definition digital television of any coding standard (e.g., for VTRs and television).
In 1994, SDTV broadcasts became a reality when the first digital television services, broadcast via satellite, went on the air. The Digital Satellite Service (DSS) units developed by Thomson Consumer Electronics and others have been distributed to more than 1 million homes. The highly sophisticated methods of transmitting and receiving digital television not only produce higher-quality television broadcasts, but also create new services, such as movies on demand, interactive programming, and multimedia applications, as well as telephone and computer services through the television.
Soon, HDTV will become a reality and join SDTV. Accordingly, in the near future, one can expect advanced television (ATV) broadcasts that include co-existent broadcasts of HDTV and SDTV.
When performing, for example, MPEG video encoding of HDTV, image blocks of 8×8 pixels in the spatial domain are converted into 8×8 DCT (discrete cosine transform) blocks of DCT coefficients in the DCT or frequency domain. Specifically, in most coding formats such as MPEG, the HDTV signal is divided into a luminance component (Y) and two chroma components (U) and (V). Furthermore, instead of U and V chroma blocks, some standards use color difference signal chroma blocks. For the purposes of discussion only, U and V chroma blocks will be used. Most formats such as MPEG specify different encoding sequences. In each encoding sequence a sequence header identifies the encoding sequence. Furthermore, in each encoding sequence, macro blocks of 8×8 DCT blocks of DCT coefficients are formed.
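The conversion from an 8×8 pixel block to an 8×8 block of DCT coefficients can be sketched as follows. This is a minimal illustration only, assuming the orthonormal two-dimensional DCT-II; the function name dct_2d and the flat test block are hypothetical and are not taken from any particular encoder.

```python
import math

N = 8  # block size discussed in the text

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        # Normalization factor: the DC term is scaled differently from the rest.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs

# Hypothetical input: a flat block in which every pixel has the value 100.
flat_block = [[100] * N for _ in range(N)]
flat_coeffs = dct_2d(flat_block)
# All of the energy of a flat block lands in the DC coefficient flat_coeffs[0][0];
# the remaining 63 coefficients are zero up to rounding error.
```

The direct four-loop form is chosen for readability; practical encoders use fast factorizations of the same transform.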
Encoding sequences for HDTV typically include the 4:2:0 encoding sequence, the 4:2:2 encoding sequence, and the 4:4:4 encoding sequence. In the 4:2:0 encoding sequence a macro block consists of four 8×8 luminance DCT blocks, one 8×8 U chroma DCT block, and one 8×8 V chroma DCT block. In the 4:2:2 encoding sequence a macro block consists of four 8×8 luminance DCT blocks, two 8×8 U chroma DCT blocks, and two 8×8 V chroma DCT blocks. Finally, in the 4:4:4 encoding sequence a macro block consists of four 8×8 luminance DCT blocks, four 8×8 U chroma DCT blocks, and four 8×8 V chroma DCT blocks. SDTV includes similar coding sequences, but the DCT blocks are 4×4 DCT blocks.
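The macro block compositions listed above can be tabulated as in the following sketch. The table and helper names are hypothetical and serve only to make the block counts concrete.

```python
# Number of DCT blocks per macro block for each encoding sequence,
# as enumerated in the text (Y = luminance, U and V = chroma).
BLOCKS_PER_MACRO_BLOCK = {
    "4:2:0": {"Y": 4, "U": 1, "V": 1},
    "4:2:2": {"Y": 4, "U": 2, "V": 2},
    "4:4:4": {"Y": 4, "U": 4, "V": 4},
}

def blocks_in_macro_block(sequence):
    """Total number of DCT blocks in one macro block."""
    return sum(BLOCKS_PER_MACRO_BLOCK[sequence].values())

def coefficients_in_macro_block(sequence, block_size=8):
    """Total number of DCT coefficients in one macro block."""
    return blocks_in_macro_block(sequence) * block_size * block_size

for name in BLOCKS_PER_MACRO_BLOCK:
    print(name, blocks_in_macro_block(name), coefficients_in_macro_block(name))
# prints:
# 4:2:0 6 384
# 4:2:2 8 512
# 4:4:4 12 768
```

Passing block_size=4 gives the corresponding coefficient counts for the 4×4 blocks that the text attributes to SDTV.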
Besides variable length encoding, many standards such as MPEG provide for intra- and inter-coding. Intra-coding is where a field or frame of the digital video signal, referred to as a picture, is encoded based on the pixels therein. Several well known techniques exist for intra-coding. An intra-coded picture is typically referred to as an I-picture.
Inter-coding, sometimes referred to as predictive encoding, is where a picture is encoded based on a reference picture, referred to as an anchor picture. In inter-coding, each macro block (i.e., related luminance and chroma blocks) of the picture being encoded is compared with the macro blocks of the anchor picture to find the macro block of the anchor picture providing the greatest correlation therewith. The vector between the two macro blocks is then determined as the motion vector. The inter-coded digital video signal for the macro block being encoded will then include the motion vector and the differences between the macro block being encoded and the corresponding macro block of the anchor picture providing the greatest correlation.
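The search for the best matching macro block can be sketched as an exhaustive block-matching search. This is an illustrative sketch only: it uses the sum of absolute differences (SAD) as a stand-in for the correlation measure mentioned above, and the function names, the 4×4 block size, and the ±2 pixel search range are all assumptions chosen to keep the example small.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between a block of cur and a block of ref."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total

def find_motion_vector(cur, ref, bx, by, size=4, search=2):
    """Return ((dx, dy), residual) for the block of cur whose corner is (bx, by)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidate blocks that fall outside the anchor picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    _, dx, dy = best
    # The residual is the difference between the block and its best match.
    residual = [[cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i]
                 for i in range(size)] for j in range(size)]
    return (dx, dy), residual

# Hypothetical 8x8 anchor picture, and a current picture whose content is the
# anchor shifted one pixel to the right.
ref = [[16 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]
motion_vector, residual = find_motion_vector(cur, ref, bx=2, by=2)
# motion_vector is (-1, 0) and the residual is all zeros for this input.
```

Only the motion vector and the residual would need to be carried for the block, which is the saving that inter-coding provides over coding the pixels directly.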
For example, a series of pictures may have the display order I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2 .... The transmitted HDTV signal, however, will have the pictures arranged in the order of encoding as follows: I1 P1 B1 B2 P2 B3 B4 P3 B5 B6 I2 B7 B8. P-pictures are encoded using the previous I-picture or P-picture as the anchor picture. In the above example, P-pictures P1, P2, and P3 were encoded using I-picture I1, P-picture P1, and P-picture P2, respectively, as the anchor picture.
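The reordering from display order to encoding order in this example can be reproduced with a short sketch. The function name and the string labels are hypothetical; the rule applied is simply that each B-picture is held back until the anchor picture that follows it in display order has been emitted. Emitting any trailing B-pictures at the end is an assumption made so that the function handles truncated input.

```python
def display_to_encoding_order(display):
    """Reorder picture labels so each anchor precedes the B-pictures before it."""
    encoded, pending_b = [], []
    for picture in display:
        if picture.startswith("B"):
            # Hold B-pictures until the next anchor picture has been emitted.
            pending_b.append(picture)
        else:
            encoded.append(picture)
            encoded.extend(pending_b)
            pending_b = []
    # Assumption: B-pictures with no following anchor are emitted last.
    encoded.extend(pending_b)
    return encoded

display = "I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2".split()
print(" ".join(display_to_encoding_order(display)))
# prints: I1 P1 B1 B2 P2 B3 B4 P3 B5 B6 I2 B7 B8
```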
The B-pictures may be forward encoded, backward encoded, or bi-directionally encoded. For instance, if B-picture B1 was encoded using I-picture I1 as the anchor picture, then B-picture B1 is backward or back encoded. Alternatively, if B-picture B1 was encoded using P-picture P1 as the anchor picture, then B-picture B1 is forward encoded. If B-picture B1 was encoded using both I-picture I1 and P-picture P1 (typically a weighted average thereof) as anchor pictures, then B-picture B1 is bi-directionally encoded.
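The three cases can be illustrated with a small prediction helper. This is a sketch under stated assumptions: the function name is hypothetical, the blocks are assumed to be already motion compensated, and the equal weighting in the two-anchor case is only a default, since the text says no more than that a weighted average is typical.

```python
def predict_block(earlier=None, later=None, weight_earlier=0.5):
    """Form the prediction for a B-picture block from one or two anchor blocks.

    earlier: matching block from the anchor that precedes the B-picture in
             display order (I1 in the example), or None if unused.
    later:   matching block from the anchor that follows it (P1), or None.
    """
    if earlier is not None and later is not None:
        # Two anchors: weighted average of the two matching blocks.
        return [[weight_earlier * a + (1.0 - weight_earlier) * b
                 for a, b in zip(row_a, row_b)]
                for row_a, row_b in zip(earlier, later)]
    source = earlier if earlier is not None else later
    if source is None:
        raise ValueError("at least one anchor block is required")
    # One anchor: the prediction is a copy of that anchor's block.
    return [list(row) for row in source]

# Hypothetical 1x2 blocks taken from the two anchor pictures.
from_i1 = [[10, 20]]
from_p1 = [[30, 40]]
print(predict_block(earlier=from_i1, later=from_p1))
# prints: [[20.0, 30.0]]
```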
The headers in the digital video signal indicate whether pictures are I, B, or P-pictures and the direction of encoding. These headers also indicate the group of pictures (GOP) size N and the distance between anchor pictures M. The GOP size indicates the distance between I-pictures, which in the above example would be N=12. Since I-pictures and P-pictures are anchor pictures, the distance between anchor pictures in the above example would be M=3. Based on the information provided in the headers, the digital video signal can be properly decoded.
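The values N=12 and M=3 can be checked against the display order of the example with the following sketch. In practice these values are carried in the headers as described above; the function below is a hypothetical helper that derives them from picture labels only to verify the arithmetic, and it assumes the sequence contains at least two I-pictures.

```python
def gop_parameters(display):
    """Return (N, M) measured from a list of picture labels in display order."""
    i_positions = [k for k, p in enumerate(display) if p.startswith("I")]
    anchor_positions = [k for k, p in enumerate(display) if p[0] in "IP"]
    n = i_positions[1] - i_positions[0]            # distance between I-pictures
    m = anchor_positions[1] - anchor_positions[0]  # distance between anchors
    return n, m

display = "I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2".split()
print(gop_parameters(display))
# prints: (12, 3)
```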
Unfortunately, conventional decoders must store two complete anchor pictures in the spatial domain to decode a digital video signal. Consequently, the memory requirements for conventional decoders are quite large. Because the memory requirements of a digital decoder account for a large part of the overall device cost, the large memory requirements of conventional digital decoders adversely impact the cost of such devices.
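The scale of this memory requirement can be estimated with a short calculation. The figures below are illustrative assumptions only and do not come from this application: a 1920 by 1080 picture, 8 bits per sample, and the 4:2:0 encoding sequence, in which each chroma component has half the luminance resolution in each direction.

```python
def anchor_memory_bytes(width, height, bits_per_sample=8, anchors=2):
    """Bytes needed to hold the given number of 4:2:0 anchor pictures."""
    luma_samples = width * height
    # Two chroma components, each subsampled by two horizontally and vertically.
    chroma_samples = 2 * (width // 2) * (height // 2)
    return anchors * (luma_samples + chroma_samples) * bits_per_sample // 8

# Assumed picture sizes, chosen only for illustration.
print(anchor_memory_bytes(1920, 1080))
# prints: 6220800
print(anchor_memory_bytes(720, 480))
# prints: 1036800
```

Under these assumptions, the two anchor pictures alone occupy roughly 6 megabytes for the larger picture size, about six times the amount needed for the smaller one.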