The term "set top box" generally signifies a unit that serves to deliver compressed digital video and audio signals in real-time usable form to one or more television receivers. The unit may comprise an Application Specific Integrated Circuit (ASIC), which performs decoding and processing functions, and a memory for storing video signal information. The compressed signals may be received over cable from a cable TV source or from any telecommunications source including, for example, satellite broadcast. Various conventional formats have been contemplated for compressed video signals, the standards currently favored being set forth by the Moving Picture Experts Group (MPEG).
The need for effective compression techniques arises from the large amount of information inherent in video picture frames and the high rate at which such information changes in motion picture presentation. Management of such information must meet the capabilities of recording media, such as an optical disc, to perform a high rate of recording and reproduction with acceptable quality, as well as the challenge of real time transmission of video signals.
MPEG is a bi-directional predictive coding compression standard, coded in accordance with discrete cosine transformation (DCT) processing. Picture elements are converted from spatial information into frequency domain information to be processed. Various processing schemes have been developed to implement the MPEG standard. By way of example, reference is made to U.S. Pat. No. 5,198,901 to Lynch of Mar. 30, 1993; to U.S. Pat. No. 5,293,229 to Iu of Mar. 8, 1994; to U.S. Pat. No. 5,311,310 to Jozawa et al. of May 10, 1994; to U.S. Pat. No. 5,361,105 to Iu of Nov. 1, 1994; to U.S. Pat. No. 5,386,234 to Veltman et al. of Jan. 31, 1995; and to U.S. Pat. No. 5,400,076 to Iwamura of Mar. 21, 1995. Those disclosures and citations referenced therein may be consulted for an understanding of the specific details of conventional MPEG compression and decompression arrangements.
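By way of illustration only, the conversion of a block of picture elements from spatial information into frequency domain information may be sketched as follows. The 8x8 block size, the orthonormal scaling, and the names `dct_2d` and `flat_block` are assumptions made for this sketch and are not taken from the referenced patents.

```python
import math

N = 8  # assumed block size for this sketch


def dct_2d(block):
    """Return the orthonormal 2-D DCT-II of an N x N block of pixel values."""
    def c(k):
        # Normalization factor for the zero-frequency (DC) term.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for i in range(N):
                for j in range(N):
                    total += (block[i][j]
                              * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = (2 / N) * c(u) * c(v) * total
    return coeffs


# A block of uniform brightness carries no spatial detail, so all of its
# energy lands in the single DC coefficient at position [0][0].
flat_block = [[100] * N for _ in range(N)]
flat_coeffs = dct_2d(flat_block)
```

A uniform block thus reduces to one non-zero coefficient, which suggests why frequency domain information lends itself to compression.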
MPEG processes video data in groups of sequential frames. An intra-coded frame, or I frame, is encoded using only pixels within an actual original video frame, i.e., independently of other frames, and serves as a reference frame to derive compressed data for other encoded frames in advance of or following the I frame in the encoded frame sequence. The number of actual video frames to be coded into such I frames is set in the MPEG syntax, e.g., one reference frame for each fifteen frames, or every half second. Interspersed among successive I frames are frames generally of increased compression. A prediction is made of the composition of a video frame to formulate a prediction frame, termed a P frame, to be located a specific number of frames following or in advance of the next reference frame, the specific number also set in the MPEG syntax. Information from previous frames as well as later frames may be used in formulating the prediction. A P frame may be encoded from I frame information by partitioning the P frame into blocks of pixels, or motion blocks. A matching block is sought in the I frame for each motion block of the P frame. Motion vectors are used to indicate the displacement in the x and y directions between the matched blocks in the two frames. A P frame, as well as an I frame, may serve as matching block reference information for deriving another P frame. Differences between the motion blocks and the matched blocks are also encoded. P frames are thus represented by less data, and are thus more compressed, than the encoded I frames.
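The search for a matching block and the derivation of a motion vector and difference data may be sketched as follows. This is a minimal illustration, assuming an exhaustive search that minimizes the sum of absolute differences; the block size, the search range, and all function names are hypothetical, and practical encoders may use other matching criteria and search strategies.

```python
def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between a current block and a reference block."""
    return sum(abs(cur[cy + r][cx + c] - ref[ry + r][rx + c])
               for r in range(size) for c in range(size))


def find_motion_vector(cur, ref, cy, cx, size=4, search=4):
    """Search the reference frame for the best match to the block at (cy, cx).

    Returns ((dx, dy), cost, residual), where (dx, dy) is the displacement from
    the current block position to the matched block in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            # Skip candidate blocks that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            cost = sad(cur, ref, cy, cx, ry, rx, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    cost, dx, dy = best
    # Differences between the motion block and the matched block.
    residual = [[cur[cy + r][cx + c] - ref[cy + dy + r][cx + dx + c]
                 for c in range(size)] for r in range(size)]
    return (dx, dy), cost, residual


def make_frame(top, left, size=16, patch=4):
    """Build a blank frame containing one small textured object."""
    frame = [[0] * size for _ in range(size)]
    for r in range(patch):
        for c in range(patch):
            frame[top + r][left + c] = 10 + 4 * r + c
    return frame


reference_frame = make_frame(top=8, left=2)   # object at lower left
current_frame = make_frame(top=6, left=5)     # object moved up and to the right
motion_vector, match_cost, residual_block = find_motion_vector(
    current_frame, reference_frame, cy=6, cx=5)
```

In this example the object is unchanged between frames, so the difference data is entirely zero and the block is fully described by its motion vector.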
"Delta" information is developed for coding frames, called B frames, between the actual (I) and predicted (P) frames, and between (P) frames also by looking at frames in both directions. Rather than updating a whole frame, only the changed (or delta) information is provided for the delta frames. Thus the total information coded, and then transmitted, is considerably less than required to supply the actual information in the total number of frames.
As illustrated by the above identified patents, various schemes have been developed to carry out MPEG coding and decoding. Transmitted MPEG data generally includes I frame data, motion vector information for P frames and B frames, difference or residue data for predictive coding, and data indicative of a particular coding scheme used.
On decompression, the decoder in sequence uses the reference frames to form the prediction frames, which frames also may be used to construct the delta frames. Data is thus often decoded in an order different from the order in which frames are viewed. Decoding must be several frames ahead of the frame currently shown on video. For proper picture resolution and quality, conventional set top boxes store temporarily at least two frames of image information while an image is built for display on the television screen. The frame signals are received in compressed form, expanded by the decoder chip, and stored in memory. The expanded frame information is then used to derive display image information.
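The difference between decoding order and viewing order may be sketched as follows. The frame labels and the functions `coded_order` and `display_order` are hypothetical, and the sketch assumes a simple sequence in which each B frame depends on the nearest reference frame on either side.

```python
def coded_order(display):
    """Reorder frames from viewing order into the order in which they are decoded.

    A B frame needs the reference frames on both sides of it, so it is held
    back until the following I or P frame has been emitted.
    """
    out, pending_b = [], []
    for frame in display:
        if frame.startswith("B"):
            pending_b.append(frame)
        else:
            out.append(frame)
            out.extend(pending_b)
            pending_b = []
    out.extend(pending_b)
    return out


def display_order(coded):
    """Restore viewing order at the decoder, holding one reference frame back."""
    out, held_reference = [], None
    for frame in coded:
        if frame.startswith("B"):
            out.append(frame)              # B frames are shown as soon as decoded
        else:
            if held_reference is not None:
                out.append(held_reference)
            held_reference = frame         # new reference waits for its B frames
    if held_reference is not None:
        out.append(held_reference)
    return out


viewing = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
decoding = coded_order(viewing)
restored = display_order(decoding)
```

Here frame P3 is decoded before frames B1 and B2 although it is viewed after them, which is one reason decoded reference frames are held in memory while an image is built for display.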
FIG. 1 is a block diagram of an exemplary prior art MPEG decoder that may be used in a set top box. Encoded signals of blocks of a video frame are received successively at the input terminal and buffered at buffer 11. The received signals comprise picture signal data and motion vector data, the latter data being prevalent in B frame and P frame signals. I frame data and P frame data serve as reference block data for the motion vectors contained in other B frame and P frame signals.
A portion of a display frame is illustrated in FIG. 2A, wherein a display object is positioned in a block at the lower left area. FIG. 2B illustrates a portion of a later display frame wherein the object has moved to another position in the display frame displaced in the x and y directions from the location in the frame of FIG. 2A. The original object may have changed somewhat, such as in dimension, shape, color, etc., or have remained substantially unchanged. As shown, the object in the later frame occupies portions of four blocks. Video signals for the frame of FIG. 2B are coded with motion vector data indicating location displacement of blocks from the reference frame position as well as difference data that represent changes in picture content.
Blocks of video signal data from the buffer are fed successively to demultiplexer 13, which separates motion vector information from picture signal components. The resulting picture signal is fed to variable length decoder 15, which decodes each block to provide quantized transform coefficients. This block data is then fed successively to inverse quantizer 17 and inverse discrete cosine transform circuit 19 whereby block picture information is recovered.
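The inverse quantization and inverse discrete cosine transform stages may be sketched as follows. This is a simplified illustration, assuming a single uniform quantizer step and an 8x8 block; the step value and the function names are hypothetical, variable length decoding is omitted, and actual MPEG decoders apply quantization matrices and scale factors not shown here.

```python
import math

N = 8  # assumed block size for this sketch


def inverse_quantize(levels, step):
    """Scale quantized levels back to transform coefficients (uniform step assumed)."""
    return [[level * step for level in row] for row in levels]


def idct_2d(coeffs):
    """Return the inverse of the orthonormal 2-D DCT-II for an N x N block."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    pixels = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (c(u) * c(v) * coeffs[u][v]
                              * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            pixels[i][j] = (2 / N) * total
    return pixels


# A block whose only non-zero quantized level is the DC term.
quantized = [[0] * N for _ in range(N)]
quantized[0][0] = 50
recovered = idct_2d(inverse_quantize(quantized, step=16))
```

With a step of 16, the DC level of 50 becomes a coefficient of 800, and the recovered block is uniform with every picture element equal to 100.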
The motion vector data for the current block is fed from demultiplexer 13 to motion vector calculating circuit 21. The motion vector calculating circuit receives a reference block of picture data from frame memory 23 and provides compensation in accordance with motion vector data for the current block received from the demultiplexer. The resulting block picture data is combined with the picture information recovered from inverse discrete cosine transform circuit 19 at adder 25. The reconstructed picture block thus obtained is stored as a new block in frame memory 23. Frame memory 23 is RAM storage. Frame selector circuit 27 controls arrangement of delivery of the decoded frames, all stored blocks correlated therewith, in the proper order. Reference is made to the Iwamura and Veltman et al. patents, identified previously, for further description of this prior art decoding scheme.
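The compensation and addition steps described above may be sketched as follows. The sketch assumes whole-pixel motion vectors, 8-bit picture elements, and separate lists standing in for the reference and reconstructed frames held in the frame memory; the names `motion_compensate` and `reconstruct_block` are hypothetical.

```python
def motion_compensate(reference, top, left, mv, size):
    """Fetch the reference block displaced by motion vector (dx, dy)."""
    dx, dy = mv
    return [[reference[top + dy + r][left + dx + c] for c in range(size)]
            for r in range(size)]


def reconstruct_block(reference, current, top, left, mv, residual):
    """Add recovered difference data to the compensated block and store the result."""
    size = len(residual)
    predicted = motion_compensate(reference, top, left, mv, size)
    for r in range(size):
        for c in range(size):
            value = predicted[r][c] + residual[r][c]
            # Clamp to the assumed 8-bit range before storing the new block.
            current[top + r][left + c] = max(0, min(255, value))
    return current


# Reference frame with a distinct value at each position, for easy checking.
reference = [[r * 8 + c for c in range(8)] for r in range(8)]
decoded = [[0] * 8 for _ in range(8)]
residual = [[1, 1], [1, 1]]
reconstruct_block(reference, decoded, top=2, left=3, mv=(-1, 2), residual=residual)
```

The 2x2 block written at row 2, column 3 is the reference block found two rows down and one column to the left, with the difference data added to each picture element.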
As the frame memory stores decoded blocks of picture information, a large amount of RAM is needed to deliver acceptable picture resolution and quality. Such a large memory requirement makes the set top box expensive. In addition, the frame memory data storage arrangement does not take advantage of the efficiencies of the MPEG block encoding scheme of the received video signals. Such efficiencies would enable higher quality video delivery, such as in HDTV applications, and reduce the cost of the set top box unit such that it would be feasible to build its functions into the television receiver.
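The scale of the memory requirement may be estimated with simple arithmetic, as sketched below. The frame dimensions, the 4:2:0 chroma format, and the 8-bit sample depth are assumptions chosen for illustration and are not specified in the foregoing description.

```python
def frame_bytes(width, height, bits_per_sample=8):
    """Bytes needed to hold one decoded frame, assuming 4:2:0 chroma sampling."""
    luma = width * height
    # Two chroma planes, each subsampled by two horizontally and vertically.
    chroma = 2 * (width // 2) * (height // 2)
    return (luma + chroma) * bits_per_sample // 8


sd_frame = frame_bytes(720, 480)       # assumed standard-definition frame
sd_two_frames = 2 * sd_frame           # at least two frames are held while decoding
hd_frame = frame_bytes(1920, 1080)     # assumed high-definition frame
```

Under these assumptions one standard-definition frame occupies 518,400 bytes, two such frames occupy 1,036,800 bytes, and a single high-definition frame occupies 3,110,400 bytes.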