The present invention relates to digital image coding, and more particularly to a method and apparatus for encoding digital video into a compressed digital video stream, and a corresponding method and apparatus for decoding a compressed digital video stream.
DVC is a common acronym for a digital video coding standard presently in widespread use for digital handheld camcorders, digital video recorders, digital video playback devices, etc. See Recording - Helical-scan digital video cassette recording system using 6.35 mm magnetic tape for consumer use (525-60, 625-50, 1125-60 and 1250-50 systems), International Electrotechnical Commission Standard, IEC 61834 (1998). This standard describes the content, formatting, and recording method for the audio, video, and system data blocks forming the helical records on a DVC tape. It also specifies the DVC video frame format for compatibility with different television signal formats, including the 525-horizontal-line, 60 Hz frame rate broadcast format common in the United States (the 525-60 format), and the 625-horizontal-line, 50 Hz frame rate broadcast format common in many other countries (the 625-50 format).
Examining the 525-60 DVC video frame format in particular, FIG. 1 illustrates the digital sample structure for the luminance component of a 525-60 format video frame. A video frame 30 is divided into a tiling of superblocks S0,0 to S9,4. Each superblock takes one of three possible superblock shapes 32, 34, 36, depending on the superblock's position in frame 30. Also, each superblock is divided into 27 macroblocks. Most of these macroblocks are of the format shown for macroblock 38 (four blocks arranged horizontally), although for superblock shape 36, three macroblocks have the format shown for macroblock 40 (four blocks arranged 2×2).
Macroblocks 38 and 40 each contain four luminance blocks 42. Each luminance block 42 contains 64 digital samples 44, arranged in a regular 8×8 grid. Each macroblock also contains one 64-sample Cr and one 64-sample Cb block (not shown), for a total of six blocks of samples per macroblock. The total frame size is 720 digital samples (90 luminance blocks) wide by 480 digital samples (60 luminance blocks) high.
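The frame geometry described above can be checked with a few lines of arithmetic. The following Python sketch uses only figures stated in the text (720 by 480 luminance samples, 8×8 blocks, four luminance blocks per macroblock, 27 macroblocks per superblock); the constant and variable names are illustrative and do not come from the standard.

```python
# Geometry of the 525-60 DVC luminance frame, using figures stated in the text.
FRAME_WIDTH = 720                 # luminance samples per line
FRAME_HEIGHT = 480                # luminance lines per frame
BLOCK_SIZE = 8                    # each block is an 8x8 grid of samples
LUMA_BLOCKS_PER_MACROBLOCK = 4
MACROBLOCKS_PER_SUPERBLOCK = 27

blocks_wide = FRAME_WIDTH // BLOCK_SIZE
blocks_high = FRAME_HEIGHT // BLOCK_SIZE
luma_blocks = blocks_wide * blocks_high
macroblocks = luma_blocks // LUMA_BLOCKS_PER_MACROBLOCK
superblocks = macroblocks // MACROBLOCKS_PER_SUPERBLOCK

print(f"{blocks_wide} x {blocks_high} luminance blocks")
print(f"{macroblocks} macroblocks in {superblocks} superblocks")
```

The resulting count of 50 superblocks is consistent with the superblock tiling S0,0 to S9,4 of FIG. 1.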
DVC encoder chips are commercially available. These chips generally have two modes of operation: an encoding mode that converts video frames into an encoded stream of video segments, and a decoding mode that converts an encoded stream of video segments back into video frames. The basic operation of the encoding mode of a DVC encoder chip is shown in FIG. 2 as two concurrent processes, Process A and Process B.
Process A operates on an incoming pixel stream representing a raster-sampled video frame. Block 50 applies a horizontal lowpass filter to smooth the data. The smoothed pixels are gathered at block 52 until eight lines are present, representing 90 blocks of luminance data (45 blocks of chrominance data are also processed concurrently, not shown). An 8×8 Discrete Cosine Transform (DCT) is performed on each of the 90 pixel blocks at 54, and the blocks are stored to frame buffer A at block 56. This process loops until an entire frame of DCT data has been stored to frame buffer A, and then repeats for the next frame using a frame buffer B.
At the same time that Process A is performing DCTs and storing data to frame buffer A, Process B is reading stored DCT data (from the previous frame) from frame buffer B. Thus at block 60, Process B selects DCT data corresponding to a video segment that is to be created next. At block 62, it reads five macroblocks, corresponding to this DCT data, from frame buffer B. At block 64, these five macroblocks are encoded together into a fixed-length video segment by a complex quantization and coding process that can allow less-detailed macroblocks to "share" unused portions of their bit allotment with more-detailed macroblocks. In general, block 64 results in some loss of data in order to fit the five macroblocks into the allowable space, although the data discarded is selected to (hopefully) have a low impact on perceived picture quality. Finally, at block 66, the encoded video segment is output from the DVC chip and Process B loops back up to produce the next video segment.
The five macroblocks encoded in a DVC video segment are selected from scattered regions of the digital video frame in order to distribute the effects of physical data recording errors. FIGS. 3, 4a, 4b, and 4c illustrate how the five macroblocks corresponding to a particular video segment are selected. Generally, five superblocks S0,0, S1,6, S2,2, S3,8, and S4,4 are coded into the first twenty-seven video segments, each video segment representing one macroblock from each of the five superblocks shown. Scan paths 72 (FIG. 4a), 74 (FIG. 4b), and 76 (FIG. 4c) illustrate the order of macroblock selection for each particular superblock shape. Thus the first video segment will combine the first macroblock in scan path 72 for each of S0,0 and S2,2 with the first macroblock in scan path 74 for each of S1,6 and S3,8 and the first macroblock in scan path 76 from S4,4. The second video segment will combine the second macroblocks in these scan paths, etc.
When the five superblocks shown have been converted into twenty-seven video segments, encoding for those superblocks is complete. The process then performs a second encoding pass using the five superblocks immediately below the first five superblocks to generate 27 more video segments, and repeats. After the bottom superblock in any superblock column has been encoded, the process "wraps" to the head of that column on the next pass and continues until ten passes have been made.
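The selection order described in the two preceding paragraphs can be sketched as follows. This is a minimal illustration, not the normative ordering of IEC 61834: it assumes that the subscripts in S0,0, S1,6, S2,2, S3,8, and S4,4 read as (column, row), and it represents a macroblock only by its position along the scan path of its superblock. The function and constant names are hypothetical.

```python
# Sketch of which superblocks feed each video segment, as described in the text.
# Assumption: superblock subscripts are read as (column, row).
SUPERBLOCK_COLUMNS = 5
SUPERBLOCK_ROWS = 10
MACROBLOCKS_PER_SUPERBLOCK = 27
START_ROWS = (0, 6, 2, 8, 4)   # first-pass row for columns 0..4


def video_segment_sources(segment_index):
    """Return five (column, row, scan_position) triples for one video segment."""
    total = SUPERBLOCK_ROWS * MACROBLOCKS_PER_SUPERBLOCK
    if not 0 <= segment_index < total:
        raise ValueError("segment_index out of range")
    # Each pass of 27 segments drains five superblocks, one per column.
    pass_index, scan_position = divmod(segment_index, MACROBLOCKS_PER_SUPERBLOCK)
    return [
        # Each pass moves one superblock down, wrapping at the bottom row.
        (column, (START_ROWS[column] + pass_index) % SUPERBLOCK_ROWS, scan_position)
        for column in range(SUPERBLOCK_COLUMNS)
    ]


print(video_segment_sources(0))
print(video_segment_sources(27))
```

Under these assumptions, the 270 video segments together visit each of the 1350 macroblock positions exactly once.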
The DVC process provides efficient digital compression for its designed frame formats, and low-cost DVC chips are available. Unfortunately, the staggered five-macroblock-shared video segment design hinders efficient use of the DVC chip with any frame format other than those of its design. For instance, a quarter-VGA (QVGA) frame is 320 pixels wide by 240 pixels high, less than one-fourth the size of a DVC 525-60 frame (720×480 pixels). If a QVGA subframe were inserted in the top left corner of an otherwise blank DVC 525-60 frame, 3.5 out of every 4.5 pixels in the frame (77.8%) would be blank. But because at least one macroblock of pixels from the QVGA subframe would appear in all but 21 of the 270 video segments created for this frame, over 92% of the fixed-size video segments must be kept intact in order to preserve the QVGA subframe information. The net result is that the DVC-coded subframe requires more bits to represent a lossy-coded version of the QVGA subframe than the original QVGA subframe required.
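The percentages in the preceding paragraph follow directly from the frame sizes and segment counts given there. The short Python check below reproduces them; the count of 21 segments that contain no subframe data is taken from the text rather than derived, and the variable names are illustrative.

```python
# Reproduce the QVGA-in-DVC figures quoted in the text.
qvga_pixels = 320 * 240
dvc_pixels = 720 * 480

blank_fraction = 1 - qvga_pixels / dvc_pixels

total_segments = 270
segments_without_subframe = 21          # figure stated in the text, not derived here
segments_to_keep = total_segments - segments_without_subframe
kept_fraction = segments_to_keep / total_segments

print(f"blank pixels: {blank_fraction:.1%}")        # blank pixels: 77.8%
print(f"segments to keep: {kept_fraction:.1%}")     # segments to keep: 92.2%
```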
The embodiments illustrated herein show an alternative approach that allows standard DVC chips to be used to efficiently code a QVGA subframe, or any other subframe data. Generally, this approach redistributes blocks, from a desired subframe, throughout a DVC frame to correspond with selected DVC video segments, ensuring that video segments of interest will generally be filled with subframe data. By judicious selection of a redistribution mapping, buffer space requirements can be decreased and full DVC compression efficiency can be realized on a subframe. For instance, with a proper redistribution mapping, a QVGA subframe can be represented while discarding 77.8% of the DVC video segments.
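One simple redistribution mapping, offered here only as an illustrative sketch and not necessarily the mapping used in the embodiments, assigns consecutive subframe macroblocks to consecutive slots of the earliest video segments in encoding order. The sketch assumes the segment ordering of FIGS. 3 and 4a to 4c with superblock subscripts read as (column, row), and assumes the subframe tiles evenly into four-block macroblocks. All function and constant names are hypothetical.

```python
import math

# Sketch: pack subframe macroblocks into the earliest video segments.
MACROBLOCKS_PER_SEGMENT = 5
MACROBLOCKS_PER_SUPERBLOCK = 27
SUPERBLOCK_ROWS = 10
START_ROWS = (0, 6, 2, 8, 4)            # assumed (column, row) reading of FIG. 3
LUMA_SAMPLES_PER_MACROBLOCK = 4 * 64    # four 8x8 luminance blocks


def segments_needed(width, height):
    """Number of video segments required to hold a width x height subframe."""
    macroblocks = (width * height) // LUMA_SAMPLES_PER_MACROBLOCK
    return math.ceil(macroblocks / MACROBLOCKS_PER_SEGMENT)


def redistribute(macroblock_index):
    """Map the n-th subframe macroblock to (segment, column, row, scan_position)."""
    segment, column = divmod(macroblock_index, MACROBLOCKS_PER_SEGMENT)
    pass_index, scan_position = divmod(segment, MACROBLOCKS_PER_SUPERBLOCK)
    row = (START_ROWS[column] + pass_index) % SUPERBLOCK_ROWS
    return segment, column, row, scan_position


needed = segments_needed(320, 240)
print(needed, f"{1 - needed / 270:.1%} of segments discarded")
print(redistribute(0), redistribute(299))
```

With this packing, the 300 macroblocks of a QVGA subframe fill exactly 60 of the 270 video segments, which matches the 77.8% discard figure given above.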
In one aspect of the invention, a method for encoding a digital image is disclosed. The method uses a digital video coder, such as a DVC coder, that encodes a digital video frame using video segments. A digital image to be encoded is segmented into a set of blocks, and the blocks are presented to the digital video coder as part of a larger, synthesized digital video frame. The blocks are inserted into the digital video frame so as to substantially occupy frame locations corresponding to selected video segments in the video segment encoding order. The synthesized digital video frame is encoded with the digital video coder to produce a coded output stream comprising multiple video segments. From the coded output stream, those video segments corresponding to the digital image are selected.
In another aspect of the invention, a method for transmitting a digital video sequence is disclosed. The method uses a digital video coder, such as a DVC coder, that encodes a digital video frame using video segments. An original frame of the digital video sequence is segmented into a set of blocks, and the blocks are presented to the digital video coder as part of a larger, synthesized digital video frame. The blocks are inserted into the digital video frame so as to substantially occupy frame locations corresponding to selected video segments in the video segment encoding order. The synthesized digital video frame is encoded with the digital video coder to produce a coded output stream comprising multiple video segments. From the coded output stream, those video segments corresponding to the original frame are selected and transmitted to a receiver. The selected video segments are inserted into a coded input stream, which is supplied to a digital video decoder for decoding into a second synthesized digital video frame. From the second synthesized digital video frame, reconstructed blocks corresponding to the set of blocks of the original frame of the digital video sequence are selected and combined to form an output digital video frame corresponding to the original frame.
In yet another aspect of the invention, a digital video encoding system is disclosed. The digital video encoding system uses a digital video coder that encodes input digital video frames into output video segments, each video segment representing data from multiple scattered regions of a digital video frame input to the digital video coder. The system also has an input frame buffer, and a mapper to map blocks of data from the input frame buffer to a synthesized digital video frame for input to the digital video coder. The blocks of data are mapped such that they substantially occupy frame locations of the digital video frame corresponding to selected video segments in the video segment encoding order of the digital video coder. The system also has a data selector to select video segments from the digital video coder output corresponding to the blocks of data mapped from the input frame buffer.
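The role of the data selector can be illustrated with a short sketch. It makes simplifying assumptions that are not stated in the text: the coder output is treated as a plain concatenation of fixed-length video segments in encoding order, with no audio or system data interleaved, and the subframe is assumed to occupy the first segments of that order. The function name `select_segments` and the segment size used in the example are hypothetical.

```python
def select_segments(coded_stream, segment_size, keep_count):
    """Return the first keep_count fixed-length segments of a coded stream.

    Assumes coded_stream is a bytes object made only of back-to-back
    segments of segment_size bytes each (a simplification).
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if len(coded_stream) % segment_size != 0:
        raise ValueError("stream length is not a whole number of segments")
    if keep_count > len(coded_stream) // segment_size:
        raise ValueError("stream holds fewer segments than requested")
    return [
        coded_stream[i * segment_size:(i + 1) * segment_size]
        for i in range(keep_count)
    ]


# Toy example with 2-byte "segments"; a real segment is far larger.
print(select_segments(b"aabbccdd", 2, 2))
```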
In a further aspect of the invention, a digital video decoding system is disclosed. The digital video decoding system uses a digital video decoder that decodes input digital video segments into output video frames, each video segment representing data from multiple scattered regions of an output digital video frame. The system also has an input data buffer to buffer video segments, and a data padder to concatenate video segments from the input data buffer with dummy video segments for input to the digital video decoder. The system also has a subframe extractor to map the digital video frame regions corresponding to the video segments supplied from the input data buffer into a reconstructed digital video frame.
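The data padder can be sketched in the same spirit. The example below assumes that the received video segments belong at the start of the encoding order and that a dummy segment can be represented by a run of fill bytes; a real decoder may instead require a validly formatted empty segment. The function name `pad_segments`, the `fill` parameter, and the sizes used in the example are hypothetical.

```python
def pad_segments(received, total_segments, segment_size, fill=b"\x00"):
    """Concatenate received segments with dummy segments into one full-frame input.

    Assumes every segment is exactly segment_size bytes and that fill is a
    single byte used to build placeholder dummy segments.
    """
    if len(fill) != 1:
        raise ValueError("fill must be a single byte")
    if len(received) > total_segments:
        raise ValueError("more segments received than the frame holds")
    if any(len(segment) != segment_size for segment in received):
        raise ValueError("every segment must be segment_size bytes long")
    dummy = fill * segment_size
    return b"".join(received) + dummy * (total_segments - len(received))


# Toy example: two received 2-byte segments padded out to four segments.
print(pad_segments([b"ab", b"cd"], 4, 2))
```

After decoding, the subframe extractor would apply the inverse of whatever block mapping the encoder used, reading only the frame regions fed by the received segments.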