Generally, VTRs are designed to receive and store data signals representing video (and audio) information by recording the data on a magnetic tape in a series of tracks. In addition, most VTRs are designed to support both normal and trick playback operation, i.e., fast forward and reverse operation.
The use of digital video signals, e.g., digital high definition television ("HDTV") signals, which are normally transmitted in a compressed format, presents problems with regard to the implementation of trick playback operation in VTRs.
Various systems have been proposed that would locate data selected to be used during trick play operation in specific locations within the tracks on a tape, referred to as trick play tape segments, so that at least a minimum amount of data required to produce recognizable images during trick playback operation can be read in a reliable manner from the tape. However, because of limitations on the amount of data that can be read back from the tape during trick play operation using such systems, video images used for trick play operation must usually be represented using considerably less data than is used to represent images, e.g., frames, that are displayed during VTR normal playback operation.
In order to maintain compatibility with a receiver, it is expected that reduced resolution video frames used for VTR trick play operation will have to conform to the same video data standard that is used to represent full resolution video frames during VTR normal play operation.
The International Standards Organization has set a standard for video data compression for generating a compressed digital data stream that is expected to be used for digital television. This standard is referred to as the ISO MPEG-2 (International Standards Organization--Moving Picture Experts Group) ("MPEG-2") standard.
Because the MPEG-2 standard is likely to be used in a wide variety of digital video applications, it is highly desirable that a VTR be capable of generating video images, e.g., low resolution video frames, that are both MPEG-2 compliant and also suitable for use during trick play operation, e.g., because of the relatively small amount of data used to represent the video images.
For the purposes of this application, unless indicated otherwise, terms will be used in a manner that is consistent with the MPEG-2 standard that is described in the International Standards Organization--Moving Picture Experts Group, Drafts of Recommendation H.262, ISO/IEC 13818-1 and 13818-2 titled "Information Technology Generic Coding Of Moving Pictures and Associated Audio" (hereinafter "the November 1993 ISO-MPEG Committee draft") hereby expressly incorporated by reference. Any references made in this patent application to MPEG-2 data streams are to be understood to refer to data streams that comply with MPEG-2 standards as defined in the November 1993 ISO-MPEG Committee draft.
MPEG-2 provides for video images to be encoded into a series of macroblocks with each macroblock corresponding to a different spatial portion of a video image. Each macroblock includes a plurality of luminance blocks, e.g., four luminance blocks, and a plurality of chrominance blocks, with each block being encoded using a discrete cosine transform ("DCT") coding operation. Referring now to the figures, FIG. 1A illustrates a 4:2:0 macroblock structure in which four luminance blocks and two chrominance blocks are used to represent the spatial area of a video image corresponding to the macroblock. FIG. 1B illustrates a 4:2:2 macroblock structure which uses four luminance blocks and four chrominance blocks, while FIG. 1C illustrates a 4:4:4 macroblock structure where four luminance blocks and eight chrominance blocks are used to represent the spatial area of a video image corresponding to the macroblock.
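By way of illustration only, the block counts of the three macroblock structures described above may be tabulated as in the following sketch. The names MACROBLOCK_BLOCKS and blocks_per_macroblock are hypothetical and are used here solely for the purpose of explanation; they are not part of the MPEG-2 standard.

```python
# Block counts per macroblock for the structures of FIGS. 1A, 1B and 1C.
# MACROBLOCK_BLOCKS and blocks_per_macroblock are hypothetical names
# introduced only for this illustration.
MACROBLOCK_BLOCKS = {
    "4:2:0": {"luminance": 4, "chrominance": 2},
    "4:2:2": {"luminance": 4, "chrominance": 4},
    "4:4:4": {"luminance": 4, "chrominance": 8},
}


def blocks_per_macroblock(chroma_format):
    """Return the total number of blocks in one macroblock."""
    counts = MACROBLOCK_BLOCKS[chroma_format]
    return counts["luminance"] + counts["chrominance"]


if __name__ == "__main__":
    for chroma_format in MACROBLOCK_BLOCKS:
        print(chroma_format, blocks_per_macroblock(chroma_format))
```

In each structure the number of luminance blocks is the same; only the number of chrominance blocks, and hence the total amount of data per macroblock, differs.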
In accordance with MPEG-2, video images, e.g., HDTV frames, may be encoded into either a frame picture or a pair of field pictures. Each frame comprises one or more macroblocks. Macroblocks of a frame picture may be encoded using a frame DCT format or a field DCT format with the resulting macroblock being referred to as a frame macroblock or a field macroblock of a frame picture.
Referring now to FIG. 1D, on the left there is illustrated a section of a video image 11 corresponding to, e.g., a block of 16×16 pixels where each row, i.e., raster of pixels, corresponds to, e.g., 16 pixels. On the right, there is illustrated the luminance portion of a frame macroblock of a frame picture corresponding to the video image 11 with the four blocks 12, 13, 14, 15, each corresponding to, e.g., an 8×8 block of pixels of the video image 11 as illustrated. It should be noted, for the purpose of clarity, that in FIGS. 1D, 1E and 1F the white portions of the illustrated blocks correspond to one instance in time while the shaded portions correspond to another instance in time.
Referring now to FIG. 1E, on the left there is illustrated a section of a video image 20, e.g., a 16×16 block of pixels. On the right there is illustrated the luminance portion of a field macroblock of a frame picture which comprises the four blocks 22, 23, 24, 25. As illustrated, the structure of a field macroblock of a frame picture is such that each block corresponds to only odd or even rows of pixels while the structure of a frame macroblock of a frame picture provides for each block to correspond to both odd and even rows of pixels.
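The difference between the frame macroblock and field macroblock structures of a frame picture may be sketched, for purposes of explanation only, as follows. The function name split_luminance is hypothetical. The sketch further assumes that rows are numbered from zero and that, in the field macroblock structure, the upper two blocks receive the even rows and the lower two blocks receive the odd rows; the block ordering is an assumption made for the example.

```python
def split_luminance(rows, field_dct):
    """Split a 16x16 luminance area into four 8x8 blocks.

    rows      -- list of 16 rows, each a list of 16 samples
    field_dct -- False: frame macroblock structure, each block holds
                 consecutive rows (both odd and even rows)
                 True: field macroblock structure, each block holds
                 only even rows or only odd rows
    The block order (upper-left, upper-right, lower-left, lower-right)
    is an assumption of this sketch.
    """
    if field_dct:
        # Even rows first, then odd rows (rows numbered from 0).
        ordered = rows[0::2] + rows[1::2]
    else:
        ordered = list(rows)
    upper, lower = ordered[:8], ordered[8:]
    return [
        [r[:8] for r in upper],
        [r[8:] for r in upper],
        [r[:8] for r in lower],
        [r[8:] for r in lower],
    ]


if __name__ == "__main__":
    # Each sample is labelled with the number of the row it came from.
    image = [[row] * 16 for row in range(16)]
    for label, flag in (("frame", False), ("field", True)):
        blocks = split_luminance(image, flag)
        print(label, [[line[0] for line in block] for block in blocks])
```

Printing the source row of each block line shows that, in the frame macroblock structure, every block mixes odd and even rows, whereas in the field macroblock structure every block draws from rows of a single parity.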
Referring now to FIG. 1F, there is illustrated on the left a portion of a video image 30, e.g., a block of 16×32 pixels. On the right, there is illustrated the structure for the luminance portion 31, 32 of a macroblock of a first field and a second field of a field picture. Macroblocks of field pictures, unlike macroblocks of frame pictures, can be encoded using only a single format. As illustrated in FIG. 1F, a macroblock 31 of the first field of a field picture corresponds to the even rows of pixels of the video image 30, e.g., frame, while a macroblock 32 of the second field of a field picture corresponds to the odd rows of pixels of the video image 30.
Accordingly, each macroblock of a field picture spans the same number of columns of pixels as a frame picture macroblock but twice as many rows with each macroblock of the first field containing data corresponding to the even rows and the macroblock of the second field containing data corresponding to the odd rows.
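The correspondence between the rows of a field picture macroblock and the rows of the underlying frame may be sketched as follows. This is an illustration only: the names MACROBLOCK_ROWS and frame_rows_for_field_macroblock are hypothetical, and the sketch assumes that frame rows are numbered from zero so that the first field holds rows 0, 2, 4 and so on.

```python
MACROBLOCK_ROWS = 16  # rows of luminance samples in one macroblock


def frame_rows_for_field_macroblock(field, mb_row=0):
    """Return the frame rows covered by one field picture macroblock.

    field  -- 1 for the first field (even frame rows),
              2 for the second field (odd frame rows)
    mb_row -- index of the macroblock row within the field
    Row numbering from zero is an assumption of this sketch.
    """
    if field not in (1, 2):
        raise ValueError("field must be 1 or 2")
    parity = 0 if field == 1 else 1
    first = mb_row * MACROBLOCK_ROWS
    return [2 * (first + r) + parity for r in range(MACROBLOCK_ROWS)]


if __name__ == "__main__":
    rows = frame_rows_for_field_macroblock(1)
    # The 16 rows of the macroblock are spread over 32 rows of the frame.
    print(rows[0], rows[-1], rows[-1] - rows[0] + 2)
```

Under these assumptions, the sixteen rows of a field picture macroblock are drawn from a span of thirty-two frame rows, which is twice the span covered by a frame picture macroblock.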
Thus, when represented as field pictures, each frame is represented as two separate fields which represent different instances in time. Accordingly, field pictures represent interlaced video images. MPEG-2 provides for an optional digital storage media ("DSM") header control byte which includes a two-bit field_id flag which is used to support freeze frame mode when using interlaced pictures. By setting the field_id flag of the DSM byte to a preselected value, a receiver can be made to display only the first field, only the second field, or both fields of a field picture.
In the case of trick play, where a frame may be repeated several times, the use of interlaced video may cause annoying flicker as the first and second fields of a video frame which includes motion are repeated several times. For example, if during trick play a frame represented as a field picture is repeated three times, the fields of the frame would be displayed in the following sequence: field 1, field 2, field 1, field 2, field 1, field 2.
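The field display order described above may be sketched as follows, for purposes of explanation only. The function name field_display_sequence is hypothetical, and its field_id parameter uses the descriptive labels "both", "first" and "second" as stand-ins for the preselected flag values; the actual bit values of the field_id flag are not reproduced here.

```python
def field_display_sequence(repeats, field_id="both"):
    """Return the order in which fields are displayed when one frame,
    represented as a field picture, is shown `repeats` times.

    field_id -- "both", "first" or "second"; hypothetical labels that
                stand in for the preselected field_id flag values.
    """
    per_frame = {"both": [1, 2], "first": [1], "second": [2]}[field_id]
    return per_frame * repeats


if __name__ == "__main__":
    # Both fields shown: the display alternates between two instances
    # in time, which is the source of flicker when there is motion.
    print(field_display_sequence(3))
    # Only the first field shown: a single instance in time is repeated.
    print(field_display_sequence(3, field_id="first"))
```

When both fields are displayed, the sequence alternates between the two instances in time on every repetition; when a single field is selected, the same instance in time is repeated.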
In order to support trick play operation in a digital VTR, e.g., a digital VTR that is designed to work with an MPEG-2 data stream, because of the limited space available on a tape for storing trick play data, it is desirable that a VTR be capable of generating low resolution or reduced resolution images from an MPEG-2 data stream representing, e.g., an HDTV signal, which can then be used to support trick play operation.
Accordingly, there is a need for a method and apparatus for generating video frames, e.g., low resolution video frames, using a small amount of data, that are suitable for recording in trick play tape segments on a tape and for reading back from the tape during trick play operation.
Furthermore, it is highly desirable that the low resolution video frames be capable of being generated from the normal play video data which a VTR is expected to receive, so that the VTR need not be sent additional data, separate from the normal play video data, to support trick play operation.
In addition, it is desirable that the low resolution video frames used for trick play operation produce a minimal amount of flicker, which can be annoying to a viewer.