Digital still cameras that can also record short video clips have recently been introduced to the market, as have digital video cameras capable of recording still images. These cameras store the image data using only intraframe compression. Some cameras achieve a higher compression ratio for video by using MPEG encoding (e.g., the Hitachi MP-EG1A camera). However, any still image compressed with MPEG will suffer in quality for the reasons described below. As a result, the still images in cameras such as the Hitachi camera are encoded in a separate bitstream using, for example, JPEG encoding.
Suzuki et al. have patented a use for encoding still images in an MPEG bitstream (U.S. Pat. No. 5,457,675). They encode a hierarchical menu system as a series of still images in an MPEG bitstream, which removes the need for a separate graphical processing unit. The still images that are encoded, however, are rather low-resolution images (menus) intended for display on a television, and therefore do not tax the limits of MPEG encoding.
MPEG1 and MPEG2 are lossy compression techniques for reducing the number of bits needed to represent digital video. The following discussion applies to the MPEG family of video compression standards. For simplicity, they will be collectively referred to as MPEG unless otherwise stated. In MPEG, the video is broken up into sequences of pictures called Groups of Pictures (GOPs). Each GOP contains up to three different types of pictures. The first type is referred to as an I picture because it is intra-coded. This means that this type of picture uses only spatial compression techniques, such as DCT-based compression, and does not rely on any temporal information. The second type of picture is called a P picture because it is predicted from the most recent picture that was either an I picture or a P picture. The third type of picture is a B picture, which is bidirectionally predicted from the closest I or P pictures. The I and the P pictures are also referred to as reference pictures because they can be referenced by other P or B pictures for the purpose of motion compensation.
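The prediction relationships described above can be sketched in a few lines. This is only an illustration: the function name `reference_pictures`, the representation of a GOP as a string of picture types in display order, and the use of `None` for a reference that lies outside the GOP are all assumptions made for the example and are not part of the MPEG standard.

```python
def reference_pictures(gop):
    """Map each picture index (display order) to the pictures it is predicted from.

    `gop` is a string such as "IBBPBBP" (hypothetical representation).
    I pictures have no references, P pictures use the most recent I or P
    picture, and B pictures use the closest I or P picture on each side.
    """
    anchors = [i for i, t in enumerate(gop) if t in "IP"]
    refs = {}
    for i, t in enumerate(gop):
        prev_anchor = max((a for a in anchors if a < i), default=None)
        next_anchor = min((a for a in anchors if a > i), default=None)
        if t == "I":
            refs[i] = ()
        elif t == "P":
            refs[i] = (prev_anchor,)
        elif t == "B":
            refs[i] = (prev_anchor, next_anchor)
        else:
            raise ValueError(f"unknown picture type: {t!r}")
    return refs


if __name__ == "__main__":
    for index, sources in reference_pictures("IBBPBBP").items():
        print(index, sources)
```

In this sketch no entry ever lists a B picture as a source, which reflects the statement that only I and P pictures serve as reference pictures.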
The pictures are further broken up into macroblocks of 16×16 pels (picture elements). There are several different types of macroblocks. Intra-coded macroblocks are encoded spatially. Forward predicted macroblocks contain a motion vector pointing to where this macroblock appeared in the previous reference picture. Backward predicted macroblocks contain a motion vector to where this block will appear in the next reference picture. Bidirectionally predicted macroblocks contain two motion vectors, one pointing to where this macroblock appeared in the previous reference picture, and one pointing to where this macroblock will appear in the next reference picture. The predicted macroblocks may contain not only a motion vector, but also some residual information. This information is a spatially compressed difference between the actual block of pixels in the image and the predicted block of pixels. A skipped macroblock contains no information and is essentially the same as a forward predicted macroblock with zero motion vectors and no residual information.
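As a rough illustration of forward prediction, the sketch below rebuilds one macroblock from a reference picture, a motion vector, and an optional residual. It is a simplification under stated assumptions: pictures are plain lists of rows, motion vectors are whole-pel (dy, dx) offsets, the residual is taken as already decoded, and no clipping or boundary handling is done. The names `predict_block` and `reconstruct` are hypothetical.

```python
MB = 16  # macroblock size in pels


def predict_block(reference, top, left, mv):
    """Fetch the 16x16 block that the motion vector points to in the reference picture."""
    dy, dx = mv
    rows = reference[top + dy : top + dy + MB]
    return [row[left + dx : left + dx + MB] for row in rows]


def reconstruct(reference, top, left, mv=(0, 0), residual=None):
    """Rebuild a forward predicted macroblock: prediction plus optional residual.

    With the default arguments (zero motion vector, no residual) this behaves
    like the skipped macroblock described in the text.
    """
    predicted = predict_block(reference, top, left, mv)
    if residual is None:
        return predicted
    return [
        [p + r for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(predicted, residual)
    ]


if __name__ == "__main__":
    # Synthetic 32x32 reference picture whose pel value encodes its position.
    ref = [[y * 100 + x for x in range(32)] for y in range(32)]
    block = reconstruct(ref, 16, 16, mv=(-2, -3))
    print(block[0][0])  # pel fetched from row 14, column 13 of the reference
```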
The I pictures can only contain intra-coded macroblocks. The P pictures may contain intra-coded macroblocks, forward predicted macroblocks, and skipped macroblocks. The B pictures may contain all types of macroblocks. Since there is a lot of temporal redundancy in video, the number of bits required to encode a P picture or a B picture is substantially less than the number of bits required to encode an I picture at the same image quality level. For example, in the TM5 encoder, which is an informative part of the MPEG2 standard document (ISO/IEC 13818-2), the I pictures are, as a rule of thumb, about three times the size of the P pictures and six times the size of the B pictures. In MPEG, the image quality is allowed to vary from one picture to another, and it can also vary spatially within a picture. The image quality is heavily dependent upon the encoder implementation, which has not been standardized. Different MPEG encoders may make different decisions about how to code a particular macroblock, or about which motion vector best describes the motion of that macroblock. As a result, different MPEG encoders will produce images of different quality while using the same number of bits. Generally, for a given encoder, as the number of bits spent per second (referred to as the bit rate) decreases, the number of artifacts within each picture will increase. Similarly, as the bit rate increases, the artifacts within each picture will decrease.
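To make the rule of thumb concrete, the sketch below splits a GOP's bit budget so that an I picture receives three times the bits of a P picture and six times the bits of a B picture. It is not the TM5 rate control algorithm, only an arithmetic illustration of the quoted ratio. The function name `allocate_bits`, the GOP pattern, and the budget of 2,000,000 bits are hypothetical.

```python
# Relative picture sizes implied by the rule of thumb: I = 3 x P = 6 x B.
RELATIVE_SIZE = {"I": 6, "P": 2, "B": 1}


def allocate_bits(gop, gop_bits):
    """Split `gop_bits` across the pictures of `gop` in proportion to RELATIVE_SIZE."""
    total_weight = sum(RELATIVE_SIZE[t] for t in gop)
    return [gop_bits * RELATIVE_SIZE[t] // total_weight for t in gop]


if __name__ == "__main__":
    pattern = "IBBPBBPBBPBB"  # hypothetical 12-picture GOP
    for picture_type, bits in zip(pattern, allocate_bits(pattern, 2_000_000)):
        print(picture_type, bits)
```

With this pattern the weights sum to 20, so the single I picture takes 30 percent of the GOP's bits while each B picture takes 5 percent.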
In the MPEG standard, the final image quality is determined by the elements of a 2-dimensional 8×8 matrix referred to as the quantization matrix. Each element of this matrix determines the step size of the quantizer used to quantize the DCT coefficient located at the corresponding location in the 8×8 DCT matrix. For intra-coded blocks, the DC coefficients of the DCT are quantized separately using a parameter called "intra DC mult". The encoder has the choice of either using a default quantization matrix or downloading a custom matrix. For the default matrix, the values of the elements depend on the macroblock type. For example, for nonintra macroblocks, where the DCT is usually performed on motion-compensated difference blocks, all the quantization matrix elements have the same value. On the other hand, for intra macroblocks, the values of the matrix elements are based on the properties of the human visual system such as the contrast sensitivity function (CSF). In general, the quantizer matrix elements are larger for the higher spatial frequencies, corresponding to coarser quantization of such spatial frequencies. These default quantization matrices can be found in the MPEG standard documents. To achieve the various levels of image quality desired by a given application, the elements of the quantization matrix need to be modified. In MPEG, either a default quantization matrix can be used for each picture, or a new set of intra and nonintra quantization matrices can be downloaded at the beginning of each picture. However, the elements of the quantization matrix can also be changed from one macroblock to another within the same picture. In order to minimize the number of overhead bits required to signal this change, only a scaling of the original (default or downloaded) matrix elements is permitted. This scaling is performed with a 5-bit parameter referred to as "quantiser_scale_code" or MQUANT.
The MQUANT parameter is used to provide rate-distortion control. It does not apply to the DC coefficients of intra-coded blocks, which, as noted above, are quantized using the separate "intra DC mult" parameter.
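The effect of scaling the quantization matrix can be sketched as follows. This is a simplified model, not the exact arithmetic of the standard: it assumes the step size for each coefficient is the matrix element times the scale divided by 16, it uses the 5-bit value directly as the scale, and it ignores the separate handling of intra DC coefficients. The function names and the sample coefficient values are hypothetical; the flat matrix of 16s mirrors the uniform default nonintra matrix.

```python
def quantize_block(coeffs, matrix, mquant):
    """Quantize an 8x8 block with step = matrix[v][u] * mquant / 16 (simplified model)."""
    if not 1 <= mquant <= 31:
        raise ValueError("a 5-bit scale is assumed to lie in 1..31")
    return [
        [round(16 * c / (w * mquant)) for c, w in zip(c_row, w_row)]
        for c_row, w_row in zip(coeffs, matrix)
    ]


def dequantize_block(levels, matrix, mquant):
    """Invert quantize_block up to the rounding error introduced by quantization."""
    return [
        [level * w * mquant / 16 for level, w in zip(l_row, w_row)]
        for l_row, w_row in zip(levels, matrix)
    ]


if __name__ == "__main__":
    flat = [[16] * 8 for _ in range(8)]    # uniform matrix, as for nonintra blocks
    block = [[90] * 8 for _ in range(8)]   # hypothetical DCT coefficients
    for scale in (1, 8, 31):
        levels = quantize_block(block, flat, scale)
        restored = dequantize_block(levels, flat, scale)
        print(scale, levels[0][0], restored[0][0])
```

A larger scale produces smaller quantized levels, which cost fewer bits to encode, at the price of a larger reconstruction error. That trade-off is what makes the parameter usable for rate-distortion control.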
Since, in the MPEG standard, the P and the B pictures are encoded with reference to other pictures, the picture quality of a given reference picture can affect the picture quality of other pictures. For example, if an I picture is encoded with poor image quality, the following P picture that is predicted from this I picture will also suffer from poor image quality unless more bits are spent to encode that P picture. Similarly, if the I and the P pictures are encoded with poor image quality, the B pictures will be encoded with poor image quality, as they are predicted from the I and the P pictures. Conversely, if the image quality of the I picture is increased, the image quality of the subsequent P pictures will usually increase without an increase in the number of bits used to encode them. Since the I and the P pictures then have higher quality, the B pictures will also have higher image quality without an increase in the number of bits spent on them. It should be noted that the allocation of more bits to a B picture will only improve the image quality of that B picture, as no other pictures are predicted from the B pictures.
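The way a quality change spreads through a GOP can be traced with a small dependency walk. The sketch below is an illustration only: the function name `affected_by` and the representation of a GOP as a string of picture types in display order are assumptions, and the function reports which pictures depend on a given picture, directly or through a chain of predictions, without modelling how large the quality change is.

```python
def affected_by(gop, source):
    """Return the pictures whose prediction chain includes picture `source`."""
    anchors = [i for i, t in enumerate(gop) if t in "IP"]

    def refs(i):
        earlier = [a for a in anchors if a < i]
        later = [a for a in anchors if a > i]
        if gop[i] == "I":
            return []
        if gop[i] == "P":
            return earlier[-1:]           # most recent I or P picture
        return earlier[-1:] + later[:1]   # closest I or P picture on each side

    affected = {source}
    # Visit reference pictures first, in order, so changes propagate along
    # the chain of P pictures before the B pictures are examined.
    order = anchors + [i for i, t in enumerate(gop) if t == "B"]
    for i in order:
        if any(r in affected for r in refs(i)):
            affected.add(i)
    return sorted(affected)


if __name__ == "__main__":
    pattern = "IBBPBBP"  # hypothetical GOP
    print(affected_by(pattern, 0))  # change made to the I picture
    print(affected_by(pattern, 3))  # change made to the first P picture
    print(affected_by(pattern, 1))  # change made to a B picture
```

For this pattern, a change to the I picture reaches every picture in the GOP, while a change to a B picture reaches only that picture, in line with the observation that no other pictures are predicted from B pictures.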
MPEG encoders offer several different levels of flexibility. One level of flexibility is the variability of the bit rate. A variable bit rate encoder allows the number of bits spent on a given picture to be rather arbitrary. For example, on a digital video disk (DVD), the video is encoded using MPEG2. The burst rate for reading from a DVD is around 11 Mb/s (megabits per second). Although a typical bit rate for DVD video is approximately 3.5 Mb/s, a particularly detailed portion of the video can be encoded with significantly more bits. However, even at the maximum bit rate, the number of available bits may not be sufficient to encode a still image in the MPEG bitstream with the desired level of quality.
In contrast to variable bit rate encoders, a fixed bit rate encoder only allows for constant bit rates. Generally, this means that the number of bits spent per second is fixed. In practice, this reduces to keeping the number of bits spent per group of pictures (GOP) constant. It may be impossible to encode a high-fidelity image with an MPEG encoder that is constrained to a constant bit rate, because the encoder will not be able to increase the number of bits for a given picture to the level necessary to achieve the desired image quality.
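The constant-bit-rate constraint can be expressed as a simple budget calculation. The sketch below is only arithmetic: the function name `gop_bit_budget`, the 15-picture GOP, and the frame rate of 30 pictures per second are assumptions chosen for the example, while the 3.5 Mb/s and 11 Mb/s figures are the DVD rates quoted earlier.

```python
def gop_bit_budget(bit_rate, pictures_per_gop, frame_rate):
    """Bits available to one GOP when the bit rate (bits per second) is held constant."""
    return bit_rate * pictures_per_gop / frame_rate


if __name__ == "__main__":
    # Hypothetical GOP of 15 pictures at 30 pictures per second (half a second of video).
    typical = gop_bit_budget(3_500_000, 15, 30)
    burst = gop_bit_budget(11_000_000, 15, 30)
    print(f"typical DVD rate: {typical:,.0f} bits per GOP")
    print(f"burst DVD rate:   {burst:,.0f} bits per GOP")
```

Under these assumptions the whole GOP has 1.75 million bits at the typical rate, and that budget must be shared by all 15 pictures. Whatever a single picture needs beyond its share has to be taken from the other pictures in the same GOP.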
There is a need, therefore, for an improved MPEG compressed video bitstream from which high-fidelity still images can be produced.