The present invention relates to the field of image processing. More specifically, the present invention relates to methods and apparatus for efficiently and concurrently applying video encoding techniques to convert analog data into digital formats, such as Digital Video (DV) format. This technique is especially suited for widely-used image compression standards that integrate various algorithms into a compression system, such as the standards specified in the DV Standard (DV-SD or the "Blue Book"), which is included herein by reference in its entirety and for all purposes.
The DV format is quickly becoming the standard for many consumer electronic video devices. For example, DV format camcorders can now be found with more frequency and at more competitive prices than the conventional analog 8 mm and VHS camcorders. At the same time, DV camcorders provide advantages which are inherent to digital technology, such as high quality of video and sound, digital filtering, digital error correction, and the like. DV provides quality at or higher than the high-end of the conventional analog camcorders such as Hi-8 mm and S-VHS, with much added flexibility. Also, digital format data can be repeatedly copied without loss of quality.
In the DV standard, the compression ratio is expected to be around 5:1. This means that excessive truncation of redundant data is unnecessary, and image quality is compromised less than under some other digital standards currently available on the market. In order to maintain a constant compression ratio, the compression analysis can be performed on the image at the video segment level. Each image frame consists of 270 video segments under the National Television System Committee (NTSC) standard and 324 video segments under the Phase Alternating Line (PAL) standard. Each video segment consists of five (5) macro blocks, and each macro block contains six (6) blocks of 8×8 pixels.
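The block structure described above can be checked with simple arithmetic. The following is a minimal sketch, not part of the DV standard text; the function name `frame_counts` is hypothetical, and the figure of four luminance blocks per macro block is an assumption that is not stated in this section.

```python
# Block-structure arithmetic for one DV frame, using the counts quoted above.
SEGMENTS_PER_FRAME = {"NTSC": 270, "PAL": 324}
MACRO_BLOCKS_PER_SEGMENT = 5
BLOCKS_PER_MACRO_BLOCK = 6
PIXELS_PER_BLOCK = 8 * 8

# Assumption (not stated in the text): four of the six blocks in a macro
# block carry luminance; the remaining two carry color difference data.
LUMA_BLOCKS_PER_MACRO_BLOCK = 4


def frame_counts(system):
    """Return the segment, macro block, block, and luminance pixel counts."""
    segments = SEGMENTS_PER_FRAME[system]
    macro_blocks = segments * MACRO_BLOCKS_PER_SEGMENT
    blocks = macro_blocks * BLOCKS_PER_MACRO_BLOCK
    luma_pixels = macro_blocks * LUMA_BLOCKS_PER_MACRO_BLOCK * PIXELS_PER_BLOCK
    return {
        "segments": segments,
        "macro_blocks": macro_blocks,
        "blocks": blocks,
        "luma_pixels": luma_pixels,
    }


for system in ("NTSC", "PAL"):
    print(system, frame_counts(system))
```

Under the stated assumption, the luminance pixel counts match frame resolutions of 720×480 for NTSC and 720×576 for PAL.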
The DV standard utilizes the 8×8 blocks in performing compression (also known as "frame" compression). This data is provided by digitizing, frame by frame, an image that is originally in analog format. The analog image signals can originate from cable TV, analog camcorders, video cassette recorders (VCRs), and other similar analog sources. After digitization and encoding, the signals representing the image can be utilized by digital devices.
A well known technique in the prior art for compression of digitized data is to apply a discrete cosine transform (DCT) to a block of data to transform the data from the spatial domain to the frequency domain. The resulting coefficients in the frequency domain act as weighting factors for their respective cosine curves. For the background portions of the image data, coefficients corresponding to higher frequency data will have lower values. Conversely, coefficients corresponding to lower frequency data will have higher values.
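The behavior of the coefficients can be illustrated with a direct, unoptimized two-dimensional DCT. This is a sketch for illustration only: it uses the common orthonormal DCT-II definition, which is an assumption rather than the exact formulation of the DV standard, and the names `dct_2d` and `smooth` are hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (direct O(n^4) evaluation)."""
    n = len(block)

    def scale(k):
        # Normalization that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency index
        for v in range(n):      # horizontal frequency index
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (
                        block[y][x]
                        * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                    )
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


# A smooth "background" block: every row is the same gentle horizontal ramp.
smooth = [[100 + 2 * x for x in range(8)] for _ in range(8)]
coeffs = dct_2d(smooth)

print("DC coefficient:          ", round(coeffs[0][0], 3))
print("lowest horizontal AC:    ", round(coeffs[0][1], 3))
print("highest horizontal AC:   ", round(coeffs[0][7], 3))
```

For this smooth block, most of the energy lands in the lowest-frequency coefficients, and the highest-frequency coefficient is much smaller in magnitude than the lowest-frequency one.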
The transformation from the spatial domain to the frequency domain, however, does not by itself compress the digital data. After the digital data is transformed into the frequency domain, an adaptive quantization can be applied to compress the data. In particular, adaptive quantization truncates the coefficients corresponding to high frequency data, in most cases to zero. In essence, adaptive quantization compresses an image by deleting its finest details.
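The effect of quantization on high frequency coefficients can be sketched as follows. This is not the adaptive quantization scheme of the DV standard; it is a fixed, frequency-weighted quantizer chosen only to show how division and rounding drive small high frequency coefficients to zero. The step rule, the `base_step` parameter, and the sample coefficient values are all hypothetical.

```python
def quantize(coeffs, base_step=4.0):
    """Divide each coefficient by a step that grows with frequency, then round."""
    n = len(coeffs)
    levels = []
    for u in range(n):
        row = []
        for v in range(n):
            step = base_step * (1 + u + v)  # hypothetical step rule
            row.append(int(round(coeffs[u][v] / step)))
        levels.append(row)
    return levels


# Hypothetical coefficients: a large DC term and AC terms that shrink
# as the frequency indices u and v grow.
sample = [
    [800.0 if (u == 0 and v == 0) else 60.0 / (1 + u + v) for v in range(8)]
    for u in range(8)
]

levels = quantize(sample)
zeros = sum(1 for row in levels for value in row if value == 0)
print("quantized DC level:", levels[0][0])
print("coefficients quantized to zero:", zeros, "of 64")
```

Runs of zero-valued coefficients are what a later entropy coding stage can represent compactly, which is where the actual reduction in data occurs.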
In the DV standard, a user can choose from one of two kinds of DCT transforms. FIG. 1 illustrates the two options provided by the DV standard. Box 102 illustrates an 8×8 block of pixels. Even rows are identified by circles and labeled as rows 0, 2, 4, and 6. Odd rows in the box 102 are shown by X's and labeled as rows 1, 3, 5, and 7. Under the DV standard, the image block shown in the box 102 can be treated as two separate images. The separation is illustrated in FIG. 1 by boxes 104 and 106. Box 104 contains the image data from the even rows. Box 106 contains the image data from the odd rows. Under the DV standard, the DCT transformation can be applied to either the 8×8 block shown in the box 102 or individually to the blocks of data in boxes 104 and 106. Application of compression to the blocks individually is also known as "field" compression. This feature of the DV standard improves the image quality, especially for moving pictures.
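The separation of the block in box 102 into the even-row data of box 104 and the odd-row data of box 106 can be sketched as a simple row selection. The function name `split_fields` and the sample pixel values are hypothetical.

```python
def split_fields(block):
    """Split an 8x8 block into its even rows (0, 2, 4, 6) and odd rows (1, 3, 5, 7)."""
    even_rows = block[0::2]
    odd_rows = block[1::2]
    return even_rows, odd_rows


# Sample block in which each pixel value encodes its row and column,
# so the origin of every row remains visible after the split.
block = [[row * 10 + col for col in range(8)] for row in range(8)]
even_rows, odd_rows = split_fields(block)

print("even-row block:", len(even_rows), "rows x", len(even_rows[0]), "columns")
print("odd-row block: ", len(odd_rows), "rows x", len(odd_rows[0]), "columns")
```

Each resulting block has four rows of eight pixels, so the two blocks together hold exactly the data of the original 8×8 block.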
For example, in some DV camcorders, a user can be given the choice of which DCT-type transformation is applied during a given recording session. Different settings can be provided for sports events, still images, and the like. The sports mode can, for example, indicate that a user wants to capture images from a scene containing moving objects, whereas the still mode can indicate that a user is not going to be capturing images from a scene containing moving objects.
FIG. 2 illustrates an example of how selecting a 2×4×8 DCT-type versus an 8×8 DCT-type transformation will improve the quality of an image containing moving objects. Box 202 illustrates a video frame in accordance with the DV standard having a resolution of 720×480 for NTSC and 720×576 for PAL systems. Within the frame, an object 204 is shown and an arrow 206 illustrates the movement of the object 204. As a result of the movement, the object 204 will shift to a new location, such as shown in box 208. Again, the box 208 is a representation of the image having a 720×480 resolution. Box 210 is an exemplary illustration of what would happen to an image of the moving object 204 if an 8×8 DCT-type transformation were to be applied to the image of the moving object. As shown, the object 204 can be divided into objects 204A, 204B, 204C, and 204D. The image illustrated in the box 210 is merely illustrative and the amount of jaggedness of the object can be dependent upon many factors, such as the speed of the moving object 204, the rate at which the analog image is digitized, and the like.
Generally, for a flicker-free image quality, a video digitization device must be able to digitize at least 30 frames per second for NTSC and 25 frames per second for PAL. When dealing with frames containing moving objects, the 2×4×8 DCT-type transformation will provide a higher quality image because the odd and even fields of an image are transformed separately. Because compression is applied separately to these fields, the outlines of a moving object will be less likely to be jagged in the DV format video. Therefore, for such frames it is advantageous to apply a 2×4×8 DCT-type transformation (also known as "field" transformation).
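The jaggedness that arises when a moving object is captured in two fields can be illustrated with a small synthetic block. The sketch below assumes that the even rows and odd rows are captured at slightly different times, so the object appears shifted between them; the roughness measure, the helper names, and the pixel values are hypothetical and serve only as an illustration.

```python
def vertical_roughness(rows):
    """Sum of absolute differences between vertically adjacent pixels."""
    return sum(
        abs(a - b)
        for upper, lower in zip(rows, rows[1:])
        for a, b in zip(upper, lower)
    )


def field_row(object_start, width=8, object_width=4, background=50, foreground=200):
    """One row containing a bright object that starts at column object_start."""
    return [
        foreground if object_start <= x < object_start + object_width else background
        for x in range(width)
    ]


# Assumption: the object moves two columns between the capture of the
# even field and the capture of the odd field.
even_row = field_row(0)
odd_row = field_row(2)
frame_block = [even_row if y % 2 == 0 else odd_row for y in range(8)]

frame_roughness = vertical_roughness(frame_block)
field_roughness = (
    vertical_roughness(frame_block[0::2]) + vertical_roughness(frame_block[1::2])
)

print("roughness of the interleaved 8x8 block:", frame_roughness)
print("roughness of the two fields combined:  ", field_roughness)
```

In this example the interleaved block alternates between two different rows, which produces strong vertical detail, whereas each field taken alone is uniform from row to row.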
Conversely, an 8×8 DCT-type transformation is more advantageous with frames containing more still objects. One of these advantages is that a more efficient compression can be performed because the whole 8×8 block is considered when applying adaptive quantization. Also, as one would expect, applying a 2×4×8 DCT-type transformation to a still image will provide less efficient compression and can lower image quality unnecessarily.
As a result, a technique is desirable wherein a decision can automatically be made whether to use an 8×8 DCT-type transformation or a 2×4×8 DCT-type transformation on an 8×8 block.
The present invention provides new and improved apparatus and methods for video encoding, for example, to efficiently and concurrently apply encoding techniques to convert analog data into digital formats, such as Digital Video (DV) format. A pipelined system receives a block of video data and, based on computations and comparisons performed on the pixels within the block, determines which type of transformation is most appropriate for that block. In an embodiment, the pipelined system performs selected operations in parallel to save time and increase speed.
In another embodiment, a method is provided for determining whether to apply a transformation to selected portions of an image individually. This embodiment determines a sum of pixel values for the pixels in each of the different portions under consideration. The difference between the sum values is determined and compared with a threshold value. If the determined difference is higher than the threshold value, the transformation is applied to the different portions of the image individually.
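The sum-and-threshold decision described in this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the two portions are taken to be the even and odd rows of an 8×8 block, the difference is taken as an absolute value, and the function name, the threshold of 500, and the sample blocks are all hypothetical.

```python
def should_transform_individually(block, threshold):
    """Decide whether to transform the even-row and odd-row portions separately.

    Assumptions: the portions are the even and odd rows of the block, and
    the difference between the two sums is compared as an absolute value.
    """
    even_sum = sum(sum(row) for row in block[0::2])
    odd_sum = sum(sum(row) for row in block[1::2])
    return abs(even_sum - odd_sum) > threshold


# Still content: every row is identical, so both portions have equal sums.
still_block = [[50] * 8 for _ in range(8)]

# Moving content (hypothetical): a bright object is present in the even
# rows but has already left the block when the odd rows are captured.
moving_block = [
    ([200] * 4 + [50] * 4) if y % 2 == 0 else [50] * 8
    for y in range(8)
]

THRESHOLD = 500  # hypothetical value; the text does not specify one

print("still block  ->", should_transform_individually(still_block, THRESHOLD))
print("moving block ->", should_transform_individually(moving_block, THRESHOLD))
```

A result of `True` corresponds to applying the transformation to the portions individually, and `False` corresponds to applying it to the block as a whole.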
In yet another embodiment, a method is disclosed for determining whether to apply a transformation to selected portions of an image individually. The method determines a cross product of the first portion of the image and the second portion of the image. If the determined cross product is less than a threshold value, the transformation is applied to the different portions of the image individually.
In a further embodiment, an apparatus is disclosed for determining whether to apply a transformation to selected portions of an image individually. A first adder calculates the sum of pixel values for all pixels in each of the selected portions of the image. A second adder is coupled to the first adder and determines a difference between the calculated sums. A comparator is coupled to the second adder and compares the determined difference with a threshold value. The apparatus applies the transformation to the selected portions of the image individually if the determined difference is higher than the threshold value.
For further understanding of the nature and advantages of the present invention, together with other embodiments, reference should be made to the ensuing detailed description taken in conjunction with the accompanying drawings.