1. Field of the Invention
The present invention relates to bufferless compression of video data.
2. The Prior Art
With the development of multi-media systems, inputting live video into a computer system has become common. Video capture chips are used for capturing still images or live video, and may be used together with a video sensor and signal processing circuit to create a video camera. Although it would be desirable to include a USB interface in the video capture chip to interface with a computer, the bandwidth of the USB interface is much smaller than the data rate the camera generates.
At present, a USB interface has a bandwidth of 12 M bits per second, and only 8 M bits per second can be allocated to a single isochronous channel. In order to capture live video at a high resolution, the image data could be compressed. For example, a data rate for Common Interchange Format (CIF) resolution video (352×288) in 4:2:0 format at a rate of 30 frames per second is approximately 35.6 M bits/s. One way to transmit this data across a USB using an 8 M bits/s channel is to compress this data at a compression ratio of approximately 4.5:1. However, known lossless compression engines are not generally this effective, and all lossy compression engines utilize an intermediate buffer for compression of video data. This intermediate buffer substantially increases the manufacturing costs of such a system. Accordingly, hardware costs could be substantially reduced if this intermediate buffer were eliminated. Moreover, less CPU power is required to decompress the data.
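The arithmetic behind these figures can be checked with a short sketch. It assumes 8-bit samples, so that 4:2:0 video averages 12 bits per pixel; the names `raw_bit_rate`, `CIF_WIDTH`, `CIF_HEIGHT`, and `CHANNEL_BITS_PER_SECOND` are illustrative and do not come from the specification. The exact result depends on how "M" is counted and rounded, so it is close to, rather than identical with, the approximate figure quoted above.

```python
def raw_bit_rate(width, height, frames_per_second, bits_per_pixel):
    """Uncompressed video data rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second

CIF_WIDTH, CIF_HEIGHT = 352, 288

# Assumption: 8-bit samples. In 4:2:0, four Y samples share one U and one V
# sample, so the average is (4*8 + 8 + 8) / 4 = 12 bits per pixel.
BITS_PER_PIXEL_420 = 12

# Isochronous channel budget stated in the text, taken here as 8 * 10**6.
CHANNEL_BITS_PER_SECOND = 8_000_000

rate = raw_bit_rate(CIF_WIDTH, CIF_HEIGHT, 30, BITS_PER_PIXEL_420)
required_ratio = rate / CHANNEL_BITS_PER_SECOND

print(f"raw rate: {rate} bits/s")
print(f"required compression ratio: {required_ratio:.2f}:1")
```

Under these assumptions the required ratio comes out slightly above 4.5:1, consistent with the approximate ratio given in the text.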
During MPEG I and MPEG II encoding, each macroblock is processed. Each macroblock comprises a plurality of pixels, each of which is defined by color space components. A color space is a mathematical representation for a color. For example, RGB, YIQ, and YUV are different color spaces which provide different ways of representing a color which will ultimately be displayed in a video system. A macroblock in YUV format contains data for all Y, U, V components. Y is the luma component, or black and white portion, while U and V are color difference components.
Pixels in each macroblock are traditionally stored in blocks since they are compressed. Each block comprises 8 lines, each line having 8 pixels. Three types of macroblocks are available in MPEG II. A 4:2:0 macroblock consists of four Y blocks, one U block, and one V block. A 4:2:2 macroblock consists of four Y blocks, two U blocks, and two V blocks. A 4:4:4 macroblock consists of four Y blocks, four U blocks, and four V blocks.
During encoding, a Discrete Cosine Transform (DCT) is performed on each 8×8 block of pixels within each macroblock, resulting in an 8×8 block of horizontal and vertical frequency coefficients. Typically, the DCT process is two dimensional, where DCT is performed on each row and column of pixels. However, the two dimensional process is difficult to perform without an intermediate buffer to store 8 lines of video data. It would be desirable to perform the DCT process without this intermediate buffer, resulting in an increase in efficiency of the DCT process and a decrease in hardware costs.
Resolution of video is often different from the resolution of the computer display on which the video will be displayed. In order to display the video on various computer displays, the video resolution often should be scaled to fit within a desired window, such as by vertical and horizontal scaling. Scaling down can be performed by averaging, while scaling up can be accomplished by interpolation.
Various color formats have been developed for use with image and video encoding and decoding. To facilitate the transfer of data, most MPEG II video encoders accept various video formats, such as the 4:2:2 YUV video format, and use the 4:2:0 format for data storage. Therefore, color format conversion from the 4:2:2 format to the 4:2:0 format is known to be performed. In known systems, color format conversion and scaling are performed in two separate processes. It would be extremely advantageous if vertical scaling and color format conversion could be combined into one process. Through combining these two processes, efficiency of the video capture chip could be improved with a reduced hardware cost.
Accordingly, it would be desirable to provide a method and system for capturing still images or live video with improved efficiency and reduced hardware costs. These advantages are achieved in an embodiment of the invention in which color format conversion and vertical scaling are performed in one process, in which a one-dimensional DCT process is performed without an intermediate buffer, and in which Huffman coding is tailored to the particular DCT.
The present invention provides a video capture chip with a USB interface. When combined with a video sensor and signal processing circuit, the video capture chip is capable of capturing live video and still images, and sending the data through a USB to a computer. With the addition of application software, the present invention may be used in a video camera, surveillance watcher, scanner, copier, fax machine, digital still picture camera, or other similar device.
According to a first aspect of the present invention, a method for combining vertical scaling and color format conversion is disclosed. Vertical scaling and 4:2:2 to 4:2:0 color format conversion are simultaneously performed on incoming Y, U, and V data. According to a presently preferred embodiment of the present invention, the bytes of the Y, U, and V data are separated. A scaling factor is determined, the scaling factor indicating a number of bytes to average. When the scaling factor is equal to 1, a 2:1 scale down is performed for each U and V byte. When the scaling factor is equal to f, where f is greater than 1, a 2f:1 scale down is performed for each U and V byte, and an f:1 scale down is performed for each Y byte. By combining vertical scaling and color format conversion into one process, the line buffer size and logical gate count may be reduced by half.
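The relationship between the scaling factor and the two scale-down ratios can be illustrated with a minimal software sketch. It is not the disclosed hardware: it assumes the Y, U, and V planes are already separated into lists of lines, that averaging uses truncating integer division, and that incomplete trailing groups of lines are dropped. The names `average_rows` and `scale_and_convert` are hypothetical.

```python
def average_rows(plane, n):
    """Average each group of n consecutive lines, sample by sample.

    Assumption: truncating integer division; leftover lines that do not
    fill a complete group are dropped.
    """
    out = []
    for start in range(0, len(plane) - n + 1, n):
        group = plane[start:start + n]
        out.append([sum(column) // n for column in zip(*group)])
    return out

def scale_and_convert(y, u, v, f):
    """Vertically scale by f and convert 4:2:2 to 4:2:0 in one pass.

    Y lines are reduced f:1. U and V lines are reduced 2f:1, because the
    4:2:0 format already halves the vertical chroma resolution.
    """
    if f < 1:
        raise ValueError("scaling factor must be at least 1")
    return average_rows(y, f), average_rows(u, 2 * f), average_rows(v, 2 * f)

# Four lines of 4:2:2 data: U and V have half the horizontal samples of Y.
y = [[10] * 4, [20] * 4, [30] * 4, [40] * 4]
u = [[100] * 2, [110] * 2, [120] * 2, [130] * 2]
v = [[50] * 2, [60] * 2, [70] * 2, [80] * 2]

y1, u1, v1 = scale_and_convert(y, u, v, 1)  # Y unchanged, chroma 2:1
y2, u2, v2 = scale_and_convert(y, u, v, 2)  # Y 2:1, chroma 4:1
```

With a scaling factor of 1 the Y lines pass through unchanged and only the chroma lines are averaged in pairs; with a factor of 2 the same averaging routine handles both the scaling and the format conversion, which is the point of merging the two steps.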
According to a second aspect of the present invention, a method for performing a one dimensional DCT on a line of pixels to create a DCT coefficient y(u) is disclosed. According to a presently preferred embodiment of the present invention, a sequence of pixels is accepted. A cosine operation is then performed on adjacent sets of the sequence of pixels to generate a sequence of one dimensional DCT coefficients. This is accomplished without storing the sequence in a buffer through use of a register. Through elimination of the buffer required in the traditional two dimensional DCT, efficiency is improved, and manufacturing costs are substantially reduced.
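One way to picture a one dimensional DCT that needs no line buffer is a running accumulation: each incoming pixel is multiplied by its cosine weights and added to a small set of accumulators, and the coefficients are emitted after every eighth pixel. The sketch below assumes the standard orthonormal 8-point DCT-II, which the text above does not spell out, and the name `dct_1d_stream` is hypothetical. It models the idea in software and is not the disclosed circuit.

```python
import math

N = 8  # pixels per one dimensional transform (one line of a block)

def dct_1d_stream(pixels):
    """Yield N DCT coefficients for each consecutive run of N pixels.

    Only N accumulators are kept, standing in for registers; the pixel
    sequence itself is never stored.
    """
    acc = [0.0] * N
    for i, x in enumerate(pixels):
        k = i % N  # position of this pixel within the current run
        for u in range(N):
            acc[u] += x * math.cos((2 * k + 1) * u * math.pi / (2 * N))
        if k == N - 1:
            # Assumption: orthonormal DCT-II scaling.
            yield [
                (math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)) * acc[u]
                for u in range(N)
            ]
            acc = [0.0] * N

# A flat run of pixels has energy only in the first (DC) coefficient.
flat = [100] * N
coeffs = next(dct_1d_stream(flat))
```

Because each pixel contributes to the accumulators as it arrives, storage is limited to the eight running sums, in contrast to the eight lines of video data a two dimensional transform must hold.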
According to a third aspect of the present invention, a method for compressing DCT coefficients, or other data, is disclosed to offset the lower compression ratio resulting from the one dimensional DCT. According to a presently preferred embodiment of the present invention, a plurality of DCT coefficients are accepted. A pattern code is then generated for the plurality of DCT coefficients. The pattern code comprises a plurality of bits, each one of the plurality of bits corresponding to one of the plurality of DCT coefficients. Each one of the plurality of bits is 0 when the DCT coefficient is 0, and is otherwise 1. Nonzero DCT coefficients are identified using the pattern code. Each zero DCT coefficient is encoded with zero bits. A coefficient table is prepared, the coefficient table having a plurality of code pairs, each of the plurality of pairs having a length code and a Huffman code. In addition, a pattern table is prepared, the pattern table having a plurality of code pairs, each of the plurality of pairs having a length code and a Huffman code. A table lookup is performed for each non-zero DCT coefficient within the coefficient table. Similarly, a table lookup is performed for each pattern code within the pattern table. Optimum compression is achieved since a majority of the non-zero coefficients have common values which can be compressed through Huffman encoding.
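The pattern code step can be sketched as follows. The sketch assumes eight coefficients per group and places the bit for the first coefficient in the most significant position; neither choice is stated in the text. The coefficient and pattern Huffman tables are not reproduced in this section, so the table lookups are omitted, and the names `make_pattern` and `restore` are hypothetical.

```python
def make_pattern(coeffs):
    """Return (pattern_code, nonzero_coefficients) for a group.

    Each bit of the pattern code is 0 where the coefficient is 0 and 1
    otherwise. Assumption: the first coefficient maps to the most
    significant bit.
    """
    pattern = 0
    nonzero = []
    for c in coeffs:
        pattern = (pattern << 1) | (1 if c != 0 else 0)
        if c != 0:
            nonzero.append(c)
    return pattern, nonzero

def restore(pattern, nonzero, count=8):
    """Rebuild the coefficient group from the pattern and nonzero values."""
    values = iter(nonzero)
    out = []
    for bit in range(count - 1, -1, -1):
        out.append(next(values) if (pattern >> bit) & 1 else 0)
    return out

coeffs = [35, 0, -2, 0, 0, 1, 0, 0]
pattern, nonzero = make_pattern(coeffs)
print(f"{pattern:08b}", nonzero)  # prints "10100100 [35, -2, 1]"
```

Only the pattern code and the nonzero values would then go through their respective table lookups; the zero coefficients cost no bits beyond their position in the pattern.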
Therefore, the present invention provides a method and system for vertically scaling live video signal data and performing a 4:2:2 to 4:2:0 color format conversion simultaneously with the vertical scaling step. Moreover, a one-dimensional bufferless discrete cosine transform is performed on the scaled live video signal data to create a plurality of scaled DCT coefficients. Each of the plurality of scaled DCT coefficients is then Huffman encoded.