1. Field of the Invention
The present invention relates to a method and apparatus for processing compressed data and, for example, a method and apparatus for directly modifying transform coefficients associated with blocks of the compressed data in a transform-domain in consideration of temporal dependencies of the compressed data.
2. Description of the Related Art
As with many of today's technologies, the current trend in image sequence developing and editing is to use digital formats. Even with motion picture film, editing of image sequences (including image splicing, color processing, and special effects) can be much more precisely accomplished by first converting images to a digital format, and performing desired edits upon the digital format. If desired, images can then be converted back to the original format.
Unfortunately, digital formats usually use enormous amounts of memory and transmission bandwidth. A single image with a resolution of 200×300 pixels can occupy megabytes of memory. When it is considered that many applications (for example, motion picture film processing) require far greater resolution, and that image sequences can include hundreds or thousands of images, it becomes very apparent that many applications are called upon to handle gigabytes of information, creating a bandwidth problem, in terms of computational and transmission resources.
To solve the bandwidth problem, standards have been proposed for image compression. These standards generally rely upon spatial or temporal redundancies which exist in one or more images.
A single image, for example, may have spatial redundancies in the form of regions having the same color (intensity and hue); a single, all blue image could potentially be represented simply by its intensity and hue, and information indicating that the entire frame has the same characteristics.
Temporal redundancies typically exist in sequences of images, and compression usually exploits these redundancies as well. For example, adjacent images in a sequence can be very much alike; exploiting redundancies, a compressed image sequence may include data on how to reconstruct current image frames based upon previously decoded frames. This data can be expressed as a series of motion vectors and difference information. To obtain this information for two adjacent frames, pixels in the second (current) frame are grouped into image squares of 8×8 or 16×16 pixels (“blocks” of pixels), and a search is made in a similar location in the first (prior) frame for the closest match. The motion vectors and difference information direct a decoder to reconstruct each image block of the second frame by going back to the first frame, taking a close match of the data (identified by the motion vector) and making some adjustments (identified by the difference information), to completely reconstruct the second frame.
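The block search described above can be illustrated with a minimal sketch. It assumes an exhaustive search over a small window and a sum-of-absolute-differences cost; the function names, the window size, and the cost measure are illustrative assumptions and are not taken from any particular standard or encoder.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block of cur and a block of ref."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def best_motion_vector(cur, ref, cx, cy, size=8, search=4):
    """Exhaustively search ref around (cx, cy) for the closest match.

    Returns the motion vector (dx, dy) and the difference block that a
    decoder would add to the matched reference block.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(cur, ref, cx, cy, rx, ry, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    _, dx, dy = best
    difference = [[cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i]
                   for i in range(size)] for j in range(size)]
    return (dx, dy), difference


# Hypothetical test frames: the current frame is the prior frame with its
# content shifted two pixels to the right.
prior = [[7 * x + 13 * y for x in range(16)] for y in range(16)]
current = [[prior[y][max(x - 2, 0)] for x in range(16)] for y in range(16)]
vector, difference = best_motion_vector(current, prior, cx=4, cy=4)
```

In this example the content moved right by two pixels, so the best match for the current block lies two pixels to the left in the prior frame and the difference block is all zeros.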
One group of standards currently popular for compression of image sequences has been defined by the Moving Pictures Experts' Group, and these standards are generally referred to as “MPEG.” The MPEG standards generally call for compression of individual images into three different types of compressed image frames: compressed independent (“I”) frames exploit only spatial redundancies, and contain all the information necessary to reconstruct a single frame; compressed prediction (“P”) frames exploit temporal redundancies from a prior frame (either a P or I frame) and typically only require about ⅓ as much data as an I frame for complete frame reconstruction; and compressed bi-directional interpolated (“B”) frames can use data from either or both of prior and future frames (P or I frames) to provide frame reconstruction, and may only require ¼ as much data as a P frame. Other compression standards also rely upon exploitation of temporal image redundancies, for example, H.261 and H.263.
Compressed data such as video signals are often difficult to manipulate without having to decompress, perform an operation and recompress the data. For example, fade operations have typically been carried out on decompressed video. The fade operation is often used in video broadcasting. One typical example is that television (TV) stations splice in commercial clips during regular TV program broadcasting. An abrupt beginning of a commercial could annoy viewers; a gradual fade to black of the immediately preceding video is generally preferred. The operation of gradually fading to black is called a “fade-out” operation. Conversely, the operation of gradually fading from black to full or partial picture information is called a “fade-in” operation.
In digital TV broadcasting, regular TV programs (live or pre-recorded) are typically stored and transmitted in a compressed form. MPEG-2 is a compressed form used in many digital TV systems, such as HDTV or ATSC. A conventional way of performing a fade on an MPEG sequence is to decompress the sequence, apply the fading operation, and recompress the result. Within this loop, the costly DCT and motion estimation operations make the approach effectively impractical for real-time applications. Therefore, a need exists for a fade technique applicable in the compressed domain to avoid these two bottlenecks.
Although it is known to implement operations directly on compressed JPEG data, see Brian Smith and Larry Rowe, “Algorithms For Manipulating Compressed Images,” IEEE Computer Graphics and Applications, pp. 34-42, September 1993, there are problems, particularly with respect to MPEG. For example, since MPEG utilizes interframe coding, frames of a picture may be coded depending on one or two other frames. Also, within these pictures, different coding methods may apply on different types of macroblocks. Thus, a universal scheme for different types of macroblocks in different types of pictures is needed. Additionally, operation on DC coefficients can only change the brightness of the whole DCT block uniformly which may lead to a problem in a fade-out operation when the fade-out approaches black. It would be helpful to have an approximated method based on the consideration that the pictures are almost black when the macroblocks or other data in the fade-out operation approach black. Furthermore, considering the whole process within the MPEG context, the variable quantization used in MPEG may introduce error-accumulation problems. It would be helpful to improve the visual quality of the fade results, such as by a correction process.
The present inventions are directed to methods and apparatus for operating on or modifying data, and especially compressed data, without having to decompress the data. They can do so even if there are temporal or spatial dependencies within the data, and even when the data is arranged in a format different than the format in which the data will ultimately be used, such as MPEG video data. In MPEG video, the data is stored in a different order than that in which it will be displayed. As a result, the data can be processed in the order in which it exists, such as in its storage form, rather than in its useful form, e.g. the display order. One particularly advantageous form of the invention is used to produce fade-in and fade-out operations on MPEG video, which is a relatively complicated compression data form. The data is stored as frames of data in one order and displayed as frames of data in another form and in another order. Moreover, the frames of data in the stored format are not complete in and of themselves and depend for their completeness for display purposes on data contained in other frames of stored data. Furthermore, the dependencies on data in other frames apply not only to data in previous frames but also to data in frames displayed subsequently. Several aspects of the present inventions account for these dependencies. However, the present inventions can also operate on simpler forms of compressed data.
In an exemplary preferred embodiment, methods and apparatus are provided for a fade operation on MPEG video or other compressed data wherein either one or both of the following occur: (1) the fade or other operation can be done regardless of temporal dependency in the MPEG sequence, and therefore, the whole video can be processed if desired sequentially as it is read in; and (2) the DC manipulation concerns as little as one coefficient in a DCT block and therefore the process is fast and easy. These two advantages allow MPEG streams to be processed in a streamlined fashion while avoiding bottleneck operations in DCT and motion estimation.
In accordance with one aspect of the present inventions, an apparatus and method are provided for modifying characteristics of a sequence of data, preferably representing compressed data. Preferably, a sequence of data is received representing compressed data and which includes a selected characteristic to be modified. The compressed data may include information representing motion estimation for purposes of compressing the data. A value is assigned for making at least one modification to the selected characteristic, and the value assigned can vary as a function of the temporal dependencies within the compressed data. In a preferred embodiment, the data can be processed in the sequence in which it is received, and the data can be processed without decompressing the data and without changing the sequence of the data.
In one preferred form of the invention, the apparatus and method operate on data that are grouped in distinct groups or packets, such as frames or video picture frames, and the data in one frame may depend upon the data in one or more other frames. Additionally, the dependence upon data in other frames may include dependence on later frames as well as earlier frames of data. In the context of compressed video data relying upon temporal dependencies and motion estimation, such as MPEG video, the data can still be processed in its storage order while still taking into account the temporal dependencies among and between frames. In one aspect of the present inventions, operations on the data are made as a function of the type of video block involved, whether independent, predicted or bidirectional, and whether the macro block is forward predicted, backward predicted or both. One form of the inventions is particularly suited to fade operations on compressed video data, for example by modifying a DC component of the compressed video.
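How an adjustment might depend on the macroblock type can be sketched as follows. This is an illustrative derivation only, not the claimed method. It assumes a linear fade-out that darkens every pixel by `delta` per frame in display order starting from display index 0, reference frames that have already been faded, no clipping at black, and a DCT normalization in which the DC coefficient equals `n` times the block mean. The function name and all parameters are hypothetical.

```python
def dc_fade_offset(display_index, delta, mb_type,
                   fwd_ref=None, bwd_ref=None, n=8):
    """Offset to add to a block's DC coefficient for a linear fade-out.

    display_index: display-order position of the frame holding the block.
    fwd_ref, bwd_ref: display-order positions of the reference frames.
    A predicted block inherits the darkening already applied to its
    reference(s), so only the remaining darkening goes into its residual.
    """
    if mb_type == "intra":
        inherited = 0.0
    elif mb_type == "forward":
        inherited = fwd_ref * delta
    elif mb_type == "backward":
        inherited = bwd_ref * delta
    elif mb_type == "bidirectional":
        # The prediction is the average of both references.
        inherited = 0.5 * (fwd_ref + bwd_ref) * delta
    else:
        raise ValueError("unknown macroblock type: %r" % (mb_type,))
    wanted = display_index * delta  # total darkening wanted for this frame
    return -n * (wanted - inherited)


intra_offset = dc_fade_offset(3, 2.0, "intra")
forward_offset = dc_fade_offset(3, 2.0, "forward", fwd_ref=1)
bidir_offset = dc_fade_offset(1, 2.0, "bidirectional", fwd_ref=0, bwd_ref=3)
```

Because each offset depends only on display positions and not on decoded pixel values, the frames can be visited in storage order. Note that the bidirectional case can yield a positive offset when the averaged references are already darker than the target frame.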
In accordance with a further specific illustrative embodiment of one aspect of the present inventions, a method of manipulating characteristics of a reproduced sequence of data while the data is in a compressed format includes the steps of: receiving a compressed data sequence; determining sizes of quantized magnitude adjustment steps for blocks of the compressed data sequence depending upon temporal dependencies of the compressed data sequence; and applying the quantized magnitude adjustment steps to a component value of the compressed data stream representing a characteristic of the data for more than one block, without having to decompress the data to apply the modification individually to all of the individual elements of the data. In the example of compressed data in the discrete cosine transform (“DCT”) domain, this component value representing a characteristic of the data is the DC coefficient, and the quantized magnitude adjustment step (the fade_step) is applied to the DC coefficient in the compressed data sequence.
Applying the fade_step to the DC coefficient of the compressed block permits modification or changing of the characteristics of all of the pixels in the block without decompressing all of the data to do so.
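The effect of a DC-only change on every pixel of a block can be checked with a short sketch. It assumes an orthonormal 8×8 DCT-II computed naively, under which the DC coefficient equals 8 times the block mean; the helper names are hypothetical and quantization is ignored.

```python
import math

N = 8  # block edge length used by JPEG/MPEG-style DCT coding


def _scale(k):
    # Orthonormal DCT-II scaling factor (an assumed normalization).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(x, u):
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """Naive 2-D DCT-II of an N x N block of pixel values."""
    return [[_scale(u) * _scale(v)
             * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coef):
    """Naive inverse of dct2."""
    return [[sum(_scale(u) * _scale(v) * coef[u][v]
                 * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def apply_fade_step(coef, fade_step):
    """Return a copy of coef with fade_step added to the DC term only."""
    faded = [row[:] for row in coef]
    faded[0][0] += fade_step
    return faded


block = [[100 + 3 * x + 2 * y for y in range(N)] for x in range(N)]
coef = dct2(block)
# Under this normalization, a DC change of -N * 10 lowers every pixel by 10.
darker = idct2(apply_fade_step(coef, -N * 10))
```

Only one of the 64 coefficients is touched, yet all 64 reconstructed pixels move by the same amount, which is why a DC-only fade is cheap but also why it can only change brightness uniformly across the block.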
In a fade application, one or more anomalies may arise, for which adjustments may be desirable. For example, because approximations may carry over from one frame or block to another, anomalies may propagate. Therefore, in a further aspect of the present inventions, the process may further include the steps of determining quantization variations associated with the quantized magnitude adjustment steps and changing subsequent magnitude adjustment step sizes in consideration of the quantization variations.
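One way to picture such bookkeeping is an error-feedback loop, sketched below. This is an assumed illustration rather than the correction process of the invention: each desired adjustment is rounded to a whole number of quantizer steps, and the rounding error is carried into the next adjustment. The function name, the inputs, and the per-frame quantizer scales are hypothetical.

```python
def quantized_steps_with_feedback(desired_steps, quant_scales):
    """Quantize each desired step and carry the rounding error forward.

    desired_steps: ideal (unquantized) DC adjustments, one per frame.
    quant_scales: quantizer step size in effect for each adjustment.
    Returns the integer quantized levels and the error still outstanding.
    """
    carried = 0.0
    levels = []
    for step, scale in zip(desired_steps, quant_scales):
        target = step + carried        # include error left over so far
        level = round(target / scale)  # integer level actually coded
        carried = target - level * scale
        levels.append(level)
    return levels, carried


# Hypothetical inputs: a constant desired step with a varying quantizer.
desired = [-3.0] * 6
scales = [8, 8, 4, 4, 2, 2]
levels, carried = quantized_steps_with_feedback(desired, scales)
applied_total = sum(l * s for l, s in zip(levels, scales))
```

With feedback, the accumulated applied adjustment stays within half a quantizer step of the accumulated desired adjustment instead of drifting as the quantizer scale varies.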
In another aspect of the present inventions, an apparatus includes means for determining sizes of quantized magnitude adjustment steps for blocks of a compressed data sequence in consideration of temporal dependencies of the compressed data sequence. Means may also be included for applying the quantized magnitude adjustment steps to the compressed data sequence. Quantization variations associated with the quantized magnitude adjustment steps are determined and the subsequent magnitude adjustment step sizes are changed in consideration of the quantization variations. This may be a beneficial way of bookkeeping and correcting any error that may arise because of quantization variations.
In another aspect of the present inventions, an apparatus operative to receive, process and output compressed data includes: machine readable media; and instructions stored on the machine readable media that instruct a machine to receive blocks of compressed data compliant with a data compression standard, determine sizes of quantized magnitude adjustment steps for the blocks depending upon temporal dependencies of the compressed data, apply the quantized magnitude adjustment steps to the compressed data, determine quantization variations associated with the quantized magnitude adjustment steps, adjust subsequent magnitude adjustment steps in consideration of the quantization variations, and provide an output of compressed data compliant with the data compression standard.
In a further aspect of the present inventions, the instructions instruct the machine to repeatedly apply the quantized magnitude adjustment steps to change a characteristic (e.g., intensity) of a reproduced video image derived from the output of compressed data in a uniform manner. For example, the quantized magnitude adjustment steps are applied to fade-out or fade-in the video image.