The present invention relates to image compression. More particularly, this disclosure provides a compression system that re-uses, as appropriate, compressed data from a compressed input signal when forming a compressed output signal.
Conventional editing or other processing of film or video images is performed in the "spatial" domain, that is, upon actual images rather than upon a compressed representation of those images. Since the final product of such editing or processing is frequently an uncompressed signal (such as a typical "NTSC" television signal), such editing or processing can sometimes be accomplished in real-time with today's digital editors and computers. With the increasing tendency toward high resolution pictures such as high definition television ("HDTV"), however, Internet, cable, television network and other service providers will likely all have to begin directly providing compressed signals as the final product of editing.
A conventional television distribution system 11 is illustrated using FIG. 1, which shows use of a satellite 13, a digital receiving and processing system 15, and a number of individual television subscribers 17. The digital processing system decodes a satellite signal 21 (or alternatively, a compressed, stored signal) and provides a decoded signal 19 to a service provider 23, for distribution via Internet, cable or another broadcasting network 25. Conventionally, the service provider 23 will perform some edits on the decoded signal, such as to mix different signals or feeds together, provide reverse play of an input signal, insert a logo or provide other edits (such as color correction or blue matting). Examples of conventional editing include mixing different camera angles of a live sports event, as well as inserting television commercials into a signal. These and other types of editing are collectively represented by a reference box 27 in FIG. 1, and are also further illustrated in FIG. 2.
In particular, FIG. 2 shows a set of input images 31 which is to be edited to form a set of output images 33. Two hypothetical edits are illustrated, including a first edit which combines five frames 35 of a first image sequence with five frames 37 of a second image sequence to produce the output images 33. A second edit is also represented by a (hypothetical) logo 39 of a local television station "TF5" which is to be combined with the input images 31 such that the logo appears in the lower right hand corner of the output images 33. To perform these edits, compressed input data 41 must first be processed by a de-compression engine 43. Following editing, the output images 33 must then also be re-compressed by a compression engine 47 to produce compressed output data 49. Both the compressed input and output data are seen to be in partially compressed MPEG format, with image frames encoded as "I," "P," or "B" frames as will be explained below. One of the most time intensive steps in this process is the compression of the output images 33, which will be explained with reference to FIG. 3.
In this regard, compression techniques generally rely on block-based (e.g., tile-based) or object-based encoding, which is introduced with reference to FIG. 3. Well known compression techniques include "JPEG," "MPEG," "MPEG-2," "MPEG-4," "H.261," and "H.263." FIG. 3 shows two image frames 51 and 53. The second image frame 53 of FIG. 3 is divided into a number of square tiles 55, and it is desired to compress the second frame so that relatively less data is used for image representation. In typical image compression, each tile 55 will be separately compressed to remove either spatial redundancies within the same frame (the second frame 53) or temporal redundancies between frames (e.g., by comparison to the first frame 51). In this example, it is to be assumed that the second frame will be compressed only to remove temporal redundancies between different frames, but similar principles can be applied to reduce spatial redundancies within the same frame.
In performing compression, a digital editor or computer compares pixels in each tile in the second frame with image pixels found at or near an expected tile location 61 (e.g., the same position) within the first image frame 51. This comparison is indicated by a reference tile 57 in the second image frame and an arrow 59 which points to the same tile location in the first image frame. The digital processing device sequentially compares pixels from the reference tile 57 with different pixel subsets of a fixed "search window" 63 to determine a "closest match." The "closest match" in FIG. 3 is indicated by a hatched square 65, which is illustrated as slightly offset from the position of the tile 61. With the "closest match" having been found, the digital processing device calculates a motion vector 67 and a set of pixel difference values called "residuals."
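The closest-match search described above can be illustrated with a minimal sketch. It assumes frames are plain lists of pixel rows, uses the sum of absolute differences as the matching cost (the text does not name a particular cost), and searches exhaustively; the names `closest_match`, `size` and `radius` are hypothetical and chosen only for illustration.

```python
def closest_match(prev_frame, cur_frame, top, left, size=2, radius=2):
    """Exhaustively search a window in prev_frame for the best match to one tile.

    Returns ((dy, dx), residuals), where (dy, dx) is the motion vector from the
    tile's own position to the best match, and residuals holds the per-pixel
    differences (tile minus matched patch).
    """
    tile = [row[left:left + size] for row in cur_frame[top:top + size]]
    height, width = len(prev_frame), len(prev_frame[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidate positions that fall outside the earlier frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            patch = [row[x:x + size] for row in prev_frame[y:y + size]]
            # Assumed cost: sum of absolute pixel differences.
            cost = sum(abs(a - b)
                       for tile_row, patch_row in zip(tile, patch)
                       for a, b in zip(tile_row, patch_row))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), patch)
    _, vector, patch = best
    residuals = [[a - b for a, b in zip(tile_row, patch_row)]
                 for tile_row, patch_row in zip(tile, patch)]
    return vector, residuals
```

Because every candidate position inside the window is compared pixel by pixel, the cost of this search grows with both the window area and the tile area, which is consistent with the search being the dominant expense of compression.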
Once all tiles have been placed into motion vector and residual format, the motion vectors and residuals are then encoded in a compact manner, usually through "run-length coding," "quantization" and "Huffman coding." During later de-compression, for example, at a network, television station, editing facility, or at an end-viewer's computer or television, the second frame is completely re-calculated from an already-de-compressed first frame by re-constructing each tile using motion vectors and residuals. The various standards mentioned above generally operate in this manner, although some new standards call for subdividing images into variable size objects instead of tiles (the principles are, however, similar).
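The de-compression step described above, in which a tile is re-constructed from an already-de-compressed frame, can be sketched as follows. This is an illustration only: it assumes frames are lists of pixel rows and that entropy decoding and de-quantization have already been applied, and the function name `reconstruct_tile` is hypothetical.

```python
def reconstruct_tile(prev_frame, top, left, motion_vector, residuals):
    """Rebuild one tile from the earlier frame, a motion vector and residuals.

    The motion vector (dy, dx) points from the tile's own position (top, left)
    to the matched patch in prev_frame; adding the residuals to that patch
    restores the tile's pixel values.
    """
    dy, dx = motion_vector
    return [
        [prev_frame[top + dy + r][left + dx + c] + residuals[r][c]
         for c in range(len(residuals[0]))]
        for r in range(len(residuals))
    ]
```

Only the motion vector and the residuals need to be stored for the tile; the pixel values of the matched patch come from the frame the decoder already holds.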
A compressed input signal is conventionally edited in the spatial domain only after an entire group of frames of the input signal has been completely de-compressed. Frames within the de-compressed group can then be edited for color, frame order such as for reverse play or splicing, or frame content (such as logo insertion). Once an edited signal is ready for output, the signal is then usually compressed anew, using a closest match search for each tile of images of the desired output signal. Typically, all images in a sequence, and all portions of all images affected by any edits, are de-compressed prior to editing. Thus, re-compression can be quite time intensive. For example, as much as seventy percent (70%) of resources used by a digital processing device to compress an image are applied to searching for the "closest matches" and associated motion vectors. Practically speaking, it is extremely difficult to compress these image sequences in real-time. Taking HDTV as an example, compressing many millions of bits per second is difficult even for today's multi-megahertz computers.
A need exists for a system that can more quickly compress edited signals, particularly those signals which have previously been compressed. Ideally, such a system would operate in a manner compatible with existing object-based and block-based standards, and would operate on spatial regions within an image, e.g., such that it can specially handle logo insertion and the like. Further still, such a system ideally should be implemented in software, so as to improve the speed at which existing machines process video and facilitate applications of real-time editing or compression. The present invention satisfies these needs and provides further, related advantages.
The present invention provides for quicker compression of an edited image by determining whether prior compression information (or data) for certain parts of the unedited image may be re-used. For example, by re-using motion vectors from the un-edited image data, the present invention provides for substantial savings in processor time and resources that would otherwise be occupied with motion search. Certain aspects of the present invention also provide for reduction of quantization errors in re-compressing an edited signal. With use of high definition television ("HDTV") and other high resolution formats expected to increase over the coming years, the present invention facilitates real-time image editing and compression of image signals (such as HDTV or other image signals). The present invention should find ready application to network, professional and home editing of image signals, whether performed by powerful digital editors or on a computer.
One form of the present invention provides a method of editing an input image that is at least partially compressed, using an editing device, an input buffer, a de-compression engine and a compression engine. At least one data block of an input image frame is stored in the buffer in compressed format, such as in bitstream or motion-vector-and-residual format. As would be conventional, the block of data is converted to spatial domain format for editing or processing. Such editing or processing could include, for example, splicing images together, providing reverse play or otherwise processing an anchor frame such that frame order is affected, editing the input image frame itself to change the relative position of the data within the frame, changing compression values such as bit rate, or editing pixel values represented by the frame. After editing or processing, the spatial domain data is then compressed in preparation for output, as would be conventional. In the compression process, however, the compression engine uses a table which provides information on whether individual regions within the frame have been edited (or whether motion vectors and residuals represented by the compressed data for the region have been otherwise affected, such as, for example, where an anchor frame referenced by the current frame has been edited). If editing or processing of this data has not affected the validity of the compressed input data in the input buffer, then at least some of the compressed data is used as part of the system's compressed output.
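The table-driven re-use just described can be sketched as follows, under stated assumptions. The table is modeled as a dictionary mapping a region index to a flag that is true when the region was edited, the compressed input as a dictionary of per-region data, and the full compression path as a caller-supplied function. All of these names (`recompress_frame`, `edit_table`, `anchor_edited`, `full_compress`) are hypothetical and are not taken from the disclosure.

```python
def recompress_frame(compressed_input, edit_table, anchor_edited, full_compress):
    """Build compressed output, re-using input data for unaffected regions.

    compressed_input: dict of region index -> compressed data for that region.
    edit_table:       dict of region index -> True if the region was edited.
    anchor_edited:    True if a frame referenced by this frame was edited,
                      which invalidates the stored vectors and residuals.
    full_compress:    callable(region) -> newly compressed data (slow path).
    Returns (output, reused), where reused counts the regions copied as-is.
    """
    output, reused = {}, 0
    for region, data in compressed_input.items():
        still_valid = not anchor_edited and not edit_table.get(region, False)
        if still_valid:
            output[region] = data  # re-use; no motion search needed
            reused += 1
        else:
            output[region] = full_compress(region)  # compress anew
    return output, reused


# Hypothetical usage: four regions, with a logo inserted into region 3 only.
frame_in = {0: "mv+res 0", 1: "mv+res 1", 2: "mv+res 2", 3: "mv+res 3"}
table = {3: True}
frame_out, reused = recompress_frame(
    frame_in, table, anchor_edited=False,
    full_compress=lambda region: "new data %d" % region)
```

In this sketch only the edited region takes the slow path; the other three regions are copied directly from the input buffer.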
In more detailed aspects of the invention, the compression engine can exactly re-use an original motion vector and associated residuals when a corresponding block of uncompressed data has not been changed in a manner that undermines the validity of the compressed form of the data. A table can be created for each frame in a group of frames, to thereby create a registry of edits, created in association with editing, where each table entry represents a region of a corresponding image frame. In this manner, the system can re-use original motion vectors and residuals for each image (or DCT) tile within a region, except for limited regions of an image frame that have been affected by image editing (e.g., logo insertion, blue matting, etc.). Also, if desired, only selected "slices" of the compressed input image data can be decoded from the compressed input for editing, and entries of the table can in this case represent either individual slices or subdivisions of a slice. In even more detailed aspects of the invention, the table entries can be more complex, and if certain types of editing have occurred in the region, then the original motion vector can be used as a motion estimate or raw calculation of motion, and new residuals within a fixed search window can be calculated.
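Using the original motion vector as a motion estimate can be sketched as a search confined to a small window centered on that vector, followed by calculation of new residuals. This is an illustration under assumptions: the window radius, the sum-of-absolute-differences cost and the name `refine_motion` are hypothetical and are not specified by the text.

```python
def refine_motion(prev_frame, cur_frame, top, left, size, estimate, radius=1):
    """Search only near an estimated motion vector and compute new residuals.

    estimate is the (dy, dx) vector carried over from the compressed input.
    Only offsets within `radius` of that estimate are tried, rather than a
    full search window around the tile position.
    """
    tile = [row[left:left + size] for row in cur_frame[top:top + size]]
    height, width = len(prev_frame), len(prev_frame[0])
    best = None
    for dy in range(estimate[0] - radius, estimate[0] + radius + 1):
        for dx in range(estimate[1] - radius, estimate[1] + radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the earlier frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            patch = [row[x:x + size] for row in prev_frame[y:y + size]]
            cost = sum(abs(a - b)
                       for tile_row, patch_row in zip(tile, patch)
                       for a, b in zip(tile_row, patch_row))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), patch)
    if best is None:
        raise ValueError("no candidate position lies inside the frame")
    _, vector, patch = best
    residuals = [[a - b for a, b in zip(tile_row, patch_row)]
                 for tile_row, patch_row in zip(tile, patch)]
    return vector, residuals
```

With `radius=1` at most nine candidate positions are compared, so an edited region can still obtain valid residuals at a small fraction of the cost of a full search.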
Other forms of the invention provide an improvement in image compression and an apparatus for image compression, which roughly correspond to the principles identified above.
The invention may be better understood by referring to the following detailed description, which should be read in conjunction with the accompanying drawings. The detailed description of a particular preferred embodiment, set out below to enable one to build and use one particular implementation of the invention, is not intended to limit the enumerated claims, but to serve as a particular example thereof.