FIG. 1 illustrates a conventional video camera recording system 10. Light representing an image impinges on a sensor 12, such as a photo-tube, charge-coupled device or CMOS photo sensor. The sensor 12 converts the light signal to an electrical video signal representing the image. The video signal is received at an effects processor 14, which also receives control signals. The effects processor 14 processes the video signal to produce one or more special effects. Examples of special effects include a fade in or fade out, where the brightness of the output video signal is gradually increased from black or reduced to black. Another special effect is a dissolve, where a gradual fade from one scene to another scene is accomplished. Other special effects include wipes and graphics overlays.
Often, the effect to be achieved is a mathematical operation on the pixel values of two different frames. For example, a dissolve may be achieved using the formula $F_0(t) = \alpha(t) \cdot F_1(t) + (1 - \alpha(t)) \cdot F_2(t)$, where $F_0(t)$ is an output frame, $F_1(t)$ is a frame from which the dissolve begins, $F_2(t)$ is a frame on which the dissolve ends, $t$ is a frame interval, and $\alpha(t)$ is a monotonically decreasing function having a value that starts at 1 and gradually decreases to 0 at a finite frame interval $t > 0$. To perform the special effect, the effects processor 14 is able to store one or more frames in a memory 16.
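By way of illustration only, a dissolve of this kind can be sketched as a weighted blend in which the weight α(t) is applied to the starting frame and the complementary weight 1 − α(t) to the ending frame. The linear ramp used for α(t), the `duration` parameter, and the function names below are assumptions made for the example; the text only requires that α(t) decrease monotonically from 1 to 0.

```python
def alpha(t, duration):
    """Assumed linear weight: 1 at t = 0, falling to 0 at t = duration."""
    if t <= 0:
        return 1.0
    if t >= duration:
        return 0.0
    return 1.0 - t / duration


def dissolve(frame1, frame2, t, duration):
    """Blend two equally sized frames (lists of rows of pixel values) at frame interval t."""
    a = alpha(t, duration)
    return [
        [round(a * p1 + (1.0 - a) * p2) for p1, p2 in zip(row1, row2)]
        for row1, row2 in zip(frame1, frame2)
    ]


start = [[200, 200], [200, 200]]  # frame from which the dissolve begins
end = [[40, 40], [40, 40]]        # frame on which the dissolve ends
for t in range(5):
    print(t, dissolve(start, end, t, duration=4)[0])
```

Because each output frame depends on two input frames, an effects processor working this way needs both frames available at once, which is consistent with the frame storage in the memory 16.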
The effects-processed video signal, i.e., the uncompressed frames as processed (if at all) by the effects processor 14, is output to a compressor 18. If necessary, the compressor 18 first digitizes the frames and/or strips the synchronization signals from the frames, such as the horizontal and vertical blanking interval information. The compressor 18 compresses the frames of the effects-processed video signal to produce a compressed video signal. A variety of compression techniques can be used to compress the video signal, including MPEG-1, MPEG-2, motion JPEG and DVC.
As is well known, MPEG-2 video compression is a combination of spatial and temporal compression to remove spatial and temporal redundancy, respectively, in a sequence of moving pictures (fields or frames). According to MPEG-2, (the luminance data of) a to-be-compressed picture is divided into two-dimensional (e.g., 16×16) arrays of pixels called macroblocks. Temporal compression includes the steps of identifying a prediction macroblock for one or more of the macroblocks of the to-be-compressed picture. The prediction macroblock is a macroblock in another picture, called a reference picture, identified by a motion vector. A prediction error macroblock is formed by subtracting the prediction macroblock from the to-be-compressed macroblock. The prediction error macroblock thus formed is then spatially compressed.
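The temporal compression steps just described can be sketched as follows. The sketch assumes an exhaustive search over a small window and a sum-of-absolute-differences matching cost, neither of which is prescribed by the text, and it uses 4×4 blocks on an 8×8 picture purely to keep the example small. All function names are hypothetical.

```python
def get_block(picture, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in picture[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks (assumed cost)."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))


def find_prediction(current, reference, top, left, size, search_range):
    """Search the reference picture for the best match to one block of `current`.

    Returns (motion_vector, prediction_error), where motion_vector is (dy, dx)
    relative to the position of the to-be-compressed block.
    """
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate falls outside the reference picture
            candidate = get_block(reference, y, x, size)
            cost = sad(target, candidate)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), candidate)
    _, vector, prediction = best
    # Prediction error = to-be-compressed block minus prediction block.
    error = [[t - p for t, p in zip(tr, pr)] for tr, pr in zip(target, prediction)]
    return vector, error


# Reference picture with a distinct value at every pixel; the current picture
# is the same content shifted one pixel to the right.
reference = [[y * 8 + x for x in range(8)] for y in range(8)]
current = [[0] + row[:-1] for row in reference]
vector, error = find_prediction(current, reference, top=2, left=3, size=4, search_range=2)
print(vector, error)
```

When the match is exact, as in this toy case, the prediction error block is all zeros and only the motion vector carries information, which is the redundancy that temporal compression exploits.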
In spatial compression, each macroblock is divided into smaller blocks (e.g., 8×8 arrays of pixels). The blocks are discrete cosine transformed, quantized, scanned into a sequence of coefficients and entropy encoded. Spatial compression is applied to both prediction error macroblocks and non-temporally compressed macroblocks, i.e., original macroblocks of the to-be-compressed pictures. Macroblocks of the to-be-compressed picture which are only spatially compressed are referred to as intracoded macroblocks and macroblocks of the to-be-compressed picture which are both spatially and temporally compressed are referred to as intercoded macroblocks. The spatially compressed picture data and motion vectors, as well as other control information, are then formatted into a bitstream according to the MPEG-2 syntax. Pictures designated as reference pictures are furthermore decompressed and stored locally so that they can be used for forming predictions.
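The first three spatial compression steps (transform, quantize, scan) can be sketched for a single 8×8 block as shown below. This is a minimal sketch under stated assumptions: the transform is written in its direct, unoptimized form, quantization is reduced to division by a single step size (the standard's quantizer matrices are not modeled), the scan is the conventional zigzag pattern, and entropy encoding is omitted. The function names are hypothetical.

```python
import math

N = 8  # block dimension


def dct_2d(block):
    """Two-dimensional DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Assumed uniform quantizer: a larger step gives coarser quantization."""
    return [[int(round(value / step)) for value in row] for row in coeffs]


def zigzag_order(n=N):
    """(row, col) positions visited along anti-diagonals, alternating direction."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(y, s - y) for y in range(n) if 0 <= s - y < n]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


flat_block = [[128] * N for _ in range(N)]       # uniform gray block
levels = quantize(dct_2d(flat_block), step=16)
scanned = [levels[y][x] for y, x in zigzag_order()]
print(scanned[:8])
```

For a uniform block, all of the energy lands in the first (DC) coefficient and the remainder of the scanned sequence is zero, which is the kind of sequence an entropy coder represents compactly.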
Compressed pictures are organized into groups of pictures and the groups of pictures are organized into sequences. A group of pictures begins on an I picture or intracoded picture. The macroblocks of I pictures are only spatially compressed. I pictures are used as reference pictures for forming predictions of other pictures. The group of pictures may also have P pictures or forward predicted compressed pictures. Prediction macroblocks for P pictures can be selected from only a preceding reference picture. The P picture may also contain intracoded macroblocks (if adequate prediction macroblocks cannot be found therefor). P pictures are also used as reference pictures for forming predictions of other pictures. Finally, the group of pictures can also have B pictures or bidirectionally predicted compressed pictures. Prediction macroblocks for B pictures can be selected from a preceding reference picture, a succeeding reference picture or from both preceding and succeeding reference pictures (in which case the multiple prediction macroblocks are interpolated to form a single prediction macroblock). A B picture can also have intracoded macroblocks (if an adequate prediction cannot be found therefor).
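The prediction relationships among I, P and B pictures can be illustrated with a short sketch. It assumes that the group of pictures is given as a string of picture types in display order and that each predicted picture uses the nearest reference picture in each permitted direction; the function name and the example pattern are hypothetical.

```python
def reference_pictures(display_order):
    """Map each picture index to the indices of the pictures it may be predicted from."""
    # Only I and P pictures serve as reference pictures.
    anchors = [i for i, kind in enumerate(display_order) if kind in "IP"]
    refs = {}
    for i, kind in enumerate(display_order):
        previous = [a for a in anchors if a < i]
        following = [a for a in anchors if a > i]
        if kind == "I":
            refs[i] = []                               # spatially compressed only
        elif kind == "P":
            refs[i] = previous[-1:]                    # preceding reference only
        else:
            refs[i] = previous[-1:] + following[:1]    # preceding and/or succeeding
    return refs


for index, refs in reference_pictures("IBBPBBP").items():
    print(index, "IBBPBBP"[index], refs)
```

In this pattern, each B picture depends on a reference picture that follows it in display order, so that reference picture must be compressed and made available before the B picture can be processed.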
Compression according to the MPEG-2 standard produces a variable number of bits per picture. The amount of compressed picture data must be carefully controlled relative to the rate of transfer of such information to a decoder and the size of a buffer of the decoder to prevent the decoder buffer from underflowing and, under certain circumstances, to prevent the decoder buffer from overflowing. To that end, a bit budget may be established for each picture which may be adjusted from time to time according to a presumed decoder buffer fullness. Bit budgets can be met by controlling quantization (most notably, the quantization scale factor which controls the coarseness of quantization), amongst other aspects of compression.
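The relationship between picture sizes, transfer rate and decoder buffer fullness can be illustrated with a simplified model. The sketch below is not the buffer model defined by the standard: it assumes that a fixed number of bits arrives in each picture interval and that the decoder then removes one whole picture, and the budget rule in `bit_budget` is a hypothetical heuristic chosen only to show a budget that tracks the presumed fullness.

```python
def simulate_decoder_buffer(picture_bits, bits_per_interval, buffer_size, initial_fullness):
    """Track presumed decoder buffer fullness, one picture interval at a time.

    Returns a list of (fullness_after_decoding, status) tuples.
    """
    fullness = initial_fullness
    history = []
    for bits in picture_bits:
        fullness += bits_per_interval      # bits arriving from the channel
        status = "ok"
        if fullness > buffer_size:
            status = "overflow"            # more data arrived than the buffer holds
            fullness = buffer_size
        if bits > fullness:
            status = "underflow"           # picture not fully received when needed
            fullness = 0
        else:
            fullness -= bits               # decoder removes the whole picture
        history.append((fullness, status))
    return history


def bit_budget(nominal_bits, fullness, buffer_size):
    """Hypothetical rule: allow larger pictures when the decoder buffer is fuller."""
    return int(nominal_bits * (0.5 + fullness / buffer_size))


print(simulate_decoder_buffer([300, 100, 400, 400], 200, 500, 300))
print(bit_budget(200, 100, 500))
```

In the first call, the two large pictures in succession drain the buffer faster than the channel refills it, and the last picture underflows. A budget that shrinks as the presumed fullness drops, enforced through coarser quantization, is one way to avoid that outcome.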
Compressed picture data formed according to the MPEG-2 standard may be placed into packetized elementary stream (PES) packets. PES packets may also be formed for other elementary streams such as a compressed audio stream, a closed captioning text stream, etc., associated with the same video program. In the case of authoring a video program for playback in a controlled (i.e., halt-able) playback environment, such as a CD, DVD, etc., the PES packets may be organized into packs and stored on a storage medium or record carrier. Other navigational information indicating how to read back the packs to construct the video program may also be recorded on the storage medium. In the case of creating a video program in a non-controlled environment, such as for broadcast over a noisy channel where the receiver might not be able to temporarily suspend or halt the flow of data, the data of the PES packets may be inserted into transport packets of fixed length. The transport packets containing elementary stream data for one or more video programs are then multiplexed together to form a transport stream. Transport packets containing various kinds of program specific information, such as a program association table and a program map table for identifying the transport packets carrying each elementary stream of a particular video program, may also be multiplexed into the transport stream.
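The insertion of PES packet data into fixed-length transport packets can be sketched as follows. The sketch relies on the MPEG-2 systems layout of a 188-byte transport packet with a 4-byte header beginning with the sync byte 0x47, and pads a short final packet with adaptation-field stuffing. It is a simplification: the continuity counter restarts at zero on every call, no timing fields are written, and the packet identifier (PID) value and function name are hypothetical.

```python
TS_PACKET_SIZE = 188
TS_HEADER_SIZE = 4
SYNC_BYTE = 0x47


def packetize(pes_packet, pid):
    """Split one PES packet (bytes) into 188-byte transport packets for one PID."""
    capacity = TS_PACKET_SIZE - TS_HEADER_SIZE
    packets = []
    for index, offset in enumerate(range(0, len(pes_packet), capacity)):
        chunk = pes_packet[offset:offset + capacity]
        stuffing = capacity - len(chunk)
        start_flag = 0x40 if offset == 0 else 0x00     # payload unit start indicator
        field_control = 0x30 if stuffing else 0x10     # adaptation field present or not
        header = bytes([
            SYNC_BYTE,
            start_flag | ((pid >> 8) & 0x1F),          # upper 5 bits of the 13-bit PID
            pid & 0xFF,                                # lower 8 bits of the PID
            field_control | (index & 0x0F),            # 4-bit continuity counter
        ])
        if stuffing == 0:
            adaptation = b""
        elif stuffing == 1:
            adaptation = bytes([0])                    # length byte only
        else:
            # Length byte, empty flags byte, then 0xFF stuffing bytes.
            adaptation = bytes([stuffing - 1, 0x00]) + b"\xff" * (stuffing - 2)
        packets.append(header + adaptation + chunk)
    return packets


pes = bytes(range(256)) * 2                            # stand-in for a real PES packet
packets = packetize(pes, pid=0x100)
print(len(packets), [len(p) for p in packets])
```

Because every packet has the same length and begins with a known sync byte, a receiver that loses data on a noisy channel can resynchronize on a packet boundary, which suits the non-controlled environment described above.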
As noted above, in the course of compressing the video signals, one or more reference frames may be stored in a second memory 20. In addition, the memory 20 is typically needed to format the compressed video signal into an appropriate form for output according to the standard syntax of the compression technique used to compress the video signal, i.e., into blocks, macroblocks, slices, pictures, groups of pictures, sequences, etc., including all appropriate header and control information. The compressed video signal thus formed is then read out of the memory 20 and stored on a suitable high storage capacity storage device 22 such as a magnetic tape, magnetic disk, optical disc, etc.
There are a number of problems associated with the video recording system 10. First, the recording system 10 requires two separate memory circuits 16 and 20. The memory circuits 16 and 20 are each typically formed using separate DRAM or SDRAM integrated circuits. As such, the cost of the overall system 10 is high.
Second, compression and effects processing are performed independently. This is disadvantageous because different special effects can impact compression performance, yet no indication of the type of special effect performed on the uncompressed frames is provided to the compressor, which therefore cannot compensate for, or take advantage of, such a priori special effects information. For example, a fade is achieved by a gradual change of pixel luminance intensity over a sequence of frames. Such changes in luminance intensity tend to lessen the ability to find an accurate predictor in a (decompressed) reference frame for a to-be-compressed frame. The same is true for dissolves.
Third, once the special effect has been performed on a video frame, it is impossible to "undo" or reverse the special effect and recover the original frame. Specifically, a fade in or fade out permanently changes the luminance data of the frame. A dissolve permanently changes both the luminance and chrominance data. In a similar fashion, a wipe replaces pixel data of another video signal with pixel data of the post-wipe video signal overlaid thereon.
It is an object of the present invention to overcome the disadvantages of the prior art.