1. Field of the Invention
This invention relates to video signal processing, for example, processing in which data (possibly compressed data) representing two or more video signals are mixed.
2. Description of the Prior Art
It is often desirable to mix, wipe or superimpose two or more video signals. For example, a so-called wipe effect might be used to transition between two different scenes in a television programme, or a so-called logo or other computer-generated signal such as a subtitle or a set of credits might need to be superimposed over a video image without otherwise disrupting the underlying image.
With analogue video signals, or even with uncompressed digital video signals, this operation is relatively straightforward. A key signal can be used to control the level of each of the constituent video signals (say, signals "A" and "B") at each pixel position, with the two level-controlled signals then being added together. A basic relationship between the level K of the key signal, the levels A and B of the input pixels and the level of the output pixel at each pixel position might be:
Output pixel value = A(1 − K) + BK
This process is carried out for each output pixel. So, if signal A is to be replaced in its entirety by signal B at a particular pixel position, the key signal would be 1 (otherwise expressed as 100%), and if there is to be a 50:50 mix of the two pixels the key value would be 0.5 or 50%.
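As an illustration only, the per-pixel keying relationship above can be sketched in a few lines of Python. The function name `mix_pixels` and the use of plain lists of floating-point levels are assumptions made for this sketch and do not come from the specification; the key value K is assumed to be normalised to the range 0 to 1.

```python
def mix_pixels(a, b, key):
    """Apply A(1 - K) + BK independently at each pixel position.

    a, b : pixel levels of signals A and B (hypothetical flat lists)
    key  : key signal K per pixel, assumed normalised to 0.0 .. 1.0
    """
    if not (len(a) == len(b) == len(key)):
        raise ValueError("a, b and key must have the same length")
    return [pa * (1.0 - k) + pb * k for pa, pb, k in zip(a, b, key)]


# Three pixel positions: all A, a 50:50 mix, and all B.
a = [100.0, 100.0, 100.0]
b = [200.0, 200.0, 200.0]
key = [0.0, 0.5, 1.0]
print(mix_pixels(a, b, key))  # [100.0, 150.0, 200.0]
```

With K = 1 the output is signal B alone, and with K = 0.5 the output is the equal mix described in the text.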
The situation is much more difficult when either or both of the inputs is a compressed video stream. In a compressed video stream such as an MPEG-2 video stream, pixels are generally compressed as blocks known as macroblocks, so that it is not possible to derive the value of a particular pixel directly from the compressed video signal.
Compressed video signals are also often subject to an overall limit on the quantity of data that can be used to transmit or store the signal. While there can be some variation from picture to picture, or even from group-of-pictures (GOP) to GOP, the time-averaged data rate is often constrained to the capacity of a transmission or storage channel. This allowable variation from picture to picture or GOP to GOP can mean that two signals to be combined can have the same nominal data rate but very different instantaneous data rates. So, when constructing a composite video signal from a group of video signals including one or more compressed signals, great care is needed to avoid a data overflow or underflow.
A third feature of compressed video signals relevant to this discussion is that they often make use of motion vectors to indicate blocks of temporally preceding or following pictures which are similar to a block of a current picture and so can cut down the amount of data needed to encode the current picture. Where two signals are being combined, however, it is possible that a motion vector for a current picture block can point to an area of a preceding or following image which has been replaced by or mixed with another input signal, so that the motion vector is no longer useful in the compression or decompression of that block.
One way of handling these problems is to decompress the entire compressed input signals, carry out the mixing or similar process in the non-compressed domain, and then re-compress the resulting composite pictures.
This process is limited by the general principle with compression systems such as the MPEG-2 system that each generation of compression tends to reduce the quality of the resulting images. It is undesirable if the simple addition of a logo or similar information causes a deterioration in the overall image quality of the pictures to which the logo information is added.
In order to alleviate this drawback, it is desirable to recompress as much as possible of the composite picture using compression parameters (such as a quantisation parameter Q, motion vectors, DCT frame type and so on) which are the same as those used to compress the corresponding block of the relevant input picture. However, this general aim is made more difficult by the following considerations:
(a) how many (or, looked at another way, how few) blocks need to be re-encoded, with newly derived parameters, because their image content is not related sufficiently closely to one of the input images?
(b) what happens if this re-encoding scheme would lead to too much output data being generated for the transmission or storage channel under consideration?
(c) how can it be detected whether the motion vectors, associated with blocks whose parameters are to be re-used, point to areas of the same image in the preceding or following pictures?
The invention aims to address or at least alleviate at least one of these problems.
This invention provides video signal processing apparatus in which at least two input video signals are combined in proportions determined by a pixel key signal to generate an output video signal for compression, at least one of the input video signals each having respective associated compression parameters from a data compression process applied to that video signal;
the apparatus comprising:
means for detecting an average pixel key signal value across each of a plurality of blocks of pixels of the output video signal; and
means for comparing the average pixel key signal values with first and second thresholds, the first threshold representing a pixel key signal value corresponding to combination having primarily one of the video signals, and the second threshold representing a pixel key signal value corresponding to combination having primarily the other of the video signals;
in which:
if the average pixel key signal value for a block lies between the first and second thresholds, that block of the output video signal is compressed using newly derived compression parameters; and
if the average pixel key signal value for a block lies outside the range defined by the first and second thresholds, the compression parameters associated with the corresponding block of the input video signals are made available for re-use in compression of that block of the output video signal.
The invention provides an elegantly simple way of determining whether or not the compression parameters associated with an input video signal block may be re-used in the compression of an output video signal block.
A basic decision is made as to whether a set of compression parameters is not available for re-use, that is, whether the corresponding block of the output image is not primarily derived from any one of the individual input video signals. For example, in the exemplary case of two input video signals, the test might be whether the block contains between 25% and 75% of each input video signal.
If the block is found to be primarily from one or other input video signal, the compression parameters from the majority constituent input video signal (if indeed available, as it is not a requirement that both or all input video signals have associated compression parameters) can either be re-used directly or can be submitted for further testing to decide whether or not they should be used.
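By way of a hedged sketch, the block-level test described above might be expressed as follows for the case of two input signals. Several details are assumptions made for illustration and are not stated in the specification: the names `average_key` and `classify_block`, the representation of the key signal as a list of rows of values between 0 and 1, the 25% and 75% figures taken from the example as default thresholds, and the treatment of an average exactly equal to a threshold as lying between the thresholds.

```python
from enum import Enum


class Decision(Enum):
    REUSE_A = "parameters of signal A available for re-use"
    REUSE_B = "parameters of signal B available for re-use"
    RE_ENCODE = "derive new compression parameters"


def average_key(key_plane, top, left, size):
    """Average key value over one size x size block of the key plane."""
    total = 0.0
    for row in key_plane[top:top + size]:
        total += sum(row[left:left + size])
    return total / (size * size)


def classify_block(avg_key, low=0.25, high=0.75):
    """Compare a block's average key value with two thresholds.

    A key of 0 selects signal A and a key of 1 selects signal B,
    following Output = A(1 - K) + BK. Values on a threshold are
    treated as 'between' (an assumption of this sketch).
    """
    if avg_key < low:
        return Decision.REUSE_A
    if avg_key > high:
        return Decision.REUSE_B
    return Decision.RE_ENCODE


# Hypothetical 4 x 8 key plane holding two 4 x 4 blocks:
# the left block is entirely signal A, the right block is half A, half B.
key_plane = [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0] for _ in range(4)]
left = classify_block(average_key(key_plane, 0, 0, 4))
right = classify_block(average_key(key_plane, 0, 4, 4))
print(left.name, right.name)  # REUSE_A RE_ENCODE
```

A result of `REUSE_A` or `REUSE_B` only indicates that the parameters are made available for re-use; as noted above, they may still be submitted to further testing, and the majority signal may have no associated parameters at all.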