1. Field of the Invention
This invention relates to video signal processing, for example, processing in which data (possibly compressed data) representing two or more video signals are mixed.
2. Description of the Prior Art
It is often desirable to mix, wipe or superimpose two or more video signals. For example, a so-called wipe effect might be used to transition between two different scenes in a television programme, or a so-called logo or other computer-generated signal such as a subtitle or a set of credits might need to be superimposed over a video image without otherwise disrupting the underlying image.
With analogue video signals, or even with uncompressed digital video signals, this operation is relatively straightforward. A key signal can be used to control the level of each of the constituent video signals (say, signals "A" and "B") at each pixel position, with the two level-controlled signals then being added together. A basic relationship between the level K of the key signal, the levels A and B of the input pixels and the level of the output pixel at each pixel position might be:
Output pixel value = A(1 − K) + BK
This process is carried out for each output pixel. So, if signal A is to be replaced in its entirety by signal B at a particular pixel position, the key signal would be 1 (otherwise expressed as 100%), and if there is to be a 50:50 mix of the two pixels the key value would be 0.5 or 50%.
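By way of illustration only, the keying relationship above can be sketched in a few lines of Python. The function name `mix_pixels` and the use of plain lists of pixel levels are assumptions made for this example and are not taken from the described apparatus.

```python
def mix_pixels(a, b, key):
    """Combine two pixel sequences as A(1 - K) + BK at each pixel position.

    a, b : pixel levels of input signals A and B
    key  : key value K per pixel, in the range 0.0 (all A) to 1.0 (all B)
    """
    if not (len(a) == len(b) == len(key)):
        raise ValueError("a, b and key must have the same length")
    return [pa * (1.0 - k) + pb * k for pa, pb, k in zip(a, b, key)]


# K = 0 keeps A, K = 0.5 gives a 50:50 mix, K = 1 replaces A with B.
mixed = mix_pixels([100, 100, 100], [200, 200, 200], [0.0, 0.5, 1.0])
```

In this sketch a key value of 1 selects signal B in its entirety and a key value of 0.5 gives the 50:50 mix described above.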
The situation is much more difficult when either or both of the inputs are compressed video streams. In a compressed video stream such as an MPEG-2 video stream, pixels are generally compressed as blocks known as macroblocks, so that it is not possible to derive the value of a particular pixel directly from the compressed video signal.
Compressed video signals are also often subject to an overall limit on the quantity of data that can be used to transmit or store the signal. While there can be some variation from picture to picture, or even from group-of-pictures (GOP) to GOP, the time-averaged data rate is often constrained to the capacity of a transmission or storage channel. This allowable variation from picture to picture or GOP to GOP can mean that two signals to be combined can have the same nominal data rate but very different instantaneous data rates. So, when constructing a composite video signal from a group of video signals including one or more compressed signals, great care is needed to avoid a data overflow or underflow.
A third feature of compressed video signals relevant to this discussion is that they often make use of motion vectors to indicate blocks of temporally preceding or following pictures which are similar to a block of a current picture and so can cut down the amount of data needed to encode the current picture. Where two signals are being combined, however, it is possible that a motion vector for a current picture block can point to an area of a preceding or following image which has been replaced by or mixed with another input signal, so that the motion vector is no longer useful in the compression or decompression of that block.
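The motion vector difficulty can be illustrated with a simplified Python sketch. It assumes, purely for the purposes of the example, that a per-pixel key map is available for the reference picture, that vectors are in whole pixels, and that a key value of 0.0 means "entirely signal A" and 1.0 means "entirely signal B". The function name and its parameters are hypothetical and do not represent the detection method of the described apparatus.

```python
def reference_area_matches_source(ref_key, x, y, mv, source_key, size=16):
    """Return True if the area a motion vector points to in the reference
    picture is derived entirely from the same input as the current block.

    ref_key    : 2-D list (rows) of key values for the reference picture
    (x, y)     : top-left pixel position of the current block
    mv         : (dx, dy) motion vector in whole pixels (assumption)
    source_key : 0.0 if the current block comes from A, 1.0 if from B
    size       : block width and height in pixels
    """
    rx, ry = x + mv[0], y + mv[1]
    height, width = len(ref_key), len(ref_key[0])
    # A vector pointing outside the picture is treated as unusable here.
    if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
        return False
    values = {
        ref_key[row][col]
        for row in range(ry, ry + size)
        for col in range(rx, rx + size)
    }
    return values == {source_key}


# 4x4 reference picture: left half from signal A, right half from signal B.
ref_key = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
ok = reference_area_matches_source(ref_key, 0, 0, (0, 0), 0.0, size=2)
crosses = reference_area_matches_source(ref_key, 0, 0, (1, 0), 0.0, size=2)
```

Here the first vector stays within the area taken from signal A, whereas the second straddles the boundary between the two signals, so the vector would no longer be useful for that block.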
One way of handling these problems is to decompress the entire compressed input signals, carry out the mixing or similar process in the non-compressed domain, and then re-compress the resulting composite pictures.
This process is limited by the general principle, applicable to compression systems such as the MPEG-2 system, that each generation of compression tends to reduce the quality of the resulting images. It is undesirable if the simple addition of a logo or similar information causes a deterioration in the overall image quality of the pictures to which the logo information is added.
In order to alleviate this drawback, it is desirable to recompress as much as possible of the composite picture using compression parameters (such as a quantisation parameter Q, motion vectors, DCT frame type and so on) which are the same as those used to compress the corresponding block of the relevant input picture. However, this general aim is made more difficult by the following considerations:
(a) how many (or, looked at another way, how few) blocks need to be re-encoded, with newly derived parameters, because their image content is not related sufficiently closely to one of the input images?
(b) what happens if this re-encoding scheme would lead to too much output data being generated for the transmission or storage channel under consideration?
(c) how can it be detected whether the motion vectors, associated with blocks whose parameters are to be re-used, point to areas of the same image in the preceding or following pictures?
The invention aims to address or at least alleviate at least one of these problems.
This invention provides video signal processing apparatus in which at least two input video signals are combined in proportions determined by a pixel key signal to generate an output video signal for compression, at least one of the input video signals each having respective associated compression parameters from a data compression process applied to that video signal;
the apparatus comprising:
means for estimating the quantity of data which will be produced by compression of a current image of the output video signal;
means for comparing the detected quantity of data with a target quantity of data, to determine whether a data overflow is expected;
means, operable in the event that a data overflow is expected, for determining which blocks of the output video signal can be compressed by re-using compression parameters from corresponding blocks of one of the input video signals, the determining means comprising:
(a) means for detecting blocks of the output image derived entirely from a single one of the input video signals, whereby blocks of the output image not derived entirely from a single one of the input video signals are compressed using newly derived compression parameters;
(b) means for detecting an expected amount of data overflow and deriving a threshold value from the expected amount of overflow in accordance with a predetermined algorithm; and
means for detecting whether a quantisation parameter associated with the compression parameters of a block under test, that block being derived entirely from a single one of the input video signals, lies within a proportion representing the least-harshly quantising quantisation parameters across the respective input image, the proportion being defined by the threshold value, blocks for which the quantisation parameter lies within the said proportion being compressed by re-using the compression parameters associated with the corresponding block of the input video.
The invention conveniently and elegantly addresses the problem of handling data overflows by varying the amount of re-coding in dependence on the amount of overflow.
By this technique, otherwise unnecessary recoding (with its consequent loss of image quality) can be avoided.
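The block selection set out above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed apparatus: the `Block` type and all function names are hypothetical, a lower quantisation parameter is assumed to mean less harsh quantisation, and the proportion is passed in directly because the text leaves the "predetermined algorithm" that derives the threshold from the expected overflow unspecified.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Block:
    key_values: Sequence[float]  # key K at each pixel of the block
    q: int  # quantisation parameter from the input (lower = less harsh, assumed)


def expected_overflow(estimated_bits: int, target_bits: int) -> int:
    """Amount by which the estimated data quantity exceeds the target."""
    return max(0, estimated_bits - target_bits)


def single_source(block: Block) -> bool:
    """True if the block is derived entirely from one input (K all 0 or all 1)."""
    keys = block.key_values
    return all(k == 0.0 for k in keys) or all(k == 1.0 for k in keys)


def q_cutoff(q_values: Sequence[int], proportion: float) -> Optional[int]:
    """Largest Q among the given proportion of least-harshly quantised blocks."""
    ordered = sorted(q_values)
    count = int(len(ordered) * proportion)
    return ordered[count - 1] if count > 0 else None


def reusable_blocks(blocks: Sequence[Block], proportion: float) -> List[bool]:
    """Flag blocks whose input compression parameters would be re-used.

    Blocks not derived from a single input, or whose Q lies outside the
    proportion, are left for compression with newly derived parameters.
    """
    cutoff = q_cutoff([b.q for b in blocks], proportion)
    return [
        cutoff is not None and single_source(b) and b.q <= cutoff
        for b in blocks
    ]


blocks = [
    Block([0.0, 0.0], 4),   # entirely from A
    Block([1.0, 1.0], 8),   # entirely from B
    Block([0.0, 0.5], 2),   # mixed, so always newly encoded
    Block([1.0, 1.0], 16),  # entirely from B, harshly quantised
]
overflow = expected_overflow(estimated_bits=1200, target_bits=1000)
flags = reusable_blocks(blocks, proportion=0.5) if overflow > 0 else None
```

In this sketch, blocks sharing the cutoff value of Q are all treated as lying within the proportion, which is one possible handling of ties and is an assumption of the example.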