A television picture is a representation in substantially planar form of a scene that is composed by the producer of a television program. The scene may be composed of tangible objects, or it may be at least partially synthesized by artificial means, e.g. a television graphics system, so that the source of the video signal representing the scene is not a camera or a film scanner but a frame buffer and a computer used for adjusting the contents of the frame buffer. Generally, the scene is made up of two component scenes, namely a foreground scene and a background scene, that are combined using a travelling matte technique. For example, the foreground scene might contain an annulus against a solid color matte and the background scene a square against a screen of contrasting color, as shown in FIGS. 1(a) and 1(b) respectively, so that when the foreground and background scenes are combined the resulting picture has the appearance shown in FIG. 1(c).
A transform system operates on the video signal representing a scene, and may be used to carry out a spatial transformation on the scene. For example, the scene may be displaced to the right. If the foreground video signal representing the FIG. 1(a) scene is applied to a transform system which carries out a transformation on the signal such that the transformed signal represents the scene shown in FIG. 1(d), in which the annulus of the FIG. 1(a) scene has been shifted to the right, then the signal obtained by combining the transformed foreground signal with the background signal might represent the picture shown in FIG. 1(e). Most transform systems are of two main kinds, known as the forward transform system and the reverse transform system. FIG. 2 represents a frame-based reverse transform system based on principles that are known at present. It is believed that the FIG. 2 system does not exist in the prior art, and it is being described in order to provide information that will be useful in understanding the invention.
The transform system shown in FIG. 2 operates by digitizing the input video signal under control of a write clock 10 and writing the resulting sequence of digital words, each having, e.g., ten bits, into a video frame buffer 12 using addresses generated by a forward address generator 14. The input video signal is derived from an analog composite video signal in conventional interlaced format by separating it into its components (normally luminance and chrominance) and digitizing each component. The frame buffer 12 therefore comprises a memory for storing the luminance component and a memory for storing the chrominance components. However, since the components are acted on in like manner in the transform system, it is not necessary to consider the components separately. The operation of digitizing the video signal effectively resolves each raster line of the picture into multiple pixels, e.g. 720 pixels, that are small, but finite, in area. The location of a pixel in the scene can be defined by a two-coordinate display address (U, V) of the input screen (e.g. FIG. 1(a)). The address space of the video frame buffer is organized so that there is a one-to-one correspondence between the display addresses and the memory addresses generated by the forward address generator 14. Thus, the digital word representing the pixel having the input scene display address (U, V) is written into the frame buffer 12 at a location that has a memory address that can be expressed as (U, V). The frame buffer has three field memories, one of which is written to and the other two of which are read from. The frame buffer is able to store three video fields each containing about 242 active lines in the NTSC system.
In order to read an output video signal from the frame buffer 12, a read address counter 16 operates under control of a read clock 17 to generate a sequence of output scene display addresses (X, Y) defining the locations in the output screen (FIG. 1(d)) of the pixels that will be successively addressed. The coordinate values X and Y each have the same number of significant digits as the coordinate values U and V respectively. Accordingly, the display addresses (X, Y) define the same possible pixel positions in the output display space as are defined in the input display space by the display addresses (U, V). However, the display addresses (X, Y) are not used directly to read the output video signal from the frame buffer. A reverse address generator 18 receives the output scene display addresses (X, Y) and multiplies them by a transform matrix T' to generate corresponding memory addresses (X', Y') which are used to read the video signal from the frame buffer. The transform matrix T' is applied to the reverse address generator 18 by a user interface 19, and defines the nature of the transform that is effected by the reverse transform system. If, for example, it is desired to effect a transformation in which the input scene is displaced diagonally upwards and to the left by an amount equal to the inter-pixel pitch in the diagonal direction, the transform matrix would be such that the memory address (X', Y') that is generated in response to the display address (X, Y) would be (X+1, Y+1), assuming that the origin of the coordinate system is in the upper left corner of the input and output scene, and values of X and Y increase to the right and downwards respectively. In the general case, it is not sufficient for the values of X' and Y' to be related to X and Y by addition or subtraction of integers, and therefore the memory address coordinates X' and Y' have more significant digits than the display address coordinates X and Y.
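The reverse address generation described above can be illustrated with a minimal sketch. It assumes that T' is a 3x3 matrix applied to homogeneous column vectors; the text does not specify the matrix form, and the names `reverse_address` and `SHIFT_UP_LEFT` are hypothetical.

```python
def reverse_address(x, y, t_inv):
    """Map an output display address (x, y) to a memory address (x', y').

    t_inv plays the role of the transform matrix T'. The 3x3 homogeneous,
    column-vector convention used here is an assumption for illustration.
    """
    xh = t_inv[0][0] * x + t_inv[0][1] * y + t_inv[0][2]
    yh = t_inv[1][0] * x + t_inv[1][1] * y + t_inv[1][2]
    w = t_inv[2][0] * x + t_inv[2][1] * y + t_inv[2][2]
    # Dividing by w leaves room for perspective transforms; w is 1 for shifts.
    return xh / w, yh / w


# T' for the example in the text: the scene moves up and to the left by one
# pixel, so each output pixel is read from (X+1, Y+1).
SHIFT_UP_LEFT = [
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]

print(reverse_address(23, 6, SHIFT_UP_LEFT))
```

With a pure shift the memory address has integer coordinates; a rotation or change of scale would generally yield fractional coordinates, which is why X' and Y' carry more significant digits than X and Y.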
The reverse addresses are applied not only to the frame buffer 12 but also to a video interpolator 20. For each reverse address (X', Y'), the frame buffer outputs the respective digital words representing an array of pixels surrounding the point defined by the reverse address (X', Y'). For example, the data words representing the four pixels nearest the point defined by the address (X', Y') might be provided. These four data words are applied to the interpolator 20, and the interpolator combines these four digital words into a single digital output word based on the fractional portion of the address (X', Y'). For example, using decimal notation, if the least significant digit of each coordinate X and Y is unity but the least significant digit of the coordinates X' and Y' is one-tenth, and the counter 16 generates the read address (23, 6) which is converted to a reverse address (56.3, 19.8) by being multiplied by the transform matrix T', the frame buffer 12 might respond to the reverse address (56.3, 19.8) by providing the digital words stored at the addresses (56, 19), (56, 20), (57, 19) and (57, 20). The interpolator 20 combines these four words into a single digital output word by weighting them 3:7 in the horizontal direction and 8:2 in the vertical direction. This digital word defines the value that is to be generated at the location of the output screen that is defined by the display address (23, 6).
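The four-pixel interpolation can be sketched as follows. This is a conventional bilinear interpolation, offered as an assumption about how the weighting might be carried out rather than as the circuit of interpolator 20; the frame buffer is modelled as a hypothetical dictionary keyed by integer (x, y) addresses.

```python
import math


def interpolate(frame, xp, yp):
    """Combine the four stored words nearest (xp, yp) into one output word.

    frame is a mapping from integer (x, y) memory addresses to pixel values,
    a stand-in for the frame buffer. Each neighbour is weighted by its
    closeness to the point, so the nearer pixel gets the larger weight.
    """
    x0, y0 = math.floor(xp), math.floor(yp)
    fx, fy = xp - x0, yp - y0  # fractional portions of the reverse address
    top = (1 - fx) * frame[(x0, y0)] + fx * frame[(x0 + 1, y0)]
    bottom = (1 - fx) * frame[(x0, y0 + 1)] + fx * frame[(x0 + 1, y0 + 1)]
    return (1 - fy) * top + fy * bottom


# The example address from the text, with made-up pixel values.
frame = {(56, 19): 100, (57, 19): 200, (56, 20): 100, (57, 20): 200}
value = interpolate(frame, 56.3, 19.8)
```

For the address (56.3, 19.8), the column at 56 receives weight 0.7 and the column at 57 weight 0.3, while the row at 19 receives weight 0.2 and the row at 20 weight 0.8.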
The range of possible reverse addresses is greater than the range of memory addresses defining locations in the frame buffer 12, so that a validly-generated reverse address might define a location that does not exist in the frame buffer's address space. Therefore, the reverse addresses are also applied to an address limit detector 22 which responds to an invalid reverse address (an address which defines a location outside the address space of the frame buffer 12) by providing a signal which causes a video blanker 24 to inhibit the output signal of the frame buffer.
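The address limit check and blanking can be sketched as below. The buffer dimensions, the blanked value of zero, and the function names are assumptions for illustration; the text states only that an invalid address causes the output to be inhibited.

```python
def address_in_range(xp, yp, width, height):
    """Return True if the reverse address lies inside the frame buffer."""
    return 0 <= xp <= width - 1 and 0 <= yp <= height - 1


def blank_if_invalid(word, xp, yp, width, height, blank_value=0):
    """Pass the word through for a valid address, otherwise blank it.

    blank_value=0 is an assumed representation of an inhibited output.
    """
    if address_in_range(xp, yp, width, height):
        return word
    return blank_value


# Hypothetical dimensions: 720 pixels per line, 242 lines per field.
inside = blank_if_invalid(512, 56.3, 19.8, 720, 242)
outside = blank_if_invalid(512, -3.5, 19.8, 720, 242)
```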
In parallel with the video channel comprising the video frame buffer 12, the video interpolator 20 and the video blanker 24 is a key channel comprising a key frame buffer 26, a key interpolator 28 and a key blanker 30. A key signal that is applied to the key channel provides opacity information about the foreground video signal applied to the video channel. This opacity information defines where and the extent to which a background scene represented by a background video signal can be seen in a composite picture (FIG. 1(c)) formed by mixing the foreground and background video signals under the influence of the key signal. Outside the boundaries of the foreground objects, the foreground scene is transparent (key=0) and the background scene is seen without modification by the foreground scene. If a foreground object is fully opaque (key=1), the background scene is fully obscured by the foreground object, but if a foreground object is only partially transparent (0<key<1) the background video signal is mixed with the foreground video signal in proportion to the value of the key. Because the foreground scene is transformed by the video channel, it is necessary to transform the key in the identical manner in order to maintain congruence between the foreground scene and the key. Therefore, the key signal is processed in the key channel in the same way as the foreground signal is processed in the video channel. Thus, the key signal undergoes the same spatial transformation and interpolation as the foreground signal, and is subject to the same address limit blanking.
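The mixing of foreground and background under the influence of the key can be sketched per pixel as a linear blend. The linear form is the usual reading of "mixed in proportion to the value of the key" and is stated here as an assumption; the function name `mix` is hypothetical.

```python
def mix(key, foreground, background):
    """Blend one foreground pixel over one background pixel.

    key=0 leaves the background untouched, key=1 shows only the foreground,
    and intermediate values blend the two linearly (assumed form).
    """
    if not 0.0 <= key <= 1.0:
        raise ValueError("key must lie between 0 and 1")
    return key * foreground + (1.0 - key) * background


transparent = mix(0.0, 200, 50)   # background only
opaque = mix(1.0, 200, 50)        # foreground only
half = mix(0.5, 200, 100)         # equal blend
```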
The transform matrix T' must be the mathematical inverse of the desired spatial transform T, and it is for this reason that the reverse transform system is known as such.
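The relationship between T and T' can be illustrated for the simple case of a two-dimensional affine transform. The parameterization below (x = a*u + b*v + tx, y = c*u + d*v + ty) and the name `invert_affine` are assumptions for illustration; the text does not restrict T to this form.

```python
def invert_affine(a, b, c, d, tx, ty):
    """Return the coefficients of the inverse of a 2-D affine transform.

    The forward transform T is assumed to be
        x = a*u + b*v + tx
        y = c*u + d*v + ty
    and the result describes T', which maps (x, y) back to (u, v).
    """
    det = a * d - b * c
    if det == 0:
        raise ValueError("transform is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    itx = -(ia * tx + ib * ty)
    ity = -(ic * tx + id_ * ty)
    return ia, ib, ic, id_, itx, ity


# A one-pixel shift up and to the left (T subtracts 1 from each coordinate)
# inverts to a T' that adds 1 to each coordinate.
t_inv = invert_affine(1, 0, 0, 1, -1, -1)
```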
When a television picture is composed of a foreground scene and a background scene, special effects are often used to make the picture appear more realistic, i.e. so that it does not look as if it had been composed of two (or more) separate scenes. Among the possible effects are shadow effects. In the FIG. 3(a) picture, the foreground scene is a vertical column 40 and the background scene is a vertical surface 43 of uniform luminance behind the column. Because the FIG. 3(a) picture is composed of two separate scenes, the column 40 does not cast a shadow on the vertical surface. However, the video of the background scene can be selectively reduced in order to simulate the appearance of a shadow 42 (FIG. 3(b)). This can be accomplished using a conventional digital video effects device.
A digital video effects device may also be used to effect a spatial transformation of the combined foreground (object plus shadow) scene. The desired background scene is first recorded on videotape. A shadow is simulated by applying a key signal that defines the area of the background scene that is to be obscured by the foreground object to the key input of the effects device and a full field shadow matte signal, typically representing the color black, to the video input of the effects device. The output signal of the effects device is a shadow signal representing a black area of the same size and shape as the foreground object but spatially transformed in accordance with the desired transformation of the foreground object and also offset to simulate projection into the plane of the background scene. The shadow is recorded over the background scene. Then, the foreground signal is applied to the video input of the effects device in lieu of the shadow matte signal, and the foreground scene is spatially transformed in the same manner as the shadow but is not additionally offset. The transformed foreground scene is recorded over the combined background scene and shadow. Alternatively, if two video effects devices are available, the shadow matte signal and the foreground signal are applied to the respective video inputs, and the key signal is applied to the key inputs of both effects devices. The foreground scene is transformed in one effects device and the transformed and offset shadow is generated by the other effects device, and the output signals of the effects devices are combined and recorded over the background scene.
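The layering order of the two-pass procedure can be sketched on a single row of pixels. This is a simplified model under stated assumptions: the spatial transformation is taken to be the identity, the shadow offset is a whole number of pixels along the row, the matte value 0.0 stands for black, and all function names are hypothetical.

```python
def shift(row, offset, fill=0.0):
    """Displace a row of samples by a whole number of pixels."""
    n = len(row)
    return [row[i - offset] if 0 <= i - offset < n else fill for i in range(n)]


def layer(key_row, video_row, under_row):
    """Key one row of video over another, pixel by pixel (linear blend)."""
    return [k * v + (1 - k) * u for k, v, u in zip(key_row, video_row, under_row)]


def composite_with_shadow(background, foreground, key, shadow_offset, matte=0.0):
    """First pass lays the offset shadow matte; second pass lays the foreground."""
    shadow_key = shift(key, shadow_offset)
    with_shadow = layer(shadow_key, [matte] * len(background), background)
    return layer(key, foreground, with_shadow)


background = [100] * 5
foreground = [200] * 5
key = [0, 1, 0, 0, 0]  # foreground object occupies one pixel
picture = composite_with_shadow(background, foreground, key, shadow_offset=1)
```

The same key is used in both passes, which mirrors the requirement that the shadow have the size and shape of the foreground object.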
The video effects devices that are available at present do not enable a signal representing the shadow of a spatially transformed object to be generated without carrying out two processing operations in a video effects device. If two effects devices are available, the two processing operations may be performed concurrently, but nevertheless two processing operations must be performed.