The present invention relates to a method and apparatus for digitally compositing video image data, wherein first image frames are derived from a required foreground image recorded against an unrequired background image such that a compositing or blending process results in said unrequired background being replaced by a new background image.
Techniques for modifying image data after the data has been recorded have been known for some time. Originally, manual “touching-up” operations were performed directly upon cinematographic film and later photographic mattes were produced allowing two or more filmed images to be combined as a composite image, thereby simulating a visual effect which did not actually occur in reality.
Similar techniques have been employed with television and video signals, originally using analog circuitry arranged to process analog television signals, either represented as red green and blue components or as luminance plus chrominance components. When working with video signals, part of the signal may be removed or keyed out at particular times defined by a synchronised key signal or, alternatively, parts of the video signal may be suppressed to black in response to a suppression signal. These keying signals and suppression signals traditionally have been derived from part of the video signal itself, possibly the luminance signal or possibly the chrominance signals. Thus, techniques for generating these signals have become known as luminance keying (luma-keying) and chrominance keying (chroma-keying) respectively.
Recently, and particularly in the realms of broadcast quality post production, video signals have been manipulated as digital representations where image frames are sampled to produce an array of picture elements (pixels) with each pixel representing a color defined by three color components stored as three numerical values. Thus, traditionally, in video applications, eight bits may be allocated to each of the red, green and blue color components at each pixel position or, in accordance with alternative processing schemes, similar allocations of bits may be made for luminance plus color difference signals.
Traditionally, scanned cinematographic film has been processed in an RGB environment with digitized television signals being processed in a luminance plus color difference signal environment, usually identified as YUV. General purpose processing environments have also tended towards a preference for RGB signal processing. Since cinematographic film is of higher color definition, typically 12 bits per color component are used, giving rise to 4096 possible values per color component. Chroma-keying techniques are exploited both in film post-production and television post-production. Image frames for a foreground image may be derived by recording talent against a background of a particular color, with a highly saturated blue or a highly saturated green being particularly preferred. Required portions of the foreground image should not include colors used in the background image during the production process. A subsequent post production compositing process may then be configured to automatically replace the unrequired background image with a new background image. A key signal is generated at regions identified as belonging to the foreground object, and this key signal is then used to remove the foreground object from its background. Thus, for example, in action movies talent may appear to be acting within a highly dangerous environment where, in reality, the action has been recorded in studio conditions against a green or blue screen background. Provided that the post production compositing is highly accurate, it is possible to produce highly realistic illusions which, from a safety point of view, would not be possible to record directly as a real production sequence.
Often, video or film material will have been recorded, for keying purposes, under less than favourable conditions. Under these circumstances, distinguishing a first set of colors from a second set of colors can be particularly difficult. Furthermore, blending edges are required which represent the interface between the foreground object and the new background, where a degree of blending must occur so as to enhance the realism of the effect. If blending of this type does not occur and hard transitions exist on pixel boundaries, visible artefacts will be present within the image and it will be clear to anyone viewing the resulting clip that the two image parts originated from separate sources.
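By way of illustration only, the blending at the interface between a foreground object and a new background may be sketched as a per-pixel linear mix controlled by a key value. The function name blend_pixel, the range of the key (0.0 to 1.0) and the convention that 1.0 retains the foreground are assumptions made for this sketch and are not taken from the specification.

```python
def blend_pixel(foreground, background, key):
    """Mix one foreground pixel with one background pixel.

    foreground, background: (r, g, b) tuples of numeric components.
    key: assumed to lie in [0.0, 1.0]; 1.0 keeps the foreground,
         0.0 shows the new background, values between give a soft edge.
    """
    if not 0.0 <= key <= 1.0:
        raise ValueError("key must lie between 0.0 and 1.0")
    return tuple(key * f + (1.0 - key) * b
                 for f, b in zip(foreground, background))


# A pixel on a blending edge, half foreground and half new background.
edge = blend_pixel((200, 100, 0), (0, 100, 200), 0.5)
```

A key restricted to exactly 0.0 or 1.0 would reproduce the hard transitions on pixel boundaries described above, whereas intermediate values give the required blending edge.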
A problem with known systems is that it may be difficult to adjust color volumes so as to ensure that all key colors are within an internal volume and all non-key colors are outside an external volume, with the required blending regions being outside the internal volume, but inside the external volume.
The term “video” will be used to identify any image signal consisting of a sequence of image frames arranged to create the effect of moving action. This includes true video sources, such as those derived from D1 videotape, in addition to image data derived from other sources, such as cinematographic film. Thus, as used herein, high resolution cinematographic film may be digitised to produce image data which is considered herein as “video data”, although not conforming to established video protocols, when used in the narrower sense.
According to a first aspect of the present invention, there is provided a method of processing image data in which each image has a plurality of pixels and each pixel is represented by three color components defining a position within a color-space, comprising the steps of: identifying a base color; calculating a distance in color-space between an input color and said base color; and producing a control value in relation to said calculated distance.
Preferably, the color-space co-ordinates represent positions on an orthogonal set of axes and said distance is calculated from the sum of each component squared. Preferably, said control value is calculated from the square root of said sum. In a preferred embodiment, color-space co-ordinates are transformed onto an alternative set of orthogonal axes. Preferably, the transformation is performed with reference to said base color.
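The distance calculation described above may be sketched as follows, purely as an illustration. The sketch assumes that the transformation with reference to the base color is a simple translation placing the base color at the origin; the specification does not limit the transformation to this form, and the function names relative_to_base and control_value are hypothetical.

```python
import math


def relative_to_base(input_color, base_color):
    """Express the input color on axes whose origin is the base color.

    Assumption for this sketch: the transformation is a translation only.
    """
    return tuple(c - b for c, b in zip(input_color, base_color))


def control_value(input_color, base_color):
    """Return the distance in color-space between input and base color."""
    offset = relative_to_base(input_color, base_color)
    sum_of_squares = sum(d * d for d in offset)  # sum of each component squared
    return math.sqrt(sum_of_squares)             # square root of said sum


# Example with a hypothetical base color and an input color close to it.
distance = control_value((13, 24, 30), (10, 20, 30))
```

Because the axes are orthogonal, the sum of the squared components is the squared Euclidean distance, so the square root grows linearly as the input color moves away from the base color.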
In a preferred embodiment, the base color is determined from a set of manually selected colors and said base color may be derived from said set by forming a convex hull around said selected colors in color-space.
In a preferred embodiment, the control value is used to suppress color in areas of color spill. Alternatively, the control value is used to generate a keying signal and said keying signal may include a tolerance region and a softness region.
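One possible way of deriving a keying signal with a tolerance region and a softness region from a distance-based control value is sketched below. The linear ramp across the softness region, the parameter names tolerance and softness, and the convention that 0.0 denotes the key color are all assumptions made for illustration; the specification does not prescribe them.

```python
def key_from_distance(distance, tolerance, softness):
    """Map a color-space distance to a key value in [0.0, 1.0].

    distance:  distance of the input color from the base color.
    tolerance: radius within which colors are treated as fully keyed (0.0).
    softness:  width of the band beyond the tolerance region in which the
               key ramps linearly from 0.0 to 1.0 (assumed linear here).
    """
    if tolerance < 0 or softness <= 0:
        raise ValueError("tolerance must be >= 0 and softness must be > 0")
    if distance <= tolerance:
        return 0.0                      # inside the tolerance region
    if distance >= tolerance + softness:
        return 1.0                      # beyond the softness region
    return (distance - tolerance) / softness


# Hypothetical settings: tolerance radius 10, softness band of width 20.
keys = [key_from_distance(d, 10, 20) for d in (5, 20, 40)]
```

With these hypothetical settings, a color at distance 20 lies halfway through the softness region and therefore receives an intermediate key value.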
According to a second aspect of the present invention, there is provided image data processing apparatus including means for defining image pixels representing color components of a color-space, comprising: means for identifying a base color; calculating means for calculating distance in color-space between an input color and said base color; and means for producing a control value in relation to said calculated distance.